Although Google Analytics exists, web applications often need to save visit statistics of their own (IP, page visited, timestamp) in order to implement business logic. Example: article websites (blogs or newspapers) with features like “most read articles”, “articles most read by visitors of the current article”, etc.
Save visits into a MySQL memory table
We want to avoid counting more than one visit from the same user in case of an accidental (or malicious) refresh. A cookie/session approach is not enough for that, so a database approach is needed. We’ll use a table where we save the user IP and a timestamp (this also makes it trivial to implement a site feature like “number of users on the site / reading the same article”).
CREATE TABLE IF NOT EXISTS `news_counter` (
  `news_id` int(10) NOT NULL,
  `remote_addr` char(15) NOT NULL,
  `created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `timeslot` int(11) NOT NULL COMMENT 'time() / 600',
  UNIQUE KEY `news_id` (`news_id`,`remote_addr`,`timeslot`)
) ENGINE=MEMORY;
Note: `timeslot` can be calculated in PHP with ceil(time()/600), which produces the same value throughout each 10-minute interval.
That value, together with the fields `news_id` and `remote_addr` (the user’s IP), forms a UNIQUE key that prevents inserting a duplicate-visit record (of course, the resulting database error must be suppressed, e.g. with INSERT IGNORE).
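The deduplication logic above can be sketched as follows (in Python for illustration, since it is language-independent; the function names are mine, not from the original):

```python
import math

SLOT_SECONDS = 600  # 10-minute buckets, matching the article's ceil(time()/600)

def timeslot(ts: float, slot_seconds: int = SLOT_SECONDS) -> int:
    """Bucket a Unix timestamp so repeated visits within the same
    10-minute window produce the same value."""
    return math.ceil(ts / slot_seconds)

def record_visit_sql(news_id: int, remote_addr: str, ts: float):
    """Build a parameterized INSERT IGNORE statement: the UNIQUE key
    on (news_id, remote_addr, timeslot) silently rejects a duplicate
    visit instead of raising an error."""
    sql = ("INSERT IGNORE INTO news_counter (news_id, remote_addr, timeslot) "
           "VALUES (%s, %s, %s)")
    return sql, (news_id, remote_addr, timeslot(ts))
```

Two page loads inside the same 10-minute window map to identical key tuples, so only the first row is stored.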
Cron job to move visits to an archive table and increment the page counter, using a MySQL procedure
I’m using the MEMORY storage engine because the idea is to run a cron job (every 10 minutes) that moves this data to a MyISAM table (`news_counter_archive`) with the same structure. That keeps the table from growing too fast and becoming a performance bottleneck.
That cron job can also update an INT counter in the articles/news table. It’s better not to run this operation on every page visit, both to avoid MyISAM table locking (due to mixed reads/writes) and to get better performance (we don’t need the statistics updated immediately).
Here is a MySQL procedure (tested on v5.1.53, Linux) that takes the records from the memory table, moves them to the archive table, and updates the counter in the articles (`news`) table:
DELIMITER //
CREATE PROCEDURE update_news_views()
BEGIN
  DECLARE var_news_id INT(11);
  DECLARE var_remote_addr CHAR(15);
  DECLARE var_created TIMESTAMP;
  DECLARE c CURSOR FOR
    SELECT `news_id`, `remote_addr`, `created`
    FROM `news_counter`
    WHERE `created` < DATE_ADD(NOW(), INTERVAL -1 MINUTE)
    ORDER BY `news_id`;
  OPEN c;
  BEGIN
    DECLARE EXIT HANDLER FOR SQLSTATE '02000' BEGIN END;
    LOOP
      FETCH c INTO var_news_id, var_remote_addr, var_created;
      UPDATE `news` SET `views` = `views` + 1 WHERE `news_id` = var_news_id;
      INSERT INTO `news_counter_archive` (`news_id`, `remote_addr`, `created`)
        VALUES (var_news_id, var_remote_addr, var_created);
    END LOOP;
  END;
  -- delete only the rows the cursor archived; rows newer than 1 minute stay
  DELETE FROM `news_counter` WHERE `created` < DATE_ADD(NOW(), INTERVAL -1 MINUTE);
END;
//
DELIMITER ;
Run it using the mysql client (or call it from PHP):
mysql -h [host] -u [username] -p[pass] [dbname] -e "CALL update_news_views()";
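The same invocation the cron job performs can be sketched from application code (a Python sketch for illustration; the helper names and placeholder credentials are mine):

```python
import subprocess

def build_mysql_call(host, user, password, dbname, proc="update_news_views"):
    """Build the mysql CLI invocation that the cron job runs.
    All connection arguments are placeholders for real credentials."""
    return ["mysql", "-h", host, "-u", user, f"-p{password}", dbname,
            "-e", f"CALL {proc}()"]

def run_cron_step(host, user, password, dbname):
    """Execute the stored procedure via the mysql client, raising on failure."""
    subprocess.run(build_mysql_call(host, user, password, dbname), check=True)
```

Scheduling this every 10 minutes keeps the MEMORY table small between runs.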
Some useful links about MySQL procedures and cursors
– MySQL triggers are a very good solution when other applications (which cannot be changed for technical or time reasons) work with the same tables. I’d say their big drawbacks are portability (migrating to a non-server database becomes a problem) and maintainability (it’s much better to keep the logic in PHP).
– When statistics do not need to be exact, another, easier solution (to avoid MyISAM locks) is probabilistic counting: increment the counter by N only when mt_rand(1, N) == 1. That issues 1/N of the write queries and gives statistically equivalent results once the numbers are large enough.
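The probabilistic-increment idea can be sketched like this (in Python for illustration; Python’s random.random stands in for PHP’s mt_rand, and the function names are mine):

```python
import random

def maybe_increment(n: int, rng=random.random) -> int:
    """With probability 1/n, return n (the amount to add to the counter);
    otherwise return 0. On average this adds 1 per visit while issuing
    only ~1/n of the UPDATE queries."""
    return n if rng() < 1.0 / n else 0

def simulate(visits: int, n: int, seed: int = 42) -> int:
    """Simulate many visits to show the counter converges to the
    true visit count for large numbers."""
    rng = random.Random(seed)
    return sum(maybe_increment(n, rng.random) for _ in range(visits))
```

With n = 10, only about 10% of page views touch the database, yet over, say, 100,000 visits the counter lands close to 100,000.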