This is an archived forum and the content is probably no longer relevant, but is provided here for posterity.

Collecting Website Statistics

December 15, 2008 12:24am

  • #1 / Dec 15, 2008 12:24am

    nzmike

    39 posts

    Hi everyone,

    I was just wondering what everyone’s thoughts are on the best way of collecting visitor statistics.

    At the moment, when a visitor visits my site I insert the current timestamp, IP address, browser, operating system and referrer into a database. I only record a given IP address once every 24 hours to try to keep the database size down. Now I’ve decided it would be nice if I could see which pages are the most popular, which would mean scrapping the “record an IP address once every 24 hours” rule and recording everything for every single page load.

    Now obviously this would scale horribly. If I get 60,000 hits, that’s 60,000 database entries. Can anyone think of a better way of doing this? I was thinking of running a cron ‘clean up’ script at the end of each week which would count up the statistics for the week and insert the summarised info into a separate table.

    I am aware of solutions like Google Analytics but what I’m looking for is something I can build straight into my application.

    Cheers,
    Mike

  • #2 / Dec 15, 2008 6:06am

    Phil Sturgeon

    2889 posts

    You have the right idea. Count up all the stats you want by week (and then again by month perhaps) then get rid of the original results.

    I’d use InnoDB for this as it handles large amounts of entries well and the improved locking will help when you run your cron.

    Also, think about if you really want to lose your page-by-page results. Perhaps in your cron job you could get a SQL or CSV dump of the table zipped and emailed to you before it wipes them?
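
A minimal sketch of the weekly roll-up discussed above, for illustration only. It uses Python's built-in `sqlite3` as a stand-in for whatever database the site runs on (the thread suggests MySQL with InnoDB), and the table names `hits` and `weekly_page_stats`, their columns, and the `cutoff` parameter are all assumptions rather than anything from the posts. The summary insert and the delete run in one transaction, so a failed run should not lose raw rows.

```python
import sqlite3

# Hypothetical schema: one raw row per page load, one summary row per week and page.
SCHEMA = """
CREATE TABLE IF NOT EXISTS hits (
    visited_at TEXT NOT NULL,   -- 'YYYY-MM-DD HH:MM:SS' in UTC
    page       TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS weekly_page_stats (
    week TEXT NOT NULL,         -- year and week number, e.g. '2008-50'
    page TEXT NOT NULL,
    hits INTEGER NOT NULL
);
"""


def roll_up_hits(conn, cutoff):
    """Summarise raw hits older than `cutoff` per week and page, then delete them."""
    with conn:  # single transaction: commits on success, rolls back on error
        conn.execute(
            """
            INSERT INTO weekly_page_stats (week, page, hits)
            SELECT strftime('%Y-%W', visited_at) AS week, page, COUNT(*)
            FROM hits
            WHERE visited_at < ?
            GROUP BY week, page
            """,
            (cutoff,),
        )
        conn.execute("DELETE FROM hits WHERE visited_at < ?", (cutoff,))
```

A cron job would call `roll_up_hits` once a week with the start of the current week as `cutoff`, so that only complete weeks are summarised. A dump of the raw rows, as suggested above, would go just before the call.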

  • #3 / Dec 30, 2008 2:23am

    jonhurlock

    11 posts

    Why not use the stats from your server’s log files? This is how many stats programs work. All you would have to do is create a simple script to parse the log files and write the results to an output of your choice. :D

    I log the following:

    IP - Date Time - HTTP Request (Method, Page Requested, Status Code of HTTP Request) - User Agent

    There’s no need to set up a PHP script to do this if your server’s already doing it 😉
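
As a rough illustration of the log-parsing idea above, here is a small Python sketch that counts successful requests per page. It assumes the server writes the common or combined access log format used by Apache and nginx. The regular expression, the function name `count_page_hits`, and the choice to count only status 200 responses are assumptions for the example, not something from the post.

```python
import re
from collections import Counter

# Matches the common log format, with the combined format's referrer and
# user agent fields treated as optional.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def count_page_hits(lines):
    """Count requests per path, keeping only lines that parse and returned 200."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("status") == "200":
            hits[match.group("path")] += 1
    return hits
```

To run it over a real file, something like `with open(path) as f: count_page_hits(f)` should work, where `path` is wherever your server keeps its access log.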

  • #4 / Dec 30, 2008 4:28am

    Tom Schlick

    386 posts

    What I do in this situation is record the IP and browser once every six hours per IP. I set a cookie with a unique “session” ID: not the PHP one, but my auto-increment ID from the DB or a unique key. So the user has a session ID, and the browser, OS and all that are only recorded once while the cookie exists. Then for the page views, all that ties the user to the view is the session ID, which I record with the page ID and a timestamp. A pretty simple, scalable solution.
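
One way the session-cookie scheme above might look, sketched in Python with the built-in `sqlite3` module. The table names `sessions` and `page_views`, the function `record_view`, and the use of a random UUID as the unique key are all assumptions for illustration. Setting the cookie and its six-hour lifetime is left to whatever web framework is in use.

```python
import sqlite3
import time
import uuid

# Hypothetical schema: visitor details stored once per session,
# page views stored as small rows that only reference the session.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id         TEXT PRIMARY KEY,
    ip         TEXT NOT NULL,
    user_agent TEXT NOT NULL,
    started_at INTEGER NOT NULL
);
CREATE TABLE IF NOT EXISTS page_views (
    session_id TEXT NOT NULL REFERENCES sessions(id),
    page_id    INTEGER NOT NULL,
    viewed_at  INTEGER NOT NULL
);
"""


def record_view(conn, cookie_session_id, ip, user_agent, page_id, now=None):
    """Log one page view and return the session ID to store in the cookie."""
    now = int(time.time()) if now is None else now
    with conn:  # one transaction per page view
        known = cookie_session_id is not None and conn.execute(
            "SELECT 1 FROM sessions WHERE id = ?", (cookie_session_id,)
        ).fetchone()
        if not known:
            # No cookie (or an unknown one): record IP and browser once.
            cookie_session_id = uuid.uuid4().hex
            conn.execute(
                "INSERT INTO sessions (id, ip, user_agent, started_at) VALUES (?, ?, ?, ?)",
                (cookie_session_id, ip, user_agent, now),
            )
        conn.execute(
            "INSERT INTO page_views (session_id, page_id, viewed_at) VALUES (?, ?, ?)",
            (cookie_session_id, page_id, now),
        )
    return cookie_session_id
```

The page-view rows stay narrow because the browser and OS details live in `sessions`, which is where the scalability claim comes from.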

  • #5 / Dec 30, 2008 7:16pm

    NBrepresent

    19 posts

    Not flaming/trolling, I’m actually curious:  Why use server-side statistics at all when Google Analytics exists?

  • #6 / Dec 30, 2008 7:27pm

    Tom Glover

    493 posts

    I can see one advantage, one less js file to load each time.

  • #7 / Dec 30, 2008 8:01pm

    jonhurlock

    11 posts

    Not flaming/trolling, I’m actually curious:  Why use server-side statistics at all when Google Analytics exists?

    Waiting for the Google Analytics script to load can sometimes take a long time, especially if your clients are on dial-up or in areas where net connectivity is unreliable.

    Also, if you forget to install the script in the footer of each page, including the various error pages, you are losing potential info. 😊

  • #8 / Dec 30, 2008 8:13pm

    Tom Schlick

    386 posts

    Plus, if you ever actually want to do anything with the data you’re collecting besides look at it, you can’t with GA. Say you wanted to show the top users on your site by page views or average time: with server-side data you can do that; with GA you can’t. With server-side you are guaranteed to catch all the page views; with GA, if they have JavaScript turned off, you can’t.

    I’m not putting GA down, because I use it myself, but I also use the server-side method for internal uses and debugging. A proper mixture of the two can prove to be very, very useful. Believe me, once you have the data you will find some very cool uses for it, such as suggesting new content to your users.
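
To illustrate the "top users by page views" idea above, here is a short sketch of the kind of query that server-side data makes possible. It is written in Python with `sqlite3`. The `page_views` table, its `user_id` column, and the function name `top_users` are assumptions made up for the example.

```python
import sqlite3

# Hypothetical table: one row per page view, tied to a logged-in user.
SCHEMA = """
CREATE TABLE IF NOT EXISTS page_views (
    user_id   INTEGER NOT NULL,
    page      TEXT NOT NULL,
    viewed_at INTEGER NOT NULL
);
"""


def top_users(conn, limit=10):
    """Return (user_id, views) pairs, most page views first."""
    return conn.execute(
        """
        SELECT user_id, COUNT(*) AS views
        FROM page_views
        GROUP BY user_id
        ORDER BY views DESC, user_id
        LIMIT ?
        """,
        (limit,),
    ).fetchall()
```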
