My Basic 3-Pronged Approach To Website Analytics

The other day a non-technical friend of mine told me how his webhost shut him off because he was using too much bandwidth. I was pretty surprised because they allocate him hundreds of GB per month and his website does not get much traffic.

He was dumbfounded. He loaded up his Google Analytics account and showed me that he was only getting about 50-100 unique visitors a day.

But his webhost said that he burned through over 200GB in less than a week!

I asked him if he used any other analytics packages, and he did, but they were all JavaScript-based.

So here is something important that people need to understand. JavaScript tracking packages like Google Analytics see only a narrow slice of what is actually going on. They can only record activity from web browsers that load your page and actually run the JavaScript.

In my friend’s case he had a 500MB video file that was being linked to directly from a very popular internet forum. A direct link to the file never loads one of his pages, so his Google Analytics code was never executed, and he had no idea these people were stealing his bandwidth.

So how did we figure it out? We processed his raw access_log files with Webalizer. Every major webhost is going to have Webalizer (or some variant) available, and in some ways it will show you MUCH more about what’s going on with your site than Google Analytics can.

We ran Webalizer on the log file and in a matter of minutes we could see that the forum was the top referrer to his website… and we could also see that the file had been downloaded enough times to use up all of his bandwidth. At 500MB per download, it only takes about 400 downloads to burn through 200GB.
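If you are curious what a log analyzer is doing under the hood, here is a minimal Python sketch of the same idea: add up the bytes sent per requested file and per referrer. It is not how Webalizer itself is implemented. It assumes the log is in the Apache Combined Log Format, and the sample lines, the `bandwidth_report` name and the example.com addresses are all made up for illustration.

```python
import re
from collections import Counter

# Assumption: the access log uses the Apache Combined Log Format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def bandwidth_report(lines):
    """Sum the bytes sent per requested path and per referrer."""
    by_path = Counter()
    by_referrer = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        size = match.group("bytes")
        sent = 0 if size == "-" else int(size)  # "-" means no body was sent
        by_path[match.group("path")] += sent
        by_referrer[match.group("referrer")] += sent
    return by_path, by_referrer


# Made-up sample lines, for illustration only.
sample = [
    '203.0.113.5 - - [10/Oct/2008:13:55:36 -0700] "GET /video.mp4 HTTP/1.1" '
    '200 500000000 "http://forum.example.com/thread/1" "Mozilla/5.0"',
    '203.0.113.9 - - [10/Oct/2008:13:56:01 -0700] "GET /video.mp4 HTTP/1.1" '
    '200 500000000 "http://forum.example.com/thread/1" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2008:13:57:12 -0700] "GET /index.html HTTP/1.1" '
    '200 12000 "-" "Mozilla/5.0"',
]

by_path, by_referrer = bandwidth_report(sample)

print("Top files by bandwidth:")
for path, total in by_path.most_common(5):
    print(f"{total:>12} bytes  {path}")

print("Top referrers by bandwidth:")
for referrer, total in by_referrer.most_common(5):
    print(f"{total:>12} bytes  {referrer}")
```

To try it on a real log you would pass an open file instead of the sample list, for example `bandwidth_report(open("access_log"))`, assuming that is where your host keeps the file.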

Let me show you another example… a more practical one that I guarantee affects everyone reading this.

Webalizer shows me that the single biggest user of bandwidth on my site is a Yahoo! IP address.

Yahoo Sucks

In fact, 9 of the top 15 biggest bandwidth users are Yahoo! IPs.

Interesting sidenote:

Yahoo bots use up more than 5% of the total bandwidth but bring in less than 1% of the traffic.
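A comparison like that can be roughed out from the same raw logs. The Python sketch below is only one way to approximate it, not how the numbers above were produced: it treats any request whose user agent contains a marker string as crawler traffic, and counts a non-crawler request as "brought in" when its referrer contains another marker. The function name, the markers and the sample lines are all assumptions made up for illustration, and requests are not the same thing as visitors, so read the second number as a rough proxy.

```python
import re

# Assumption: the access log uses the Apache Combined Log Format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def crawler_vs_referrals(lines, agent_marker, referrer_marker):
    """Return (crawler share of bytes, referred share of non-crawler requests)."""
    total_bytes = crawler_bytes = 0
    visitor_hits = referred_hits = 0
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        size = match.group("bytes")
        sent = 0 if size == "-" else int(size)
        total_bytes += sent
        if agent_marker.lower() in match.group("agent").lower():
            crawler_bytes += sent
            continue  # a crawler request is not a visitor
        visitor_hits += 1
        if referrer_marker.lower() in match.group("referrer").lower():
            referred_hits += 1
    bandwidth_share = crawler_bytes / total_bytes if total_bytes else 0.0
    traffic_share = referred_hits / visitor_hits if visitor_hits else 0.0
    return bandwidth_share, traffic_share


# Made-up sample lines, for illustration only.
sample = [
    '192.0.2.10 - - [10/Oct/2008:14:00:00 -0700] "GET /a.html HTTP/1.1" '
    '200 20000 "-" "Mozilla/5.0 (compatible; Yahoo! Slurp)"',
    '198.51.100.1 - - [10/Oct/2008:14:01:00 -0700] "GET /a.html HTTP/1.1" '
    '200 20000 "http://search.yahoo.com/search?p=example" "Mozilla/5.0"',
    '198.51.100.2 - - [10/Oct/2008:14:02:00 -0700] "GET /b.html HTTP/1.1" '
    '200 20000 "http://www.google.com/search?q=example" "Mozilla/5.0"',
    '198.51.100.3 - - [10/Oct/2008:14:03:00 -0700] "GET /c.html HTTP/1.1" '
    '200 20000 "-" "Mozilla/5.0"',
    '198.51.100.4 - - [10/Oct/2008:14:04:00 -0700] "GET /d.html HTTP/1.1" '
    '200 20000 "-" "Mozilla/5.0"',
]

bandwidth_share, traffic_share = crawler_vs_referrals(sample, "Slurp", "yahoo.")
print(f"Crawler share of bandwidth: {bandwidth_share:.1%}")
print(f"Share of requests referred: {traffic_share:.1%}")
```

Matching on the user agent rather than on IP addresses is a design choice for the sketch. It is simpler, but it trusts whatever the client claims to be, which is why a list of top IPs like the one Webalizer gives you is still worth checking.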

Now, I am not saying you should rely completely on a log analyzer like Webalizer either. It can’t show you things like bounce rate, time spent on page, browser stats (such as window size) and other vital marketing information.

I highly recommend a 3-pronged approach to basic web analytics:

Google Analytics – Great overall view of your website visitors. Can report vital marketing information and goal tracking.

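Webalizer – Awesome for getting to the guts of what is actually hitting your server, including the bots, hotlinks and direct file downloads that JavaScript tracking never sees.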

Google Webmaster Central – Excellent tool that Google put out which shows you exactly what their Googlebot is reporting back to them. It tells you if you have broken links, non-indexable content, unreachable content and tons of other great stuff.