The other day a non-technical friend of mine told me how his webhost shut him off because he was using too much bandwidth. I was pretty surprised because they allocate him hundreds of GB per month and his website does not get much traffic.
He was dumbfounded. He loaded up his Google Analytics account and showed me that he was only getting about 50-100 unique visitors a day.
But his webhost said that he burned through over 200GB in less than a week!
In my friend’s case, a 500MB video file on his site was being linked to from a very popular internet forum. Google Analytics only records a visit when its JavaScript runs inside a page, and a direct download of a video file never loads a page. Because his Google Analytics code wasn’t being executed, he had no idea these people were stealing his bandwidth.
So how did we figure it out? We processed his raw access_log files with webalizer. Most major webhosts have webalizer (or some variant) available, and in some ways it will show you MUCH more about what’s going on with your site than Google Analytics can.
We ran webalizer on the log file and in a matter of minutes we were able to see that the forum was the top referrer to his website… and we could also see that the file had been downloaded enough to use up all of his bandwidth. At 500MB a download, it only takes about 400 downloads to burn through 200GB.
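If you are curious what a log analyzer is doing under the hood, here is a minimal sketch in Python that totals bytes served per file and per referrer. It is not how webalizer itself is implemented. It assumes the log is in Apache's Combined Log Format, and the function name `bandwidth_report` is made up for illustration.

```python
import re
from collections import Counter

# Assumes Apache "combined" log lines:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def bandwidth_report(lines):
    """Return (bytes per requested path, bytes per referrer) as Counters."""
    by_path = Counter()
    by_referrer = Counter()
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # skip lines that do not look like combined-format entries
        # Apache writes "-" when no response body was sent
        size = 0 if m["bytes"] == "-" else int(m["bytes"])
        parts = m["request"].split()
        path = parts[1] if len(parts) >= 2 else m["request"]
        by_path[path] += size
        by_referrer[m["referrer"]] += size
    return by_path, by_referrer

# Example usage (the file name is an assumption; yours may differ):
# with open("access_log") as f:
#     by_path, by_referrer = bandwidth_report(f)
# print(by_path.most_common(10))
# print(by_referrer.most_common(10))
```

The top entries of each counter are roughly what we were looking at in webalizer's report: one huge file and one referrer accounting for most of the bytes.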
Let me show you another example… this one is a bit more practical, and I guarantee it affects everyone reading.
Webalizer shows me that the biggest user of bandwidth for all of ShoeMoney.com is an IP address that belongs to Yahoo!.
In fact 9 of the top 15 biggest bandwidth users for ShoeMoney.com are all Yahoo! IPs.
Yahoo bots use up more than 5% of the total bandwidth for ShoeMoney.com but bring in less than 1% of the traffic.
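You can run the same kind of comparison on your own logs. The sketch below is a rough approximation and not the way webalizer computes its numbers. It assumes Combined Log Format, assumes Yahoo's crawler can be spotted by "Slurp" in the user-agent string, and assumes visitors sent by Yahoo search show "search.yahoo.com" in the referrer. The function name and both marker strings are illustrative and can be swapped for any other crawler.

```python
import re

# Assumes Apache "combined" log lines:
# host ident user [time] "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def crawler_cost_vs_referrals(lines, agent_marker="Slurp",
                              referrer_marker="search.yahoo.com"):
    """Return (crawler share of bytes, referred share of requests)."""
    total_bytes = bot_bytes = 0
    total_hits = referred_hits = 0
    for line in lines:
        m = LOG_RE.match(line)
        if not m:
            continue  # ignore anything that is not a combined-format entry
        size = 0 if m["bytes"] == "-" else int(m["bytes"])
        total_bytes += size
        total_hits += 1
        if agent_marker in m["agent"]:
            bot_bytes += size       # bandwidth spent feeding the crawler
        elif referrer_marker in m["referrer"]:
            referred_hits += 1      # a request sent over by the search engine
    bandwidth_share = bot_bytes / total_bytes if total_bytes else 0.0
    referral_share = referred_hits / total_hits if total_hits else 0.0
    return bandwidth_share, referral_share
```

If the first number is much bigger than the second, that crawler is costing you more than it is sending back.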
Now I am not saying you should rely completely on a log analyzer like webalizer either. It can’t show you things like bounce rates, time spent on page, browser stats (such as window size) and other vital marketing information.
I highly recommend a three-pronged approach to basic web analytics:
Google Analytics – Great overall view of your website visitors. Can report vital marketing information and goal tracking.
Webalizer – Awesome for getting to the guts of what is actually hitting your server: raw bandwidth, referrers, bots and direct file downloads.
Google Webmaster Central – Excellent tool that Google put out which shows you exactly what Googlebot is reporting back to them. It tells you if you have broken links, non-indexable content, unreachable content and tons of other great stuff.