The other day a non-technical friend of mine told me how his webhost shut him off because he was using too much bandwidth. I was pretty surprised because they allocate him hundreds of GB per month and his website does not get much traffic.
He was dumbfounded. He loaded up his Google Analytics account and showed me that he was only getting about 50-100 unique visitors a day.
But his webhost said that he burned through over 200GB in less than a week!
I asked him if he used any other analytics package, and he did, but they were all JavaScript-based.
So here is something important that people need to understand. JavaScript tracking packages like Google Analytics see only a narrow slice of what is actually going on. They can only record actions from web browsers that actually run the JavaScript.
In my friend’s case he had a 500MB video file that was being linked to directly from a very popular internet forum. A direct download of the file never loads a page carrying his Google Analytics code, so the code was never executed and he had no idea these people were stealing his bandwidth.
So how did we figure it out? We processed his raw access_log files with Webalizer. Every major webhost is going to have Webalizer (or some variant) available and that will show you MUCH more (in some ways) about what’s going on with your site than Google Analytics can.
We ran Webalizer on the log file and in a matter of minutes we were able to see that the forum was the top referrer to his website… and we could also see that the file had been downloaded enough to use up all of his bandwidth (at 500MB a download, roughly 400 downloads add up to 200GB).
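For readers who want to see what a log analyzer is doing under the hood, here is a minimal Python sketch of the same idea: total up the bytes served per file and per referrer straight from the access log. It assumes the log is in Apache's combined format; the sample lines, IPs and forum URL are made up for illustration, and this is not how Webalizer itself is implemented.

```python
import re
from collections import Counter

# Apache combined log format (assumed):
# ip ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def bytes_by(lines, field):
    """Sum response bytes per value of `field` ('path', 'referer' or 'ip')."""
    totals = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in combined format
        size = match.group("size")
        totals[match.group(field)] += 0 if size == "-" else int(size)
    return totals

# Made-up sample lines standing in for a real access_log.
SAMPLE = [
    '203.0.113.5 - - [08/Mar/2010:06:52:00 -0600] "GET /video.flv HTTP/1.1" '
    '200 500000000 "http://forum.example/thread/1" "Mozilla/5.0"',
    '203.0.113.9 - - [08/Mar/2010:06:53:00 -0600] "GET /video.flv HTTP/1.1" '
    '200 500000000 "http://forum.example/thread/1" "Mozilla/5.0"',
    '198.51.100.7 - - [08/Mar/2010:06:54:00 -0600] "GET /index.html HTTP/1.1" '
    '200 12000 "-" "Mozilla/5.0"',
]

# Show the biggest bandwidth consumers by file and by referrer.
for field in ("path", "referer"):
    print(field, bytes_by(SAMPLE, field).most_common(3))
```

To try it on a real log, pass an open file instead of `SAMPLE`, for example `bytes_by(open("access_log"), "referer")`, where the file name depends on your host.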
Let me show you another example… this one is a bit more practical, and I guarantee it affects everyone reading.
Webalizer shows me that the biggest user of bandwidth for all of ShoeMoney.com is a single IP address belonging to Yahoo!.

In fact 9 of the top 15 biggest bandwidth users for ShoeMoney.com are all Yahoo! IPs.
Interesting sidenote:
Yahoo bots use up more than 5% of the total bandwidth for ShoeMoney.com but bring in less than 1% of the traffic.
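A percentage like that is just each client's bytes divided by the total. Here is a small sketch of the arithmetic in Python; the IPs and byte counts are hypothetical placeholders, not real ShoeMoney.com numbers, and `bandwidth_share` is a name invented for this example.

```python
from collections import Counter

def bandwidth_share(records):
    """Return {ip: percent of total bytes}, given (ip, bytes_sent) pairs."""
    totals = Counter()
    for ip, sent in records:
        totals[ip] += sent
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {ip: 100.0 * sent / grand_total for ip, sent in totals.items()}

# Hypothetical (ip, bytes_sent) pairs, e.g. pulled out of an access log.
records = [
    ("192.0.2.10", 600),      # pretend this is a crawler
    ("192.0.2.10", 400),
    ("198.51.100.1", 9000),
    ("203.0.113.4", 10000),
]

shares = bandwidth_share(records)
# List clients from the biggest bandwidth share to the smallest.
for ip, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{ip:15} {pct:5.1f}%")
```

Comparing a bot's share here against the share of visitors it sends you is what tells you whether it is worth the bandwidth.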
Now I am not saying you should rely completely on a log analyzer like Webalizer either. It can’t show you things like bounce rates, time spent on page, browser stats (size of window) and other vital marketing information.
I highly recommend a three-pronged approach to basic web analytics:
Google Analytics – Great overall view of your website visitors. Can report vital marketing information and goal tracking.
Webalizer – Awesome for seeing what actually hits your server: bots, hotlinked files and who is using your bandwidth.
Google Webmaster Central – Excellent tool that Google put out which shows you exactly what their Google Bot is reporting back to them. It tells you if you have broken links, non-indexable content, unreachable content and tons of other great stuff.







March 8, 2010 at 6:52 am
I’m a big fan of Google Analytics just because of certain features it offers, as well as the console setup. I feel it works both for your first-time user and for the user who is looking for key data in fields you may not know of going in as a rookie. I’ll probably have to check out Webalizer to see what great goodies it can offer me to better utilize my data. Thanks for the 3 pronger, I’ll just make sure to point it away from my eyes.
March 8, 2010 at 8:28 am
Those are great tips and a gentle reminder to watch your stats – another nice, economical real time tool is Clicky – great for blogs.
March 8, 2010 at 8:29 am
Useful information, I didn’t know about Webalizer until now, or that search engine bots use so much bandwidth.
March 8, 2010 at 8:51 am
your link to google analytics is broken, it’s missing the http://
March 8, 2010 at 9:07 am
Google Webmaster Tools are amazing! It always helps me to stumble upon an interesting niche when a longtail post comes up as a high ranking.
March 8, 2010 at 9:19 am
That’s the same approach I use. The real value of Webalizer is the ability to see bots hitting your server. Analytics doesn’t tend to pick up bots because they don’t run JavaScript. I’m a little old school, but I also like to store stats via PHP in a database, so I know total number of page loads, etc.
March 8, 2010 at 9:38 am
The moral of the story is…don’t host massive video files on non-premium hosting :.)
March 8, 2010 at 9:39 am
Don’t forget about server logs either. You’ll find some interesting data there if you dig a little bit.
March 8, 2010 at 3:22 pm
That’s basically what Webalizer does – it parses logs and builds graphs – they’re extremely detailed…
March 8, 2010 at 9:40 am
I’ve seen these screenshots on many places but I never actually knew what it really was (webalizer)
I use Google Analytics, but I realize it is not enough because most of the time my Adsense impressions and Google Analytics impressions don’t match.
This post really goes to my bookmarks right away
March 8, 2010 at 10:06 am
Hey Shoe, great article!
Let me know, are you considering dropping the Yahoo bot?
Also, with regards to the screenshot, all I see is IPs. Do you simply run those IPs through a checker to see the referring website?
I just can’t seem to analyse that data for my own benefit.
All the best,
Lou Sparx
March 8, 2010 at 10:11 am
What????
March 8, 2010 at 10:16 am
I always put the video files in Amazon S3 and create a CloudFront distribution. CloudBerry Explorer is an excellent free tool to manage your S3 buckets and objects (files and folders) and to create CloudFront distributions. You can also use a CNAME with your own domain to hide the Amazon CloudFront URL.
March 8, 2010 at 10:43 am
Thanks for the heads up on Webalizer
Jeremy. Checking it out as we speak.
March 8, 2010 at 11:12 am
I totally overlooked Google Webmaster Central as a stat tool. Right now I’m using G.Analytics and will definitely take a closer look at webalizer. Thanks
March 8, 2010 at 12:55 pm
Thanks for the tips shoe. So if that video file being linked was, say, 5MB instead of 500MB, would that have affected the bandwidth allotment much?
March 8, 2010 at 1:49 pm
Yeah you definitely have to run more than one stats package on your site.
The javascript methods are better at tracking pageviews, uniques, geo stuff, etc.
The log based ones are better at tracking system resource usage.
I think most good publishers are probably running 3 or more stats tracking systems.
It’s also good because you can identify discrepancies between the different tools and try to figure out which is more accurate.
March 8, 2010 at 2:14 pm
Google analytics is insane. It has a huge number of features. It boggles my brain to be honest.
However, the one real detractor (for me at least) is that I found that it dropped quite a significant chunk of traffic referral data.
March 8, 2010 at 2:17 pm
Wow, interesting. I usually just use AWStats to check stuff, but that goes through the log files too, I presume, so there isn’t a need to use Webalizer.
March 8, 2010 at 3:23 pm
Yeah – AWStats is basically the same thing. I think the interface looks better, but Webalizer seems to have better information.
March 8, 2010 at 2:36 pm
I think google analytics has amazing depth of functionality. The only problem that I found with it when I used to use it a lot, was that it was missing big chunks of traffic referrer data.
March 8, 2010 at 3:47 pm
Great article! Thanks for the tips shoe
March 8, 2010 at 6:20 pm
Most people take analytics stats as gospel and don’t bother checking raw log files.
March 8, 2010 at 10:11 pm
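You can stop hotlinking by putting this code in your .htaccess file. This example covers video and image files.
RewriteEngine on
# Allow requests that send no referer at all
RewriteCond %{HTTP_REFERER} !^$
# Allow requests referred by your own site
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yoursite\.com(/)?.*$ [NC]
# Any other request for these file types gets 403 Forbidden
RewriteRule \.(flv|swf|png|bmp)$ - [F]
Visitors get a 403 error if the request is referred from any site other than yoursite.com.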
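If the two conditions are hard to read, here is a rough Python sketch of the same decision, handy for checking which requests the rule would refuse before you deploy it. `is_blocked` is a made-up helper name, yoursite.com is the placeholder domain from the rule, and the dot in the domain is escaped here. This only mimics the logic; it is not how Apache evaluates rewrite rules internally.

```python
import re

# Referers that count as "your own site" (case-insensitive, like [NC]).
OWN_SITE = re.compile(r'^http://(www\.)?yoursite\.com(/)?.*$', re.IGNORECASE)
# File types the RewriteRule protects.
PROTECTED = re.compile(r'\.(flv|swf|png|bmp)$')

def is_blocked(path, referer):
    """Return True if this request would receive the 403."""
    if not PROTECTED.search(path):
        return False  # rule pattern does not match this file type
    if referer == "":
        return False  # first condition: an empty referer is allowed
    return OWN_SITE.match(referer) is None  # second condition

print(is_blocked("/clip.flv", "http://forum.example/t/1"))      # hotlink
print(is_blocked("/clip.flv", "http://www.yoursite.com/page"))  # own page
print(is_blocked("/clip.flv", ""))                              # no referer
```

Two things the sketch makes easy to spot: a referer such as `http://yoursite.com.evil.example/` still passes because the pattern does not anchor the end of the hostname, and referers starting with `https://` are treated as foreign because the pattern only allows `http://`.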
March 8, 2010 at 10:18 pm
Enlightening post as usual, Shoe. I never bother to check my Webalizer, never really know what’s the use. But, now I do. So, thanks.
March 9, 2010 at 12:46 am
About that video thing; it might also be done purposely to drive the owner of the video out of business especially if he is being charged on a per bandwidth usage basis.
March 9, 2010 at 4:02 am
So in short, if your bandwidth and traffic don’t match, there is some issue?
March 9, 2010 at 6:19 am
I’m getting Webalizer downloaded right now! Thanks for the info.
March 9, 2010 at 9:06 am
Wow, that means we should look at our logs to see what exactly happens. Great post!
March 9, 2010 at 9:34 am
Personally, I use free services such as statcounter.com and sitemeter.com. I find Google Analytics time consuming: I have to make too many clicks to see the data I’m after.
And the time I spend with analytics can hurt my bottom line.
March 9, 2010 at 12:49 pm
Yeah, nice post, but how can I see Webalizer reports for all my domains at once? That’s where Mr Google wins all the time, they make things easier.
I manage 50+ sites, and going into each site’s backend is a chore. I only do that when the alarm bells ring.
And as someone else mentioned, who in their right mind hosts videos on a low-end server anyway?
March 11, 2010 at 1:45 pm
Google Analytics, hmmm, I’ll try it!
March 14, 2010 at 4:35 pm
I thought that Google Analytics would be more comprehensive. Thank you for this article. I’ll follow your tips.
March 16, 2010 at 7:13 am
Lol, I would have never figured that out. Yeah, it’s always good to check out your stats and analytics to see where your traffic is coming from. They were stealing his bandwidth, huh? Good thing you guys figured it out!
March 17, 2010 at 2:37 pm
Hi, I am new to blogging, thanks for these good tips. Anyway, I signed up for Google Analytics, but somehow it doesn’t work. I have configured the Google Analytics plugin correctly; I wonder if this happens to anyone else?
March 23, 2010 at 1:57 pm
This is really great info! Thanks for it!
I DO find GA sorely lacking, so now I have some new tools to look at.
Awesome!
March 25, 2010 at 8:14 pm
Interesting and very useful information. I don’t have a site, but I’m planning to get one pretty soon. I didn’t know Yahoo bots use up a lot of site bandwidth. Is there any other site that does that?
March 28, 2010 at 12:08 am
Hey wow! I’ve never really looked at my analytics information that way before. You really opened my eyes on this one. I’m gonna look at my analytics information more carefully from now on.
March 28, 2010 at 3:01 pm
Webalizer can also be used to extract keywords that Google Analytics doesn’t see, especially from internal search engines from Niche sites. Commonly overlooked, but powerful tool when you learn how to use it.
March 29, 2010 at 7:44 am
You know, I really think my colleagues over at Artfire.com could benefit from this information, so I’m going to post a link to this entry on our forums.
Heck, the people BEHIND Artfire might even find it useful.
Thanks again for such great info!
April 3, 2010 at 12:35 am
Hi, shoe:
Thanks for the information.
I don’t know why I cannot verify my Google Webmaster Central account. I did it according to the instructions from Google itself.
I want to use Webalizer as well. Would you please kindly leave me a detailed url which I can download directly? I believe it could help many others who are like me—non-technical bloggers?
thanks.
April 16, 2010 at 12:56 pm
Just wanted to say that you guys should try out Reinvigorate and Woopra. I have been a beta tester for both these tracking systems..and both are very good real time systems. Reinvigorate is still in beta, but Woopra is out. Woopra is THE most detailed and innovative web tracking system I have ever seen..I highly recommend you try the free package.
David.
June 2, 2010 at 11:00 am
I am happy to have found this web page. Keep up the good postings.