As most of you know a few months back my site was hacked. What many people dont know is that was actually the first of 2 times the box was hacked. The first time the box was hacked I had made the mistake of making the web files on the server writeable by the web server. Again being this server (that my blog sits on) is not used for hardly any commercial activity I was a lot less security focus then something I would call “production” ready. I implemented mod_security and some other logging tools aswell as offloaded the server logs to a different server (yea the logs were owned by the apache user also).
So basically when I got owned the person found a file on my server that was web accessible which then he could execute commands on behalf of the web user. Now because the files and log files were owned by this user he could write to them and even delete them. Lucky for me this guy just wanted to put up his Turkish political statement and try to infect his virus to people. So all he did was do a search on the box for any index.* files and copied his index file to over write them. Then he also deleted all files matching *log. So it was pretty obvious how the person did it but I was not sure what file was the hole in my system. This is the point where you have to weigh catching the hacker vs running a box that has been compromised. Since I really only have blogs and a few low traffic forums running on this box I thought it would be a good chance to see what was vulnerable.
So I installed mod_security and ran it pretty hardcore. Over the next couple weeks I learned more about adjusting its rulesets to allow possibly exploitable code but log it. Nothing happened for many weeks then one morning I got a page that my box was not responding. I quickly attached to my remote server via its DRAC card (Dell Remote Access). The DRAC card lets me take control of the server as if I was sitting right infront of it. I could see the box was sitting in a “kernel panic” mode and that it had crashed. I rebooted the box remotely but kept most services down so I could investigate what had happened.
Sure enough I figured out that the hacker had been back and downloaded some files to the /tmp directory (which was world writeable). Only this time I had changed ownership of all index.* files so they could not write to them. I guess they realized that in order to take over my web server he was going to need to be a bit more aggressive so he downloaded a rootkit to my tmp directory then tried to run it but fortunately for me that made the kernel panic and the server was in a frozen crashed state.
I was able to figure this out and also exactly what file they used to execute commands on my box very quickly because it was pretty much the last thing in the weblogs before the box crashed. (yay!)
So now here is where it gets interesting…. Now that I had figured out how the person was hacking into my box I was curious how in the hell the person found the file. It was in a subdirectory that I had not used in YEARS. There was no link to it from anywhere on my site. The directory structure it was in was like … html/oldforums/oldstuff/badfile.php . How in the hell did this person find this file? Well after going through the logs greping for the ip range that hacked my box I found that the person found my site from Google! Specifically using Google code search. Now while this was interesting it still did not explain how the page was even indexed…. ohh wait I use Google Sitemaps and I had it on to index everything (the default setting) OUPS!!
Now to be honest… this is my fault. I in no way blame Google what so ever. I had old exploitable code on my server and I told sitemaps to index it so… my fault.
I have since been working with the sitemaps team and I had some suggestions to leave some files off by default (like .inc .func) or only allow common web files with extensions like .php .html .asp etc… I hope they do this cause as sitemaps gets more popular its only going to expose more idiot webmasters like me that run with the default settings.
Ok so just for shits I thought I would do some querys on Google Code Search to see what kind of exploits I could find. Now keep in mind this probably will not show your site but it will show code and versions that you might be running… so once someone locates a exploitable version of code they then could just search for “Powered By X” or whatever fingerprint you could put on the exploitable program/version.
Hmm I wonder If we could find some xss exploits…
How About some SQL Injection exploits?
hrmm I wonder how easy it is to find host,user,pass for mysql databases…. Lets try:
100 results found.
This query might be a little puzzling for those that are not Google ninjas like me so.. I will explain. Basically we are checking for anything that ends in .php extension. Then we search the file for mysql_connect. If it contains Mysql we look for the pattern of a connection string. lastly we use the minus sign to get rid of all localhost databases (cause we cant access them).
So did we find anything interesting? Well…
Lets just look at the first 10 results:
www.ubio.org/downloads/XID.TAR.gz – Unknown License – PHP
$connection = mysql_connect(“RANSOM”,”GlobalWebUser”,”goober8″) or die(“Couldn’t connect.”);
$db_name = “dwf”;
Now in this case RANSOM is probably a local box…
ohh whats this:
$f = mysql_connect(“zeus.mbl.edu”,”tns”,”");
if (empty($limit)) $limit=50;
I can post tons of other examples but I think I have made my point. Watch your logs for people coming from google code search and always make sure your running the latest version of your software.
Also keep in mind my searchers were only looking for .php files. This is a small percentage of all the different languages and filetypes out there.
Be scared. Be very scared.