Cash Volume - Turn Up The Cash Volume!

Adventures In Online Financial Freedom

Bad Google Bot Bad!

February 5th, 2008 by cashvolume

Last night I implemented a bot trap that I coded up in PHP on one of my sites. It's just a little bit of code saved in a page that's linked from the index.php of my website. I used robots.txt to disallow access for bots that follow the rules, and I left the link's anchor text empty so that it would be invisible to human eyes. The bot trap records any access to the file, emails me the visitor's IP address, and then automatically appends a deny from rule for that IP to the .htaccess file, which blocks any further access from that address. Be careful: this works really well. I actually ended up blocking my own IP address while testing the system and had to manually edit the .htaccess file via FTP to unblock myself.
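To make the blocking step concrete, here is a minimal sketch of the idea in Python. It is not the PHP script itself, just an illustration under a few assumptions: the function name `block_ip` and the `allowlist` parameter are made up for this example, the email notification is left out, and the `deny from` rule uses the Apache 2.2-style syntax described above. The allowlist is one possible way to avoid locking yourself out while testing.

```python
import ipaddress
from pathlib import Path


def block_ip(ip, htaccess_path, allowlist=()):
    """Append 'deny from <ip>' to an .htaccess file.

    Returns True if a new rule was written, False if the address is
    allowlisted or already blocked. (Hypothetical helper, for illustration.)
    """
    # Validate and normalise the address; raises ValueError on bad input.
    addr = str(ipaddress.ip_address(ip))

    # Never block trusted addresses, e.g. your own IP while testing.
    if addr in {str(ipaddress.ip_address(a)) for a in allowlist}:
        return False

    path = Path(htaccess_path)
    existing = path.read_text() if path.exists() else ""
    rule = f"deny from {addr}"

    # Skip the write if this exact rule is already in the file.
    if rule in existing.splitlines():
        return False

    # Make sure the new rule starts on its own line.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with path.open("a") as fh:
        fh.write(f"{prefix}{rule}\n")
    return True
```

Calling `block_ip("66.249.70.4", ".htaccess")` a second time does nothing, so repeat visits from the same bot don't pile up duplicate rules.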

This morning I got up and, first thing, checked my email. Amongst a bunch of spam I had received a friendly message from my bot trap saying that it had recorded and blocked access from the IP address 66.249.70.4. Success, I had caught my first bad bot! Now for some investigation work. It turns out the IP address is owned by Google.

Why on earth would a Google bot totally ignore my robots.txt file? Is there a rogue Bad Google Bot on the loose? I'm not running AdSense or any AdWords campaigns on the site that's running the bot trap. Until I can find out why this particular bot is ignoring my robots.txt file, it will remain blocked.
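One way to double-check that a visitor really is a Google crawler is the reverse-then-forward DNS check that Google documents: the IP should reverse-resolve to a hostname under googlebot.com or google.com, and that hostname should resolve back to the same IP. The Python sketch below shows the idea. The function name `is_google_crawler` and the injectable `reverse`/`forward` parameters are assumptions made for this example (they make it easy to try out without network access), and the check only confirms who the visitor is, not why it ignored robots.txt.

```python
import socket


def is_google_crawler(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if `ip` reverse-resolves to a Google hostname
    that in turn resolves back to `ip`. (Illustrative helper.)"""
    try:
        hostname = reverse(ip)[0]
    except OSError:
        return False  # no PTR record, or the lookup failed

    # Only accept hostnames that sit under these domains.
    host = hostname.lower().rstrip(".")
    if not host.endswith((".googlebot.com", ".google.com")):
        return False

    try:
        addresses = forward(hostname)[2]
    except OSError:
        return False

    # The forward lookup must include the original address;
    # otherwise the reverse record could have been spoofed.
    return ip in addresses
```

With the defaults, `is_google_crawler("66.249.70.4")` performs live DNS lookups, so the result depends on the network it runs on.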

Below is the code that I whipped up for my bot trap. Just right-click and save it into a file named something like badbot.php.

  1. Download badbot.txt
  2. Then add a hidden link to the trap page on your site: make a normal link but leave the anchor text blank (I put mine in the footer), and save it.
  3. Next, add this line to your robots.txt: Disallow: /badbot.php
    Here is what mine looks like:
User-agent: *
Disallow: /badbot.php
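
Before blaming a bot, it can be worth confirming that the rules say what you think they say. As a quick sanity check, the Python sketch below feeds the two robots.txt lines above to the standard library's `urllib.robotparser` and asks whether a given user agent may fetch a path. The helper name `is_allowed` and the "Googlebot" user-agent string are just choices for this example.

```python
from urllib.robotparser import RobotFileParser

# The same two rules shown above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /badbot.php
"""


def is_allowed(robots_txt, user_agent, path):
    """Return True if `user_agent` may fetch `path` under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # A rule-following bot should stay away from the trap page...
    print(is_allowed(ROBOTS_TXT, "Googlebot", "/badbot.php"))
    # ...but is still free to crawl the rest of the site.
    print(is_allowed(ROBOTS_TXT, "Googlebot", "/index.php"))
```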

If you have any problems with the script please leave a comment.


Posted in tools, PHP
