UPDATED: Per many requests, we’ve added a graph of one client’s legitimate and malicious web traffic.
UPDATED: Many people in the IT community know the limits of robots.txt. But given the sheer number of companies, big and small, we’ve spoken with who thought robots.txt protected them from malicious robots, we thought we’d write this post. Spread the word and leave a comment — we’d love to hear what you’re doing to protect your content.
I lock my car doors. I think everyone does and the reason is simple – no one wants his or her stuff stolen. If you leave your car unlocked, anyone could walk by, see your cool new GPS, open your door and take it.
But what if you left a note saying “I know the doors are open, but please don’t take my GPS”? It might work for all good, law-abiding citizens – but the bad guys of the world? Chances are, it’s not going to stop them at all.
Welcome to robots.txt
Since 1994, webmasters have been creating “robots.txt” files and using them as that proverbial “please don’t steal” note. The idea behind robots.txt is simple – it implements the Robots Exclusion Protocol, which is supposed to stop bots, web crawlers, and search engines from indexing areas of your website you don’t want showing up in search results. It’s a great concept, and the good bots like Googlebot, Bingbot, and Yahoo!’s Slurp all follow the rules you specify in your robots.txt file. They’re considered the law-abiding citizens of the Internet and treat what your robots.txt file says as though it were law.
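To make the protocol concrete, here’s a minimal sketch of how a well-behaved crawler consults robots.txt before fetching a page, using Python’s standard `urllib.robotparser`. The rules below are a hypothetical example, not any real site’s file:

```python
# A *polite* bot checks robots.txt before crawling a URL.
# A malicious bot simply skips this step entirely.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: keep all crawlers out of /private/
rules = """User-agent: *
Disallow: /private/
Allow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The disallowed area is off-limits to compliant crawlers...
print(parser.can_fetch("Googlebot", "https://example.com/private/data"))  # False
# ...while the rest of the site is fair game.
print(parser.can_fetch("Googlebot", "https://example.com/blog/post"))     # True
```

Note that nothing here is enforcement — the “check” is entirely voluntary, which is exactly the problem this post is about.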
Robots.txt is Not Enough
The problem is that these aren’t the only bots on the web. There’s a whole other set of bots that will never even look at your robots.txt rules and will burn right through your website stealing your content, data, user info, and more. Often these bots are simply operated by people who see something they want and take it. In real life, it’s the GPS you spent your hard-earned money on. Online, it’s the content that drives people to your website and makes it stand out.
Unfortunately, there are still thousands of websites and webmasters who believe robots.txt is enough to stop bots from crawling and stealing their content, data, and user info.
Find Something That Works
What if you could find a solution or process to mitigate those malicious bots as well? Here’s a quick graph of one of our clients’ traffic, before and after using our service to block bots and web scrapers.
NOTE: They had experienced 3 straight years of decline in legitimate traffic prior to the data on this graph.
By Andrew Stein