(Bad Bots) If You Build It, They Will Come

March 15, 2017 Peter Zavlaris

When it comes to websites, bad bots have a type. If your website has one or more of these four attributes: a login, payment processor, proprietary content/pricing information, or web forms—you have a bad bot problem.

Now in its fourth edition, the Distil Networks Bad Bot Report 2017 is the world’s most comprehensive study of bad bots.

Over the course of 2016, Distil Networks’ data science and security analyst teams reviewed hundreds of billions of bad bot requests across thousands of domains on the Distil network from all over the world.

Some of the key discoveries are:

  • 97 out of 100 sites were hit with scrapers
  • 96 out of 100 login pages were attacked by bad bots
  • 90 out of 100 accounts (behind the login) were hit
  • 31 out of 100 sites with web forms had spammer bots

The data also showed that over the course of 2016 bad bots made up 19.9% of all web traffic, which is a 6.98% increase since 2015.

If you’re one of the world’s top 10,000 sites (as ranked by Alexa.org), you’ve had a 36.65% increase in bad bot traffic since 2015. What’s more, you won’t see 75% of bad bots coming—even if you have a WAF—until it’s too late. Find out why...

In the 2017 Bad Bot Report you will learn:

  • More on the four bad bot attracting attributes and how these attributes tie in with OWASP automated threats you need to know about
  • How you can tell if bad bots are on your site
  • How bad bots are claiming to be mobile
  • How to shrink your attack surface

Here’s a preview of the Bad Bot Report:

Investigating Surprise Attacks at the Application Layer

The 2017 Bad Bot Report investigates the daily surprise attacks sneaking under sensors and wreaking havoc on websites. This report is based on 2016 data collected from Distil Networks’ global network, and includes hundreds of billions of bad bot requests, anonymized over thousands of domains. The goal is to offer those on the frontlines of website security guidance about the nature and impact of automated threats.

What makes this report unique is the focus on bad bot activity at the application layer (layer 7 of the OSI model). Automated application layer attacks differ from volumetric DDoS attacks, which manipulate lower-level network protocols (a SYN flood is one example). Bad bots interact with applications in the same way a legitimate user would, making them harder to prevent. Bots enable high-speed abuse, misuse, and attacks on websites and APIs. They enable attackers, unsavory competitors, and fraudsters to perform a wide array of malicious activities. This includes web scraping, competitive data mining, personal and financial data harvesting, brute-force login and man-in-the-middle attacks, digital ad fraud, spam, transaction fraud, and more.

2016: The Year Bad Bots Appear Before Congress

The bad bot problem has become so rampant it’s earned its first piece of federal legislation. In an attempt to make the use of ticket scraping bots illegal, the US Congress passed the Better Online Ticket Sales Act (BOTS) in September 2016. Its purpose was to prohibit the use and selling of software that circumvents security measures on ticket seller websites. It also prohibits selling any ticket in interstate commerce that was knowingly obtained in violation of the prohibition. While legislation is a welcome deterrent, scraping is a technical problem, and it’s difficult to legislate against those you can’t identify.

Businesses hire bad bot creators to price scrape competitor websites, capture findings from consumer monitoring and opinion gathering sites, and scrape contact information from consumers to whom they wish to market. Developers that create sophisticated scraping bots can earn as much as $128,000 per year. Renting bots-for-hire can cost as little as $3.33 per hour.

After a year of record-breaking DDoS attacks from weaponized IoT devices, congressional hearings on anti-scraping legislation, and increased bot activity, it’s clear that bad bots are here to stay.

Executive Summary of Findings

Bigger Site, Bigger Target

Bad bots made up 20% of all web traffic and are everywhere, at all times—they don’t take breaks and they don’t sleep. Even though bad bots are on all sites, larger sites were hit the hardest in 2016. Bad bots accounted for 21.83% of large website web traffic, which saw an increase of 36.43% since 2015.

Bad Bots Tell Alternative Facts

Bad bots lie. 75.9% claimed to be the most popular browsers: Chrome, Safari, Internet Explorer, and Firefox. There was also a 42.78% year-over-year increase in bad bots claiming to be mobile browsers.

The Weaponization of the Data Center

Data centers were the weapon of choice for bad bots with 60.1% coming from the cloud. Amazon AWS was the top originating ISP for the third year in a row with 16.37% of all bad bot traffic, four times more than the next ISP (OVH SAS).

Bad Bots Go Mobile

Looking to scrape a competitor’s site? There may be an app for that. In 2016, 16.1% of bad bots self-reported as mobile users. Mobile ISPs accounted for 9.4% of bad bot traffic. For the first time Mobile Safari made the top five list of self-reported user agents, outranking Web Safari by 17%. This was the first time Mobile Safari outranked Web Safari in terms of bad bot traffic.

Some Bad Bots Partying Like It’s 1999

We humans aren’t the only ones falling behind on software updates; it turns out bad bots have the same problem. One in every ten (9.45%) bad bots said they were using browser versions released before 2013. Some bad bots were reporting browser versions released as far back as 1999.

USA is the Only Bot Superpower, China and Russia are the Most Blocked

More bad bots claimed to be American than all other nationalities combined. Over half of bad bots (55.4%) were hiding in plain sight within American data centers. China reached the top three for bad bots for the first time, and along with Russia they were the two most blocked countries by websites.

USA Only Fifth in Bad Bot GDP (Bad Bots per Online User)

Dominica, Netherlands, Seychelles, and Iceland all had higher bad bot GDPs than the US. The Caribbean island of Dominica had the highest number of bad bots per online user, double its nearest rival. USA was only fifth highest, behind Iceland.

If You Build It, They Will Come

When it comes to the attractiveness of a website, bad bots have a type. There are four key website features bad bots look for: proprietary content and/or pricing information, a login section, web forms, and payment processing.

97% of sites with proprietary content and/or pricing were hit by unwanted scraping. 96% of websites with login pages were hit by bad bots. 90% of websites were hit by bad bots that bypassed the login page. 31% of websites with forms were hit by spam bots.

Advanced Persistent Bots (APBs)

Today’s advanced persistent bots are sophisticated in that they can load JavaScript, hold onto cookies, and load up external resources, and persistent, in that they can randomize their IP address, headers, and user agents. In 2016, 75% of bad bots were Advanced Persistent Bots.

Telltale Signs Bots Are On Your Website

You can tell bad bots are on your site when unexpected spikes in traffic cause slowdowns and downtime. In 2016, a third (32.36%) of sites had bad bot traffic spikes of 3x the mean, and averaged 16 such spikes per year.
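To make the "3x the mean" idea concrete, here is a minimal Python sketch of how such spikes could be flagged in your own traffic logs. The function name, the hourly bucketing, and the sample numbers are all assumptions for illustration; the report does not describe how Distil computes its spike figures.

```python
from statistics import mean


def find_spikes(requests_per_hour, factor=3.0):
    """Return (index, count) pairs whose count is at least factor * mean.

    requests_per_hour: list of request counts, one per time bucket.
    factor: multiple of the mean that counts as a spike (3x by default).
    """
    if not requests_per_hour:
        return []
    threshold = factor * mean(requests_per_hour)
    return [(i, n) for i, n in enumerate(requests_per_hour) if n >= threshold]


# Hypothetical hourly request counts with one obvious burst.
hourly = [100, 120, 90, 110, 900, 95, 105, 100]
print(find_spikes(hourly))
```

Note that a large burst also inflates the mean it is compared against, so a median or a trailing average may be a steadier baseline in practice.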

You’ll know bad bots are a problem when your site’s SEO rankings plummet due to price scraping and your ad spend is misdirected as a result of skewed analytics. 93.9% of sites were visited by bad bots that trigger marketing analytics trackers and performance measuring tools.

Because of bad bots, your company will have a plethora of chargebacks to resolve with your bank due to fraudulent transactions. You’ll see high numbers of failed login attempts and increased customer complaints regarding account lockouts.
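One way to watch for the failed-login symptom is to count failures per account inside a sliding time window. The sketch below is only an illustration: the class name, the thresholds, and the choice to key on the account rather than the IP address are assumptions, not something taken from the report.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Track failed logins per account within a sliding time window."""

    def __init__(self, max_failures=5, window_seconds=60):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # account -> failure timestamps

    def record_failure(self, account, timestamp):
        """Record a failure; return True if the account exceeds the limit."""
        recent = self._failures[account]
        recent.append(timestamp)
        # Drop failures that have aged out of the window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_failures


# Hypothetical usage: timestamps are seconds since some starting point.
monitor = FailedLoginMonitor(max_failures=3, window_seconds=60)
for t in (0, 1, 2, 3):
    suspicious = monitor.record_failure("alice@example.com", t)
print(suspicious)
```

Keying on the account means the count still climbs when the attempts arrive from many different IP addresses.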

Bad bots will leave fake posts, malicious backlinks, and competitor ads in your forums and customer review sections. 31.1% of sites were hit with bots spamming their web forms.

WAFs Are No Match for Advanced Persistent Bots

If you’re using a web application firewall (WAF) and are filtering out known violator user agents and IP addresses, that’s a good start. However, bad bots rotate through IPs, and cycle through user agents to evade these WAF filters.
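A small Python sketch shows why static filtering falls short. The blocklists, IP addresses (taken from reserved documentation ranges), and user agent strings below are made up for illustration and do not represent any particular WAF product.

```python
# Hypothetical static blocklists of known violators.
BLOCKED_IPS = {"203.0.113.7"}
BLOCKED_AGENT_SUBSTRINGS = ("python-requests", "curl")


def is_blocked(ip, user_agent):
    """Return True if the IP or user agent matches a static blocklist."""
    ua = user_agent.lower()
    return ip in BLOCKED_IPS or any(s in ua for s in BLOCKED_AGENT_SUBSTRINGS)


BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"
)

# First request: known bad IP and a tool-like user agent, so it is caught.
print(is_blocked("203.0.113.7", "python-requests/2.13.0"))
# Same bot after rotating to a fresh IP and claiming to be Chrome: it passes.
print(is_blocked("198.51.100.23", BROWSER_UA))
```

The filter only knows what it has already seen, so each rotation resets the defender to zero.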

You’ll need a way to differentiate humans from bad bots using headless browsers, browser automation tools, and man-in-the-browser malware. 52.05% of bad bots load and execute JavaScript, meaning they have a JavaScript engine installed.


About the Author

Peter Zavlaris

Peter Zavlaris weighs in on various topics around bot mitigation and bot defense, sharing white papers, videos, and other resources on the topic.
