But even the 'good' ones can cost marketers plenty
Illustration: Shaw Nielsen
It's been estimated that anywhere from 56 percent to 62 percent of Internet traffic is automated, meaning that ads routinely go unseen by actual people at a startling rate. But what has not been revealed until now is that a significant number of bots—or computerized drivers of traffic—do not set out to pillage brands.
"Some of them are meant to defraud advertising, but the vast majority aren't," said Ben Trenda, CEO of Are You a Human, a digital security company. "It's the untold story of the good bots."
While "good bots" do not aim to rack up impressions through sophisticated digital networks, as more nefarious operators do, they often rack them up anyway by creating chargeable views that even the most vigilant publishers fail to flag. These bots are run by countless rogue, hard-to-locate players with a single bad intention. Typically they are bloggers and Internet mercenaries who seek traffic by any means, and so devise bots that employ script-based systems with names like PhantomJS, camelcamelcamel.com and Confick to steal real estate prices, weather forecasts, stock prices and other content.
...Yet even with firewalls in place to stop them, bots continue to vex publishers, according to experts, because once identified and blocked they are adept at quickly establishing new online identities and regaining access. Per Distil Networks' recent findings, 41 percent of bots attempting to enter a website's infrastructure are disguised as human traffic and 23 percent are immune to detection.
"It's a cat-and-mouse game," said Harvard Business School professor Ben Edelman. "What if they come from a different IP address almost every time? Sites like Weather.com and eBay are the most distinctively hard hit, but that doesn't mean everyone isn't hit just a little bit."
Conversely, there are bots that publishers condone, namely Google's. Publishers want the search giant's bot to scrape content so it will show up in a user's search results. Another well-regarded bot is the Internet Archive's, which records content for posterity.