Scraping Just Got a Lot More Dangerous

March 12, 2015 Rami Essaid

News organizations have been fighting to stay alive for years; now they have a paved road to profitability. A federal court just severely restricted fair use by upholding the NY Times and the AP’s copyright claim against Meltwater, a web scraping service that finds mentions of its clients in the news. This ruling means that news organizations, and any other content producer, can monetize the use of their content by third parties that previously scraped it freely. This isn’t just about syndication of your content anymore; there are a thousand other ways that companies profit from the content a publisher creates, and now the publisher has the right to share in those profits.

Why is this important? For years, everyone on the internet has been under the assumption that when something is posted online, it’s free and fair to use. That meant that despite all the hard work and effort that went into writing an online article, nobody respected the value of that article, until now. The court ruled that a web scraper that monetizes someone else’s content is not entitled to fair use and is in essence “stealing.”

Wait. Isn’t Google a web scraper? Well, yes. But the difference is that for a search of “The New York Times”, 56% of people who see an excerpt on Google click through to the article, as opposed to 0.08% for Meltwater, because Google has established a reputation for correctly giving credit to articles and web content, unlike a lesser-known site. That is the distinction that separates theft from search engines. It is a slightly blurry line, but I believe it will become clearer as more organizations start enforcing their rights.

So moving forward, any online publisher can and should:

  1. Monitor their site for content scrapers, either by examining their log files manually or by using Distil Networks in monitor-only mode.
  2. Go after any infringing scrapers to protect their copyright.
  3. Set up a monetization policy, and perhaps build an API to sell access to their content to scrapers that need continued access to this data.
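
For the manual log review in step 1, a rough starting point is to count requests per client and look at the heaviest hitters. The following is only a minimal sketch, not a definitive detector: it assumes an Apache-style common or combined access log format, and the function name `suspected_scrapers` and the `min_requests` threshold are hypothetical choices made for illustration. Real scrapers may rotate IP addresses or spoof user agents, so treat the output as leads to investigate rather than proof.

```python
import re
from collections import Counter

# Assumption: Apache-style common/combined log format. Adjust for other servers.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+'
    r'(?: "[^"]*" "(?P<agent>[^"]*)")?'
)


def suspected_scrapers(lines, min_requests=100):
    """Return (ip, user_agent, request_count) tuples for heavy clients.

    `min_requests` is an arbitrary, hypothetical threshold; tune it to
    the traffic levels of the site being reviewed.
    """
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent") or "-"
        counts[(match.group("ip"), agent)] += 1
    # most_common() orders clients from most to fewest requests
    return [
        (ip, agent, n)
        for (ip, agent), n in counts.most_common()
        if n >= min_requests
    ]


if __name__ == "__main__":
    # "access.log" is a placeholder path, not a real file from the article.
    with open("access.log", encoding="utf-8", errors="replace") as handle:
        for ip, agent, n in suspected_scrapers(handle):
            print(f"{n:>8}  {ip}  {agent}")
```

Clients that request far more pages than a human reader plausibly would, especially with an unfamiliar user agent, are the ones worth a closer look before deciding whether to enforce or to offer paid access.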


About the Author

Rami Essaid

Rami Essaid is the Chief Product and Strategy Officer and Co-founder of Distil Networks, the first easy and accurate way to identify and police malicious website traffic, blocking 99.9% of bad bots without impacting legitimate users. With over 12 years in telecommunications, network security, and cloud infrastructure management, Rami continues to advise enterprise companies around the world, helping them embrace the cloud to improve their scalability and reliability while maintaining a high level of security.
