Plagiarism in Politics and Content Scraping

August 30, 2016 Jennifer Glaeser

When Melania Trump, wife of Republican presidential nominee Donald Trump, spoke at the recent Republican National Convention, her speech drew plagiarism accusations because of its likeness to one given by First Lady Michelle Obama in 2008. Trump’s is not the first political speech to be tied to plagiarism. Russian President Vladimir Putin, Vice President Joe Biden, Senator Rand Paul, and Romanian Prime Minister Victor Ponta have all been accused of plagiarism.

A great deal of work goes into writing political addresses. Politicians hire writers—sometimes teams of them—to strategically craft speeches that will be listened to and subsequently scrutinized by international audiences.

Trump could have spared herself public scorn by using modern technology to scan her speech for originality.

Plagiarism Detection Sites

A number of services offer online plagiarism detection. They use bots to search the Internet for content that matches a submission.

A bot is any automated tool or script that performs a specific task online, and bots are everywhere on the Internet. For example, TurnItIn uses its TurnItInbot to crawl the Internet to look for similarities. The bot also checks TurnItIn’s internal essay database for similarities. Once its search is complete, the user receives an originality report that highlights likenesses. The report shows each match both in the submitted writing and in its source on the Internet. The lower the score, the less likely it is that the writing has been plagiarized.
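To make the idea of a similarity score concrete, here is a minimal sketch in Python. It is not TurnItIn’s actual method, which is not described here; it assumes a simple approach in which both texts are split into overlapping three-word sequences (often called shingles) and the score is the share of the submission’s sequences that also appear in the source. The function names and the default sequence length are illustrative assumptions.

```python
import re


def words(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(text, n=3):
    """Return the set of all n-word sequences found in the text."""
    w = words(text)
    return {tuple(w[i:i + n]) for i in range(len(w) - n + 1)}


def similarity_percent(submission, source, n=3):
    """Percentage of the submission's n-word sequences that also occur in the source.

    Illustrative only: real detection services use their own scoring.
    """
    sub = shingles(submission, n)
    if not sub:
        # Submission is shorter than n words, so there is nothing to compare.
        return 0.0
    return 100.0 * len(sub & shingles(source, n)) / len(sub)


if __name__ == "__main__":
    # Made-up sample sentences, not taken from any speech.
    score = similarity_percent(
        "the quick brown fox jumps",
        "the quick brown fox sleeps",
    )
    print(f"{score:.1f}% of the submission's three-word sequences match")
```

As in the originality report described above, a lower number from this sketch means less of the submission overlaps with the source.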

Comparison of Trump and Obama’s speeches1

Plagiarism Evidence

There is a one in a trillion chance that a sixteen-word phrase in a submission will coincidentally match another of the same length.2 And as the number of identical words in a phrase increases, so does the similarity percentage returned by the plagiarism checker. Trump’s speech was revealed to have a contiguous match of 23 words, a much closer likeness that earned it a score of 46%.
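A contiguous match like the one described above can be found with a short program. The following Python sketch is one plausible way to do it, not the method any particular detection service uses. It assumes both texts are lowercased and stripped of punctuation, then finds the longest run of consecutive words they share using a standard dynamic programming table. The function names and sample sentences are invented for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def longest_shared_run(text_a, text_b):
    """Return the longest run of consecutive words that appears in both texts."""
    wa, wb = tokenize(text_a), tokenize(text_b)
    best_len, best_end = 0, 0
    # prev[j] holds the length of the shared run ending at wa[i-2] and wb[j-1].
    prev = [0] * (len(wb) + 1)
    for i in range(1, len(wa) + 1):
        curr = [0] * (len(wb) + 1)
        for j in range(1, len(wb) + 1):
            if wa[i - 1] == wb[j - 1]:
                # Extend the run that ended one word earlier in both texts.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return wa[best_end - best_len:best_end]


if __name__ == "__main__":
    # Made-up sample sentences, not taken from any speech.
    a = "The committee will meet on the first Monday of every month."
    b = "Notices say the board will meet on the first Monday of each quarter."
    run = longest_shared_run(a, b)
    print(len(run), "shared words in a row:", " ".join(run))
```

The longer the run this returns, the harder it is to explain the overlap as coincidence.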

Good vs. Bad Content Scraping

Illicit content scraping uses so-called bad bots to steal original website material and post it elsewhere without the owner’s consent. While TurnItInbot and its ilk engage in content scraping, they help prevent content theft; they’re not reposting information and claiming it as their own. Like Googlebot, they are therefore categorized as good bots.

Content scraping performed by bad bots is often malicious and harms the owners of original content. The damage includes lost revenue, lower search engine rankings, and even new marketplace competitors promoting identical offerings. All site operators should be concerned about these risks.

Election Day

As for election day, it’s fast approaching; there are many more candidate speeches to come. Before another politician commits a public gaffe, it would be wise for speechwriters of all parties to run their copy through a plagiarism detection site, letting good bots prevent potential embarrassment.
