How to Handle RSS Scrapers Who Copy or Steal Your Content

Have you ever wondered what to do when people use your RSS feed to scrape (automatically copy) your blog content directly onto their site?  How do you ensure that they don’t rank higher in the search engines for that article than you do?  It’s been known to happen, but is not very hard to prevent.

This is actually not as difficult to handle as some people believe.  In this post I will give you a few simple tools and resources you can use to handle this, as well as links to posts where other bloggers have discussed the same issue and how they have resolved it.

First of all, here is a brief video by Google’s Matt Cutts where he discusses this issue and how to prevent it from causing trouble for you and your content’s search engine rankings.

As mentioned in the video above, the most important thing is to ensure that copies of your content always include a link back to the original article where you published it.  In the case of a blog, this will usually be your original blog post where your article or other content appeared.
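If you want to check whether a particular copy actually links back to you, here is a minimal Python sketch. It assumes you already have the HTML of the copied page as a string; the function name `links_back_to` and the example URLs are hypothetical, chosen only for illustration. It treats `http`/`https`, a leading `www.` and a trailing slash as the same address, which is a simplification.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def normalize(url):
    """Reduce a URL to (host, path), ignoring scheme, 'www.' and trailing slashes."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host, parts.path.rstrip("/")


def links_back_to(page_html, original_url):
    """Return True if any link on the page points at the original post (hypothetical helper)."""
    collector = LinkCollector()
    collector.feed(page_html)
    target = normalize(original_url)
    return any(normalize(href) == target for href in collector.hrefs)


# Example with made-up URLs:
copy_html = '<p>Some copied text. <a href="http://www.example.com/my-post/">Source</a></p>'
print(links_back_to(copy_html, "https://example.com/my-post"))
```

A check like this only tells you the link is present; it says nothing about whether the link is marked nofollow or hidden from readers.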

At this point, you may be wondering how to ensure that your scraped content always includes a link to your post.

Handling #1 – WordPress Plugin

This is quite simple with a WordPress plugin which can be found here:

This plugin makes it easy for you to add a line of text (which can include hyperlinks) which will appear at the beginning or end of every article in your feeds.  This line of text won’t appear on the original post; it will only appear in your feeds.  So, if someone publishes the content of your RSS feed on their blog, your line of text (and links) will appear with your content.

You can customize this text to always include a link to your blog and the original post.  For example, a footer for this article might read:

This post was originally published on Building from Nothing at How to Handle RSS Scrapers Who Copy or Steal Your Content.

When such a line appears at the bottom of each article in your RSS feed, it’s very difficult for someone to automatically copy the article without also including your footer and your link.
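To make the idea concrete, here is a rough Python sketch of what a feed footer amounts to. This is not the plugin's code (on WordPress the plugin does this for you); it is only an illustration that assumes a plain RSS 2.0 feed with `title`, `link` and `description` elements in each item. The function name `add_feed_footer` and the sample feed are made up for the example. Real feeds often carry the full post in a namespaced `content:encoded` element, which this sketch does not touch.

```python
import xml.etree.ElementTree as ET
from html import escape


def add_feed_footer(rss_xml, site_name, site_url):
    """Return the feed with an attribution footer appended to every item description."""
    root = ET.fromstring(rss_xml)
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        description = item.find("description")
        if description is None:
            # Assumption: an item without a description gets one holding only the footer.
            description = ET.SubElement(item, "description")
        footer = (
            "<p>This post was originally published on "
            f'<a href="{escape(site_url)}">{escape(site_name)}</a> at '
            f'<a href="{escape(link)}">{escape(title)}</a>.</p>'
        )
        description.text = (description.text or "") + footer
    # ElementTree escapes the HTML inside <description> when serializing.
    return ET.tostring(root, encoding="unicode")


sample_feed = """<rss version="2.0"><channel><title>Example Blog</title>
<item><title>Post One</title><link>https://example.com/post-one</link>
<description>Body text.</description></item>
</channel></rss>"""

print(add_feed_footer(sample_feed, "Example Blog", "https://example.com/"))
```

Because the footer is part of the item itself, anything that republishes the item as-is republishes the links too. A scraper that strips links before publishing would defeat it.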

Handling #2 – Check Your WordPress Settings

You might also want to ensure that your RSS feeds display only an excerpt of each post, rather than its full content.

To handle this, take the following steps.

  1. Go to your WordPress Admin Panel.
  2. Click on “Settings” and then “Reading”.
  3. Under “For each article in a feed, show” ensure that “Summary” is selected. (WordPress typically ships with “Full text” selected, so check this even if you have never changed the setting.)
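To show what a summary-only feed gives a scraper to work with, here is a small Python sketch that cuts a post down to its first few words. It is an illustration only, not how WordPress builds its summaries; the function name `make_summary`, the `[...]` marker and the 55-word default are my assumptions (55 words is, as far as I know, WordPress's usual excerpt length).

```python
import re
from html import unescape


def make_summary(post_html, max_words=55, more=" [...]"):
    """Strip tags from post HTML and keep only the first max_words words."""
    # Crude tag removal; good enough for a sketch, not for hostile input.
    text = re.sub(r"<[^>]+>", " ", post_html)
    words = unescape(text).split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + more


post = "<p>Have you ever wondered what to do when people <b>scrape</b> your feed?</p>"
print(make_summary(post, max_words=8))
```

Anyone copying the feed gets only this teaser, so readers still have to visit your blog for the full article.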

The above quick handlings should be enough to put your mind at ease in most cases; situations where they aren’t enough are relatively rare.

Handling #3 – Search and Respond

In some cases, content is simply plagiarized, directly and blatantly.  The above techniques won’t work if someone manually copies and pastes your content onto their site without including a link to your content.  In some cases, they will also remove your name as the author, or even pose as the author themselves by adding their own name to your content.  This is plagiarism and it’s illegal.  One wonders why anyone would have so little respect for themselves and others as to stoop to such a level … but we can leave that for another day.

Copyscape is a good tool you can use to find out if your content has been plagiarized.

This service is available for free and paid use, and even the paid searches are remarkably inexpensive.  At the time of writing, Premium Searches cost only $0.05 each, and the minimum purchase is $5.00 (100 searches).

If you hire a writer for any of your sites, you could also use this tool in reverse, to ensure that your writer isn’t plagiarizing anyone else’s content and adding it to your site!  A paid search on Copyscape allows you to check an entire site, up to 10,000 pages, for copied content, with one batch search.
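If you already have a suspect page in hand and just want a rough idea of how much of your article it contains, a simple comparison can be sketched in a few lines of Python. To be clear, this is not how Copyscape works (I don't know its method); it is a basic word-sequence ("shingle") overlap check, and the function names and the five-word window are my own assumptions.

```python
import re


def shingles(text, size=5):
    """Return the set of overlapping word sequences ("shingles") of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(original, suspect, size=5):
    """Fraction of the original's shingles that also occur in the suspect text (0.0 to 1.0)."""
    source = shingles(original, size)
    if not source:
        return 0.0
    return len(source & shingles(suspect, size)) / len(source)


mine = "the quick brown fox jumps over the lazy dog near the river bank"
theirs = "Welcome to my site. " + mine + " Thanks for reading."
print(copied_fraction(mine, theirs))
```

A value near 1.0 suggests the suspect page contains almost all of your text word for word; a low value may still hide paraphrased copying, which this kind of check cannot see.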

Copyscape also provides banners which you can add to your site, in order to warn others not to plagiarize your content (which should be obvious, but some people need to be reminded).

If you do find that your content has been plagiarized and you wish to take action, you might be wondering where to start.  The following page gives you exact directions to follow:

More Posts on this Subject

If you would like to see additional posts where other bloggers have discussed the same situation and their handlings, I recommend the following:

  • Really Simple (to Steal) & Syndicate, RSS (on Honest Holly)
  • Learn to Love RSS Scrapers – I Do! (on Pot Pie Girl)

I hope this data is useful and I would love to hear your feedback. Have you had trouble with RSS Scrapers stealing your content in the past? How did you handle it?


29 Responses to “How to Handle RSS Scrapers Who Copy or Steal Your Content”

  1. Dave Doolin
    September 28, 2009 at 4:05 pm #

    Excellent suite of techniques, Anna, I’m using some of these already myself.

    A while back I was adding every post to Digg right after publishing. That way I could at least get my posts up on Digg before someone else put the stolen version up. Which happened once!

  2. Anna
    September 28, 2009 at 4:23 pm #

    Thanks Dave – and thanks for the Digg 🙂

    Sometimes, laying things out for people simply in the beginning, instead of letting them figure it all out the first time, can save them a lot of time. I try to do that when I can on this blog.

  3. Jenny
    September 28, 2009 at 6:16 pm #

    Geez Anna, Thank You!! At times these techy things that you make sound so simple really get me. I will make sure to install that plugin and pay more attention.

    I really appreciate the great advice!

  4. Web Career Girl
    September 30, 2009 at 5:51 am #

    I generally only show the RSS summary in my WordPress options, though I do think that sometimes readers prefer to be able to see the whole post in their reader. In that case I really think that the RSS footer is a great idea. It won’t stop people stealing, but at least you still get a link.

  5. Jim@Canned Hunting
    October 7, 2009 at 1:10 am #

    Make sure you are linking to yourself and let them scrape away 🙂 Same goes for people who are paranoid about other people stealing images… just watermark your pictures with your URL. That way people who steal from you actually wind up marketing your site for you 🙂

  6. Matt@Sydney SEO Blog
    October 9, 2009 at 1:53 am #

    Thanks for this informative roundup.

    I recently discovered that someone was scraping my blog. So I looked up the domain details, found the owner and sent him a brief e-mail politely demanding that he remove the material. But it’s still up there.

    I’ll e-mail the hosting company if he keeps doing it much longer. But I’m not going to waste much more energy on this. It’s more annoying than anything else.

    I did post a blog about it, describing what a low act it was. It had the name of the site (though not in clickable, indexable form – wasn’t going to give him the SEO benefit!). That was scraped onto his domain as well. So his own site incriminates itself as being plagiarised!

    Just shows how careless and stupid these people are – as well as being unethical.

  7. Anna
    October 9, 2009 at 10:57 am #

    @Matt – that is so FUNNY! You are right, they don’t come across as being the brightest people around.

    The thing is, they have everything automated so there is really nobody home to notice …

  8. Stefan
    October 11, 2009 at 4:31 pm #

    I think this is an important topic since we see more and more of these sites. Thankfully most of them only scrape the first part of our entries, but there are a few out there republishing them.

    Unfortunately there is not much to do if they really want to steal the content. I used to block their IP but it didn’t work when I began to use Feedburner. Now I simply try to add a few backlinks within the text, but a few of the sites remove all of the links before publishing.

  9. Anna
    October 11, 2009 at 4:41 pm #

    @Stefan – They must work harder at stealing content than they would work at creating it.

  10. Flippa Chick
    October 12, 2009 at 8:25 am #

    Great tips! In particular, I adhere to #2 without fail…and far before I was concerned with RSS scrapers. I always operated under the belief that I want you to visit my blog to read the full article, not just through a reader.

  11. Beclecto
    October 12, 2009 at 2:15 pm #

    Anna.. So happy i found this VIP ( very important post ) on Dr WordPress’ stumble page
    I immediately downloaded the RSS footer plugin and activated it. Thanks for sharing!

  12. Phaoloo
    November 9, 2009 at 2:56 am #

    Good post Anna! Hope Google will launch a better mechanism to kill these scrapers. It might be based on the time the bot crawls the content. We all know that unique content must be crawled before copied content.
    By setting a Google Alert for new posts with the same title as ours, we can check and report them to Google and even to their hosting providers.

  13. Ben@Service Management Software
    November 30, 2009 at 11:58 am #

    I had never heard about Copyscape. I’m going to check that out. Unfortunately I use BlogEngine.Net and I don’t think they have any plugins for this kind of thing. I think I’m going to have to get WordPress.

  14. seo specialist
    December 18, 2009 at 2:48 am #

    thanks for the informative post, i really love RSS Footer WordPress Plugin, nowadays many people just copy and paste the content, so this will definitely help.

  15. VitaMixLabs
    December 21, 2009 at 9:32 pm #

    we constantly have people taking content from our blog and reposting it on their sites. to be honest, it’s somewhat fun to see their response once you contact them about it. all they would really have to do is give us a good link and ask first… and we’d oblige to let them use it. either way… thanks for the great post and great blog as always.

  16. MMA
    December 22, 2009 at 10:24 am #

    I am using a related posts plugin plus FeedBurner, and it generates related post links in our RSS feed too.

    Also, to check for plagiarists I have been googling some of the paragraphs in my posts, besides using Copyscape.

  17. Anna
    December 22, 2009 at 11:33 am #

    @Vita – You’re welcome. I suspect that some of the scrapers are internet novices who get coached into scraping as if it is a good method of building up content fast. There are plenty of people offering bad advice and false data in the internet marketing field, that’s for sure.

  18. Anna
    December 22, 2009 at 11:36 am #

    @MMA – You are right, a related posts plugin can do the job as well – though I am not sure if it works when your RSS feed only supplies excerpts as opposed to full posts.

  19. Free sat nav
    December 22, 2009 at 12:08 pm #

    Thanks for this. The rss footer has been uploaded to all my smaller blogs to see how it handles.

    Thanks.

  20. Phil Coal-Scuttle
    December 22, 2009 at 1:49 pm #

    This is a great post, just learning about RSS Feeds.

    Cheers

  21. Mobile Spa
    December 23, 2009 at 4:38 pm #

    Personally I don’t think searching for text on Google works at all. It doesn’t find anything that’s similar except for your own text.

  22. Facebook Emoticons
    December 25, 2009 at 4:32 am #

    Just turn off – remove your RSS feed.

  23. Anna
    December 26, 2009 at 12:03 pm #

    @Facebook emoticons – What is this based on? How much do you know about RSS feeds and what they are used for, and do you have any experience with them at all? What experience do you have with blogging or internet marketing? Removing one’s RSS feeds is like cutting one’s own throat where marketing is concerned. Telling someone to remove their RSS feeds so that they won’t be scraped is like telling someone they should not have a refrigerator because it might get stolen.

  24. conversion analysis
    January 28, 2010 at 2:17 am #

    Is it possible to make a podcast RSS feed that won’t download a podcast until a specific date or time?

  25. Josh Kohlbach
    February 2, 2010 at 10:02 pm #

    I handle RSS thieves much the same way with an RSS footer. I use the Better Feed plugin though – you get to customize quite a lot and have access to extra post meta data. Very handy.

    Other than that I practice a pretty hands off approach – if it’s good enough to be stolen I’ll just take that as a compliment.

  26. tech money
    March 1, 2010 at 7:01 pm #

    I do not think content stealing is an isolated situation; bloggers are always being stolen from, and it cannot be stopped.

    Is there an ultimate way to stop this?

  27. Anna
    March 2, 2010 at 10:48 am #

    @Tech money – depends on how hard you want to work at it. It is not something I have found necessary to fight any harder than through the methods I mentioned in the post, but some circumstances are different than others.

  28. Super Jumbo Mortgage
    May 30, 2010 at 10:03 pm #

    That is some excellent information. I didn’t realize that there was a wordpress plugin to help with this.

    I appreciate your post. Thanks!

  29. Bruce Simmons (BruSimm)
    July 12, 2010 at 2:29 pm #

    Initially, these scrapers sure do seem to pump my blood pressure up… grr. I like your perspectives and resolutions. I’ve seen a few other tricks around where you can point sites to bad images and what not.

    Does anyone know (admitting I haven’t read every comment now) if the scrapers using adsense with the stolen content are violating any adsense rules? And would Google (adsense) do anything about it?

    it’s a shame that feedburner does not give options for blocking IP’s and / or websites. Thanks again for the time on this web noise!
