Incidentally, a scraper was stealing my content and posting it verbatim on his site.  I never authorized this.  He was even linking back, which sent pings to me, so I got alerted to each "citation."  Not so bright.  He also hit the content at ha.ckers.org, a site about software security with some SEO stuff as well.  That was even less bright.  I was eventually going to report the site (see below), but RSnake at ha.ckers.org had a rather amusing way to deal with the problem.

RSnake used IP cloaking to serve different content to the IP that was swiping his content.  Let's just say the different content was, well, "different."  It gave the scraper's personal address, where he went to school and worked, and a few other not-so-nice things about his favorite activities with furry animals.
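For the curious, here is a rough sketch of what IP cloaking boils down to. This is not RSnake's actual code, which I haven't seen; it is just a minimal Python illustration. The names `SCRAPER_NETWORKS`, `is_scraper`, and `choose_content` are made up for the example, and the addresses are documentation-only ranges, not real offenders.

```python
import ipaddress

# Hypothetical scraper address ranges (reserved documentation IPs, for illustration only).
SCRAPER_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def is_scraper(client_ip: str) -> bool:
    """Return True if client_ip falls inside any listed scraper network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable address: treat it as a normal visitor.
        return False
    return any(addr in net for net in SCRAPER_NETWORKS)


def choose_content(client_ip: str, real_content: str, decoy_content: str) -> str:
    """Serve the decoy to listed scraper IPs and the real content to everyone else."""
    return decoy_content if is_scraper(client_ip) else real_content
```

In a real setup you would call something like `choose_content` with the remote address your web server reports, and you would find the scraper's IP by checking your access logs for whoever fetches each new post right before it shows up on his site.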

Incidentally, around the same time, someone started an interesting thread over at Cre8asiteforums about how to deal with content thieves.  I posted RSnake's approach, and it got some laughs. You don't have to be as mean, but it certainly gave me something to write about:

Content theft can cause major damage to your rankings.  Here's why:
A well-ranked and indexed web site that automatically swipes content may actually get indexed before you.  This creates some confusion as to the original author of the content.  I presume that search engines also apply trust factors and link patterns to decide, but getting scraped certainly can't help.  It would be interesting to see if a lawsuit seeking damages from a spammer ever crops up.  I think it may eventually happen.

The few solutions at your disposal:
1) Contact them and threaten some sort of action (this works, sometimes); if it doesn't, there are more options.
2) File a DMCA report with the search engines, with some accompanying proof, to get them delisted.
3) Contact the hosting company or the domain registrar and, after presenting proof, have the site shut down. GoDaddy is notoriously "efficient" at doing so.
4) Use cloaking to present "special" content; keep in mind that this would really only work with an automated scraper.  A person would obviously notice and not post it.  His IP would likely also change.
5) If he's also hotlinking images and costing you bandwidth, you can use HTTP_REFERER cloaking to serve different images based on the URL of the page the image is referred from.
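The HTTP_REFERER trick in option 5 can be sketched roughly like this in Python. Treat it as an illustration under assumptions rather than a drop-in solution: `ALLOWED_HOSTS` and `pick_image` are hypothetical names, and `example.com` stands in for your own domain.

```python
from urllib.parse import urlparse

# Hypothetical: the hostnames that are allowed to embed your images.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def pick_image(referer, real_path, decoy_path):
    """Choose which image file to serve based on the Referer header value."""
    if not referer:
        # Many browsers and proxies send no referer at all; don't punish them.
        return real_path
    host = urlparse(referer).hostname  # lowercased by urlparse; None if absent
    if host is None or host in ALLOWED_HOSTS:
        return real_path
    # The image was requested from a page on someone else's site: hotlink.
    return decoy_path
```

For example, `pick_image("http://scraper.example.net/stolen-post", "photo.png", "blank.png")` picks the blank image, while a request referred from your own pages gets the real one. Serving the real image when the referer is missing is a deliberate choice in this sketch, since plenty of legitimate visitors never send one.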

You should also probably use Google/Yahoo sitemaps to get new pages on your site indexed as quickly as possible.
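If you don't already have a sitemap, generating one is not much work. Below is a minimal sketch in Python that builds a sitemap in the sitemaps.org XML format; `build_sitemap` is a name I made up, the URLs are placeholders, and you would still need to submit the resulting file through each search engine's own tools.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, lastmod) pairs, lastmod as a YYYY-MM-DD string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


# Placeholder page list; in practice this would come from your CMS or database.
sitemap_xml = build_sitemap([
    ("https://example.com/new-post", "2024-01-15"),
    ("https://example.com/older-post", "2023-11-02"),
])
```

Regenerating the file whenever you publish gives the search engines a chance to see your copy first.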

If you're worried about legal issues for solutions 4 and 5, serve a blank image or blank content.  Posting something "creative" would be pretty funny too, though :)



