People have too much faith in Google, even when that faith implies a violation of the principles of computer science.  Many Google-oglers have contended that Google can find applications of JavaScript redirect cloaking with ease.  I don't have a PhD in Computer Science, but I doubt there is any easy way to find this stuff in general.  Yes, even for Google.  Clever spammers will be doing this sort of thing for a while.

Google will nail the lousy spammers who cut and paste pre-made scripts with a common signature, just as the Feds catch the script kiddies running precompiled exploit scripts on Windows.  Google will catch the idiots.  But it won't happen en masse, or to everyone.  Here's why:

First, there is provably an infinite number of ways to write such a redirect.  I could even prove it by induction.  So there is no single signature, which makes the redirect much harder to detect, even when the code is sitting right there on the page.  Any number of trivial or non-trivial changes can be made to a program to make it look different while behaving the same.  I'm not putting the proof here, because it's boring, and it is common sense anyway.  If Google tried to detect it by signature, they would also get a lot of false positives and false negatives, and both are bad.
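
To make the signature problem concrete, here is a minimal sketch.  The `naiveSignatureMatch` scanner and the sample snippets are hypothetical illustrations of mine, not anything Google is known to use.  A pattern that catches the textbook redirect misses trivially rewritten equivalents:

```javascript
// Hypothetical signature scanner: flags source text that contains a literal
// "location.href =" or "location.replace(" redirect.
function naiveSignatureMatch(source) {
  return /location\s*\.\s*(href\s*=|replace\s*\()/.test(source);
}

// Four snippets that would all send a browser to the same (placeholder) URL.
const variants = [
  "location.href='http://www.example.com/';",
  "window['loc' + 'ation']['hr' + 'ef'] = 'http://www.example.com/';",
  "var w = window, k = 'location'; w[k] = 'http://www.example.com/';",
  "self.location.assign('http://www.example.com/');"
];

// Only the first, textbook form is caught.
const flagged = variants.map(naiveSignatureMatch);
console.log(flagged); // [ true, false, false, false ]
```

Widening the pattern to catch the other three (say, flagging any use of `location` or any bracketed property access) would start flagging plenty of legitimate scripts, which is the false-positive half of the problem.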

And executing the code won't necessarily work either, as there are several simple but devious ways to confuse whatever is executing it.  I'm not even a black hat, and I can think of some cute ways off the top of my head.  Not to mention that executing code takes a massive amount of computing time, even if Google uses some nice heuristics to decide which pages it would like to audit more closely.  And since there's no general method to prove whether a program terminates, I can probably even do something very mean to Mr. Google and waste lots of its time.
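
The time-budget problem can be sketched with a toy model.  Everything here is an assumption made for illustration: a real crawler does not expose "steps" like this, and `stallingPage` and `runWithBudget` are made-up names.  The point is only that any executor has to give up at some point, and a script can simply outlast it:

```javascript
// Toy model: a page script is a generator that yields once per "step" of
// work and finally returns the URL it redirects to.
function* stallingPage(stepsBeforeRedirect) {
  for (let i = 0; i < stepsBeforeRedirect; i++) {
    yield; // pretend to do some innocuous-looking work
  }
  return "http://www.example.com/"; // placeholder destination
}

// Hypothetical crawler-side executor with a fixed step budget.
function runWithBudget(script, budget) {
  for (let used = 0; used < budget; used++) {
    const step = script.next();
    if (step.done) {
      return { finished: true, redirect: step.value };
    }
  }
  // Budget spent: the executor cannot tell whether a redirect was coming.
  return { finished: false, redirect: null };
}

// A script that redirects quickly is caught...
console.log(runWithBudget(stallingPage(10), 100));
// ...but one that stalls past the budget is not.
console.log(runWithBudget(stallingPage(1000), 100));
```

Raising the budget only moves the line; because termination can't be decided in general, the executor never knows whether one more step would have revealed the redirect.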

How about using a simple "rot13" function to "encrypt" some redirect code?  Then do the reverse and append the result to the document.  Or how about using an AJAX script that fetches code from a server, which uses IP cloaking to serve different code to different visitors, and having the document execute that?  Seeing the synthesis of technology to do something creative like this makes me glad I am alive.

Google is better off spending their time devaluing these spam sites based on their poor content or spam backlinks. I believe that's what's going on anyway. Analyzing copy for quality is a better and more attainable target metric in my opinion, but I'm not an expert here.  Feel free to comment. 

Here’s an example of redirection code employing rot13 (or is it ebg13?) that Google would actually have to run in order to detect:

<script language="JavaScript">
<!--

// rot13: rotate each ASCII letter 13 places; applying it twice restores the input.
function ebg13(string) {
  var aCode = 'a'.charCodeAt(0);
  var zCode = 'z'.charCodeAt(0);
  var ACode = 'A'.charCodeAt(0);
  var ZCode = 'Z'.charCodeAt(0);
  var result = '';
  for (var c = 0; c < string.length; c++) {
    var charCode = string.charCodeAt(c);
    if (charCode >= aCode && charCode <= zCode)
      charCode = aCode + (charCode - aCode + 13) % 26;
    else if (charCode >= ACode && charCode <= ZCode)
      charCode = ACode + (charCode - ACode + 13) % 26;
    // Digits and punctuation pass through unchanged.
    result += String.fromCharCode(charCode);
  }
  return result;
}

// The argument is a rot13-encoded script tag that sets location.href.
document.write(ebg13("<fpevcg>ybpngvba.uers='uggc://jjj.paa.pbz';</fpevcg>"));

// -->
</script>
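
For anyone who wants to see what is being hidden, here is a small standalone sketch that can be run outside the browser (in Node, for instance).  The `rot13` helper is my own compact rewrite of the function above, not a library call.  It shows how the encoded payload is produced and that rot13 is its own inverse:

```javascript
// Compact rot13: rotate ASCII letters by 13 places, leave everything else alone.
function rot13(text) {
  return text.replace(/[a-zA-Z]/g, function (ch) {
    const base = ch <= "Z" ? 65 : 97; // 65 = 'A', 97 = 'a'
    return String.fromCharCode(base + (ch.charCodeAt(0) - base + 13) % 26);
  });
}

// The plain redirect that the page never shows in readable form.
const plain = "<script>location.href='http://www.cnn.com';</script>";

// Encoding it yields the string passed to ebg13() in the example above.
const encoded = rot13(plain);
console.log(encoded);
// <fpevcg>ybpngvba.uers='uggc://jjj.paa.pbz';</fpevcg>

// Applying rot13 again restores the original.
console.log(rot13(encoded) === plain); // true

// And yes, the function name is itself rot13 of "rot13".
console.log(rot13("rot13")); // ebg13
```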

Of course, this is not nearly the end of the road as far as obfuscation goes.  I'll leave the AJAX example suggested above up to you.  If anyone is going to nail spam, it's Google, but even the Google PhDs have their limitations.

Update: Black_Knight, a moderator on Cre8asiteforums, pointed out that behavioral data, such as the data collected by Google from the Google Toolbar, can be used to detect things like this in his post here.  It's an interesting approach, because it can be used to leverage the (free) computational resources of all of their toolbar users to detect JavaScript redirection.  This gives a high degree of statistical confidence in the aggregate.  I must admit, it's a powerful idea; I'm just not sure Google actually does this.
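
As a rough sketch of the behavioral idea: the report format, the `redirectRates` function, and the hostnames below are all hypothetical, and I have no knowledge of what the Toolbar actually records.  Suppose each report says which URL a user requested and where the browser ended up.  Aggregating those reports per page gives a redirect rate without anyone having to analyze the script:

```javascript
// Extract the hostname from an absolute URL (URL is a global in browsers and Node).
function hostOf(url) {
  return new URL(url).hostname;
}

// Hypothetical aggregation: for each requested URL, the fraction of visits
// that ended on a different host than the one requested.
function redirectRates(reports) {
  const stats = new Map();
  for (const { requested, landed } of reports) {
    const s = stats.get(requested) || { visits: 0, redirected: 0 };
    s.visits += 1;
    if (hostOf(landed) !== hostOf(requested)) {
      s.redirected += 1;
    }
    stats.set(requested, s);
  }
  const rates = new Map();
  for (const [url, s] of stats) {
    rates.set(url, s.redirected / s.visits);
  }
  return rates;
}

// Made-up sample reports with placeholder hostnames.
const reports = [
  { requested: "http://spam.example/page", landed: "http://www.example.com/" },
  { requested: "http://spam.example/page", landed: "http://www.example.com/" },
  { requested: "http://honest.example/", landed: "http://honest.example/" }
];

console.log(redirectRates(reports));
```

A page whose visitors almost always land on another host is suspicious however cleverly the redirect is hidden, because the users' browsers have already done the executing.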
