I've never assumed that the "Allow:" directive is supported by all search engine spiders.  As far as I know, only Google supports it.  The directive does appear in the robots.txt Internet Draft, but that's the problem: it's still just a draft, so officially the directive doesn't exist.  Admittedly, it has been sitting in draft form since 1997!  Someone should probably do something about it, but nobody seems to care enough.

Anyway, Google's own robots.txt assumes that all spiders support this draft directive, and it uses "Allow:" under "User-agent: *":

User-agent: *
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
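
To make the difference concrete, here's a minimal sketch (my own illustration in Python, not any engine's actual parser) of how the record above reads under two interpretations: the original 1994 robots.txt convention, which only knows "Disallow:", and the draft, assuming its first-match-in-record-order rule for "Allow:" and "Disallow:".  The interesting case is /searchhistory/: a spider that ignores "Allow:" sees it blocked by "Disallow: /search" (plain prefix matching), while a draft-aware spider is permitted to crawl it.

# Illustrative sketch; the rule list mirrors Google's record above.
GOOGLE_RULES = [
    ("allow", "/searchhistory/"),
    ("disallow", "/news?output=xhtml&"),
    ("allow", "/news?output=xhtml"),
    ("disallow", "/search"),
]

def can_fetch_1994(rules, path):
    # Original convention: "Allow:" is unknown, so only Disallow lines count.
    return not any(path.startswith(p) for kind, p in rules if kind == "disallow")

def can_fetch_draft(rules, path):
    # Draft behavior (as I read it): scan rules in record order,
    # and the first prefix match decides.
    for kind, prefix in rules:
        if path.startswith(prefix):
            return kind == "allow"
    return True  # nothing matched, so access is allowed by default

for path in ("/searchhistory/", "/news?output=xhtml",
             "/news?output=xhtml&hl=en", "/search?q=test"):
    print(path, "1994-only:", can_fetch_1994(GOOGLE_RULES, path),
          "draft:", can_fetch_draft(GOOGLE_RULES, path))

Running it shows /searchhistory/ allowed only under the draft interpretation; the /news URLs and /search come out the same either way.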
 

It's not such a big deal, but interesting nonetheless.  I simply wouldn't do this, but I'm not Google :)
