SEO Egghead by Jaimie Sirovich: A blog about SEO, written for nerds, by a nerd.

Tue, 22 Aug '06

Google's Borked Robots.txt

I've never assumed that every search engine spider supports the "Allow:" directive. As far as I know, only Google does. The draft robots.txt specification mentions it, but that's the problem: it's just a draft. Officially, the directive does not exist. Admittedly, it has been in draft state since 1997! I guess someone should do something about it, but nobody cares enough to do so.

Anyway, Google's robots.txt assumes that all spiders support this draft directive, and uses "Allow:" under "User-agent: *":

User-agent: *
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
 

It's not such a big deal, but interesting nonetheless.  I simply wouldn't do this, but I'm not Google :)
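
For anyone curious what this means in practice, here's a rough Python sketch that evaluates the record above two ways: once for a spider that understands "Allow:", and once for a spider that ignores it as an unknown field. It is not how Google or any real spider works. The helper names (parse_rules, can_fetch) are made up for illustration, and the matching rule is an assumption: plain prefix matching where the first matching line wins, which is how I read the draft.

```python
# A minimal sketch, not any real spider's implementation.
# Assumptions: prefix matching, first matching line wins (my reading of the
# draft), and an Allow-unaware spider silently skips "Allow:" lines.

ROBOTS_TXT = """\
User-agent: *
Allow: /searchhistory/
Disallow: /news?output=xhtml&
Allow: /news?output=xhtml
Disallow: /search
"""


def parse_rules(text):
    """Return (directive, path_prefix) pairs in file order."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()
        if field in ("allow", "disallow") and value:
            rules.append((field, value))
    return rules


def can_fetch(rules, path, understands_allow):
    """First matching prefix decides; no match at all means allowed."""
    for field, prefix in rules:
        if field == "allow" and not understands_allow:
            continue  # this spider treats "Allow:" as an unknown field
        if path.startswith(prefix):
            return field == "allow"
    return True


if __name__ == "__main__":
    rules = parse_rules(ROBOTS_TXT)
    for path in ("/searchhistory/", "/news?output=xhtml",
                 "/news?output=xhtml&hl=en", "/search?q=seo"):
        print(path,
              "| with Allow:", can_fetch(rules, path, True),
              "| without Allow:", can_fetch(rules, path, False))
```

Under those assumptions, the only path that comes out differently is /searchhistory/: a spider that skips "Allow:" falls through to "Disallow: /search", which matches by prefix, so it stays out. The /news lines give the same result either way, which fits the "not such a big deal" verdict.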
