SEO Egghead by Jaimie Sirovich: A blog about SEO, written for nerds, by a nerd.

Wed, 16 Aug '06

Google Robots.txt Snafu: Part III (Conclusion)

We finally have a conclusion on exactly how to interpret a robots.txt file in the edge cases mentioned here.  Someone started a WebmasterWorld thread on the point of contention.

Indeed, according to the specification, the rules for a specific matching user agent entirely override the "User-agent: *" rules.  Therefore, any rule under "User-agent: *" that should also be applied to a specific bot must be repeated under the "User-agent:" for that specific bot.  In other words, the more specific set of directives takes precedence over the default, and only one set is applied.
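To make the override concrete, here is a minimal sketch using Python's standard `urllib.robotparser`, which (as far as I can tell) follows the same "most specific group wins" interpretation. The robots.txt content, the paths, and the `allowed` helper are made up for illustration; they are not taken from the thread.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the paths and bot names are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/

User-agent: Googlebot
Disallow: /tmp/
"""


def allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


if __name__ == "__main__":
    # Googlebot matches its own group, so the "*" rules are not inherited.
    print(allowed(ROBOTS_TXT, "Googlebot", "/private/page.html"))
    # A bot with no group of its own falls back to the "*" rules.
    print(allowed(ROBOTS_TXT, "OtherBot", "/private/page.html"))
```

Note that `/private/` is open to Googlebot in this sketch because that rule was not repeated under its group; to block it for Googlebot as well, `Disallow: /private/` would have to appear there too.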

Googleguy says in the thread that he "... believes most/all search engines interpret robots.txt this way ..."  This is also consistent with my testing.

However, some recommend placing the "*" rule last just in case, because some bots may not follow this specification and may instead take the first match, even if it's a "*".  With the "*" rule last, a bot that takes the first match and a bot that follows the specification both end up applying the intended set of rules.
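The ordering concern can be sketched with a deliberately naive matcher that stops at the first group whose User-agent line matches, "*" included. This is a hypothetical model of a non-compliant bot, not the behavior of any real crawler; `parse_groups`, `naive_first_match_allows`, and both sample files are my own illustrative names and data.

```python
# Hypothetical files: same rules, different group order.
STAR_FIRST = """\
User-agent: *
Disallow: /private/

User-agent: Googlebot
Disallow: /tmp/
"""

STAR_LAST = """\
User-agent: Googlebot
Disallow: /tmp/

User-agent: *
Disallow: /private/
"""


def parse_groups(robots_txt):
    """Return [(agents, disallowed_prefixes), ...] in file order."""
    groups, agents, rules, seen_rule = [], [], [], False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if seen_rule:  # a User-agent line after rules starts a new group
                groups.append((agents, rules))
                agents, rules, seen_rule = [], [], False
            agents.append(value.lower())
        elif field == "disallow":
            seen_rule = True
            if value:  # an empty Disallow means "allow everything"
                rules.append(value)
    if agents:
        groups.append((agents, rules))
    return groups


def naive_first_match_allows(robots_txt, user_agent, path):
    """Model a bot that applies the FIRST matching group, even if it is '*'."""
    user_agent = user_agent.lower()
    for agents, rules in parse_groups(robots_txt):
        if any(a == "*" or a in user_agent for a in agents):
            return not any(path.startswith(rule) for rule in rules)
    return True  # no group matched: nothing is disallowed
```

With `STAR_FIRST`, this naive bot identifying as Googlebot stops at the "*" group and is kept out of `/private/`, which is not what the file's author intended; with `STAR_LAST`, it reaches the Googlebot group first and behaves like a compliant parser.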

2 Responses to “Google Robots.txt Snafu: Part III (Conclusion)”

  1. wangarific Says:

    Hey SEO Egghead, your link to part 2 is broken. Cheers.

  2. Jaimie Sirovich Says:

    Fixed. I had accidentally flagged it private -- so I saw it, but nobody else did. Thanks.
