Nov 6

Wildcard Robots.txt Matching Is Now (Almost) Standard

Posted by Jaimie Sirovich on Nov. 6th, 2006.


By way of this post on Search Engine Roundtable, I found out that Yahoo! now supports wildcard matching for rules in robots.txt. This is great news. Why? Because wildcard matching is now a de facto part of the standard: the "big three" finally all support it!

This makes wildcard matching much more useful, as my exclusions usually apply to all search engines. What use is such a rule if one of the "big three" doesn't honor it? In that case I have historically resorted to meta tag exclusion instead. Now, as long as I don't care about the second-tier search engines, I'm less inclined to do so.

I'd expect Ask.com to quickly follow suit. This rule, which blocks every URL ending in .gif, would work in Google, MSN, and now Yahoo! as well:

User-Agent: *
Disallow: /*.gif$

So I suppose Ask.com may choke on it for now, and that may matter. But no other engine really matters in US markets. Yahoo! has more information over here.
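To make the semantics of that rule concrete, here is a minimal Python sketch of how a crawler might interpret it, assuming the usual convention that `*` matches any run of characters and a trailing `$` anchors the pattern to the end of the URL. The function names are my own invention for illustration, not anything from a search engine's code, and a real crawler would also deal with things like `Allow:` precedence and URL encoding:

```python
import re

def robots_pattern_to_regex(pattern):
    """Translate a wildcard robots.txt path pattern into a compiled regex.

    Hypothetical helper: assumes "*" means "any characters" and a
    trailing "$" means "the URL must end here".
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_disallowed(path, disallow_patterns):
    """Return True if any non-empty Disallow pattern matches the path.

    Patterns are matched from the start of the path, so an unanchored
    pattern behaves like the classic prefix match.
    """
    return any(
        robots_pattern_to_regex(p).match(path) is not None
        for p in disallow_patterns
        if p  # an empty "Disallow:" line blocks nothing
    )

rules = ["/*.gif$"]
print(is_disallowed("/images/logo.gif", rules))             # True
print(is_disallowed("/images/logo.gif?size=large", rules))  # False
print(is_disallowed("/images/logo.png", rules))             # False
```

Note the second case: because of the `$`, a .gif URL with a query string slips through. Dropping the `$` would turn the rule into "anything containing .gif", which may or may not be what you want.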

PS: Hello from Israel!  I'll post some pictures sometime this week :)








"4 Wise Comments Banged Out Somewhere On The Internet ..."


Jeremy Luebke

I knew Google supported it, but where are the docs saying MSN does? I can't seem to dig them up.

Jaimie Sirovich

Take a look at:

http://search.msn.com.sg/docs/siteowner.aspx?t=SEARCH_WEBMASTER_REF_RestrictAccessToSite.htm#b

AFAIK it works for the general case, not just extensions; but maybe I should verify. What do you think?

SEO Egghead by Jaimie Sirovich » Sitemaps Standardization Illustrates Trend

[...] A few weeks ago, Yahoo announced its support of wildcards in robots.txt. This week, Google, Yahoo, and MSN all agreed on a standard for sitemaps. I don't think this is a coincidence. Rather, I believe it is an indication of a trend: search engine vendors see value in collaborating on certain issues. A similar example that comes to mind is NavTeQ and TeleAtlas's agreement to use a standard protocol for traffic information in navigation systems. It's just a win-win situation for everyone. [...]


