By way of this post on Search Engine Roundtable, I found out that Yahoo! now supports wildcard matching for rules in robots.txt. This is great news. Why? Now that the "big three" finally all support it, wildcard matching is a de facto part of the standard.
This makes wildcard matching much more useful, as my exclusions usually apply to all search engines. A wildcard rule is of little use if one of the "big three" doesn't honor it, so in the past I resorted to meta tag exclusion instead. Now, as long as I don't care about the second-tier search engines, I'm less inclined to do so.
I'd expect Ask.com to quickly follow suit. This rule, which blocks every URL ending in .gif, would work in Google, MSN, and now Yahoo!:
User-Agent: *
Disallow: /*.gif$
So I suppose Ask.com may choke on it for now, and that may matter to you. Beyond those four, though, no other engine really matters in the US market. Yahoo! has more information over here.
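To make the matching concrete, here is a minimal Python sketch of how a rule like the one above could be evaluated. It assumes the commonly documented semantics: `*` matches any run of characters, a trailing `$` anchors the end of the URL, and a rule without `$` is a prefix match. The function names `rule_to_regex` and `is_disallowed` are made up for illustration; this is not any search engine's actual implementation.

```python
import re


def rule_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern using * and $ into a regex.

    Assumed semantics: * matches any sequence of characters, and a
    trailing $ anchors the match at the end of the URL path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal chunk, then join the chunks with ".*" for each *.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_pattern: str) -> bool:
    """Return True if the path is covered by the Disallow pattern."""
    return rule_to_regex(disallow_pattern).match(path) is not None


if __name__ == "__main__":
    for path in ("/images/logo.gif", "/images/logo.gif?size=2", "/photo.jpg"):
        print(path, is_disallowed(path, "/*.gif$"))
```

Under these assumptions the `$` is what keeps the rule from matching `/images/logo.gif?size=2`; dropping it would turn the rule into a prefix match that catches that URL as well.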
PS: Hello from Israel! I'll post some pictures sometime this week.

November 7th, 2006 at 10:01 am
I knew Google supported it, but where are the docs saying MSN does? I can't seem to dig them up.
November 7th, 2006 at 11:16 am
Jeremy: http://search.msn.com/docs/siteowner.aspx?t=SEARCH_WEBMASTER_REF_RestrictAccessToSite.htm#B
November 7th, 2006 at 1:20 pm
Take a look at:
http://search.msn.com.sg/docs/siteowner.aspx?t=SEARCH_WEBMASTER_REF_RestrictAccessToSite.htm#b
AFAIK it works for the general case, not just extensions; but maybe I should verify. What do you think?
November 17th, 2006 at 1:01 am
[...] A few weeks ago, Yahoo announced its support of wildcards in robots.txt. This week, Google, Yahoo, and MSN all agreed on a standard for sitemaps. I don't think this is a coincidence. Rather, I believe it indicates a trend: search engine vendors see value in collaborating on certain issues. Another example that comes to mind is NavTeQ and TeleAtlas's agreement to use a standard protocol for traffic information in navigation systems. It's just a win-win situation for everyone. [...]