When I last mentioned the use of HTML directives to exclude unwanted text from indexing and search, I thought I had the solution to my problem. InfoWorld's site uses the Ultraseek engine, which supports the directives <!--startindex--> and <!--stopindex-->, so I added those to my template. But it seems I got it half-backwards. Here's what I did:
<html>
<head>...</head>
<body>
scripts and ancillary text
<!--startindex-->
main text
<!--stopindex-->
scripts and ancillary text
</body>
</html>
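The trouble, apparently, is that nothing turned indexing off before the first block of ancillary text, so that block was still indexed. Here's the new layout, which adds a <!--stopindex--> at the top of the body: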
<html>
<head>...</head>
<body>
<!--stopindex-->
scripts and ancillary text
<!--startindex-->
main text
<!--stopindex-->
scripts and ancillary text
</body>
</html>
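To see why the leading <!--stopindex--> matters, here's a minimal Python sketch of how a directive-aware indexer might behave. It is not Ultraseek's actual logic. It assumes that indexing is on by default and that each directive simply switches it on or off. The function name indexed_text and the sample strings are hypothetical, made up for illustration.

```python
import re

# Hypothetical model of a directive-aware indexer; NOT Ultraseek's real code.
# Assumption: indexing starts ON and each directive toggles it.
DIRECTIVE = re.compile(r"<!--\s*(startindex|stopindex)\s*-->")


def indexed_text(body: str) -> str:
    """Return the text this toy indexer would keep from a page body."""
    kept = []
    indexing = True  # assumed default: index everything until told to stop
    pos = 0
    for match in DIRECTIVE.finditer(body):
        if indexing:
            # Keep the text that precedes this directive.
            kept.append(body[pos:match.start()])
        indexing = match.group(1) == "startindex"
        pos = match.end()
    if indexing:
        # Keep any trailing text after the last directive.
        kept.append(body[pos:])
    # Collapse runs of whitespace so the result is easy to compare.
    return " ".join(" ".join(kept).split())


# Made-up stand-ins for the two layouts.
old_layout = (
    "top scripts <!--startindex--> main text "
    "<!--stopindex--> bottom scripts"
)
new_layout = (
    "<!--stopindex--> top scripts <!--startindex--> main text "
    "<!--stopindex--> bottom scripts"
)

print(indexed_text(old_layout))  # top scripts main text
print(indexed_text(new_layout))  # main text
```

Under that assumption, the old layout leaks the top block of ancillary text into the index, and the new layout keeps only the main text.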
If I've finally got this right, then a query for, say, Virtuoso should -- by tomorrow or the next day -- return a handful of items about Virtuoso rather than, as now, a long list of items in which Virtuoso appears only in ancillary text. And this time, I'm determined to check the results of the experiment. If I forget, please remind me.
Of course, given that my earlier item on this topic ranks #3 on this query, I'm probably tilting at windmills here. Precise search is a topic that doesn't float many people's boats. And when you look at the above examples, you can see why. Our obsession with search rankings is all about inclusion, not exclusion. We've created a whole new breed of specialists devoted to search engine optimization, and the last thing any SEO practitioner would think of doing is disabling indexing at the top of a page in order to enable it more precisely somewhere in the middle. As a result, there's unlikely ever to be demand for implementing this feature in the major search engines.
Alas, it is my lonely fate to care about bringing you all of what's relevant to your query, and none of what isn't. So I'll keep plugging away at it.
Former URL: http://weblog.infoworld.com/udell/2006/05/15.html#a1449