The Google PC

On the Google PC, you wouldn't need third-party add-ons to index and search your local files, e-mail, and instant messages. It would just happen. The voracious spider wouldn't stop there, though. The next piece of low-hanging fruit would be the Web pages you visit. These too would be stored, indexed, and made searchable. More ambitiously, the spider would record all your screen activity along with the underlying event streams. Even more ambitiously, it would record phone conversations, convert speech to text, and index that text. Although speech-to-text is a notoriously imperfect art, even imperfect results can support useful search. [InfoWorld.com]
This column is a companion to another from a few weeks ago: Google's supercomputer. Meanwhile I've been working on a story about Longhorn, for which I had a long and extremely interesting interview with Quentin Clark, the director of program management for WinFS. I'd like to transcribe the whole thing and post it along with the story when it runs, but the upshot is that Microsoft is planning more and better integration between WinFS and XML -- both in terms of data definition and query -- than I'd previously heard, which is welcome news.

It seems clear, though, that whatever can be accomplished by means of what I've come to call "managed metadata," we'll always want that Google effect to be happening in parallel. When asked about the Semantic Web and RDF at InfoWorld's 2002 CTO Forum, Sergey Brin said:

"Look, putting angle brackets around things is not a technology, by itself. I'd rather make progress by having computers understand what humans write, than by forcing humans to write in ways computers can understand."
From my perspective, this isn't an either/or choice. I'd rather make progress by having computers understand what people write and by helping people to write in ways that computers can understand. What's more, I'd like to construe "writing in ways that computers can understand" as a problem for which hybrid SQL/XML technology is a solution. When managed metadata exists, or can be acquired, purely relational query will be powerful. When metadata is implicitly present, for example in XML fragments, XPath and XQuery can leverage it. The combination of relational, XML, and free-text search is the best of all worlds. As I've mentioned before, by the way, Kingsley Idehen has been demonstrating this for several years.
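To make the three-way combination concrete, here is a minimal sketch in Python. It is not any product's actual API: the table layout, the sample records, and the names `build_store` and `hybrid_search` are all hypothetical, and a naive substring match stands in for a real free-text index. It only illustrates how a relational filter on managed metadata, an XPath test on an XML fragment, and a text search can narrow the same result set.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample data: (author, kind, XML fragment).
ITEMS = [
    ("jon", "email", "<msg><topic>WinFS</topic><p>Notes on XML query in Longhorn.</p></msg>"),
    ("jon", "blog", "<post><p>Speech to text is imperfect but searchable.</p></post>"),
    ("ann", "email", "<msg><topic>RDF</topic><p>Angle brackets and the Semantic Web.</p></msg>"),
]


def build_store(items):
    """Load the items into an in-memory SQLite table (ids start at 1)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, author TEXT, kind TEXT, body TEXT)"
    )
    conn.executemany("INSERT INTO items (author, kind, body) VALUES (?, ?, ?)", items)
    return conn


def hybrid_search(conn, author=None, xpath=None, words=()):
    """Return the ids of items that pass every filter that was supplied."""
    # 1. Relational: filter on managed metadata stored in columns.
    sql, params = "SELECT id, body FROM items", []
    if author is not None:
        sql += " WHERE author = ?"
        params.append(author)
    sql += " ORDER BY id"

    hits = []
    for item_id, body in conn.execute(sql, params):
        root = ET.fromstring(body)
        # 2. XML: keep the item only if the XPath expression matches
        #    something inside the fragment (implicit metadata).
        if xpath is not None and root.find(xpath) is None:
            continue
        # 3. Free text: naive case-insensitive substring match over the
        #    fragment's text content (a stand-in for a real text index).
        text = " ".join(root.itertext()).lower()
        if all(word.lower() in text for word in words):
            hits.append(item_id)
    return hits


if __name__ == "__main__":
    store = build_store(ITEMS)
    print(hybrid_search(store, author="jon", xpath="topic", words=["xml"]))
```

Each filter is optional, so the same function covers the purely relational case, the XML-only case, and plain text search. A real system would push the XML and text predicates into the query engine rather than filtering rows in application code.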


Former URL: http://weblog.infoworld.com/udell/2004/06/22.html#a1026