Firefox history in Google Desktop Search

As many have now discovered to their disappointment, the first version of Google Desktop Search can troll visited web pages, but only those sitting in the IE cache. Those of us using Firefox or other browsers are out of luck. Well, I couldn't wait, so I dusted off an earlier proxy project and turned it into a local proxy that writes the web pages I view in Firefox out to the filesystem. Once they're exported with .html extensions, Google indexes them.
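For anyone who wants to try the idea, here is a minimal sketch of such a proxy in Python. It is not the proxy described above, just an illustration under some loud assumptions: the folder name `viewed_pages`, the port 8080, and the helper names `filename_for`, `save_page`, and `run` are all made up for this example. It handles plain HTTP GET requests only (HTTPS would need CONNECT support), and only responses served as `text/html` are written out.

```python
import hashlib
import re
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

# Hypothetical folder; point the desktop indexer at wherever this lives.
ARCHIVE_DIR = Path("viewed_pages")


def filename_for(url: str) -> str:
    """Build a filesystem-safe, unique name ending in .html from a URL."""
    slug = re.sub(r"[^A-Za-z0-9]+", "_", url).strip("_")[:80]
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()[:8]
    return f"{slug}_{digest}.html"


def save_page(url: str, body: bytes, out_dir: Path = ARCHIVE_DIR) -> Path:
    """Write the page body to out_dir so a file indexer can pick it up."""
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / filename_for(url)
    path.write_bytes(body)
    return path


class ArchivingProxy(BaseHTTPRequestHandler):
    def do_GET(self):
        # A browser talking to an HTTP proxy sends the absolute URL as the path.
        if not self.path.startswith("http://"):
            self.send_error(400, "Only absolute http:// URLs are supported")
            return
        try:
            with urllib.request.urlopen(self.path) as upstream:
                body = upstream.read()
                content_type = upstream.headers.get("Content-Type", "")
                status = upstream.status
        except Exception:
            # Includes upstream 4xx/5xx responses; a real proxy would relay them.
            self.send_error(502, "Upstream fetch failed")
            return
        if content_type.startswith("text/html"):
            save_page(self.path, body)
        self.send_response(status)
        self.send_header("Content-Type", content_type or "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8080) -> None:
    """Serve on localhost only; set the browser's HTTP proxy to 127.0.0.1:port."""
    ThreadingHTTPServer(("127.0.0.1", port), ArchivingProxy).serve_forever()
```

Calling `run()` starts the proxy. The hash suffix in the filename keeps two long URLs that share a prefix from overwriting each other, and revisiting the same URL simply replaces the earlier copy.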

My solution is just a quick hack. It would need some refinement to be generally useful. But it's such an obvious idea that I'm mentioning it here in case -- LazyWeb-style -- there's a better solution already available.

Everybody seems to have a different reason to care about Google's desktop search tool. For me, it's finding things I've seen on web pages. That's huge. My proxy project was aiming to do something more ambitious -- convert pages to XHTML (where possible) and enable structured search within them. But plain old fulltext search that encompasses the pages I've read -- whether I got there by way of Google, or A9, or a referral from a web page or email or word-of-mouth -- is a killer app for me.

By the way, this same approach works for any datatype that can be exported as text to the filesystem. At one time, for example, I was exporting my Outlook mailbox for use with a homegrown indexer. That collection of files was lying around, forgotten, and Google indexed them. The same method would work for a mail app not supported by Google -- or indeed for any app. Dave Winer is absolutely right to point out that open architecture is a requirement, and that it must be possible "to write a plug-in that teaches it how to index formats it doesn't understand." In theory it shouldn't be necessary to reflect content into the filesystem, as text, in order to expose it to search. But in practice that's often easy to do, and it can deliver a big payoff for not much effort.
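As a rough illustration of the export-as-text approach for mail, the sketch below uses Python's standard `mailbox` module to write each message to its own .txt file. It assumes the mail is already available in mbox format (an Outlook store would first need a separate conversion that isn't shown here), and the function names and the `message_NNNNNN.txt` naming scheme are invented for this example.

```python
import mailbox
from email.message import Message
from pathlib import Path


def plain_text_of(msg: Message) -> str:
    """Concatenate the text/plain parts of a message, skipping everything else."""
    chunks = []
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            chunks.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label: fall back to UTF-8 rather than fail.
            chunks.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(chunks)


def export_mbox(mbox_path: str, out_dir: str) -> int:
    """Write one text file per message; return how many were written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    box = mailbox.mbox(mbox_path)
    count = 0
    try:
        for i, msg in enumerate(box):
            # A few headers up top so searches on sender or subject also hit.
            header = "".join(
                f"{name}: {msg.get(name, '')}\n"
                for name in ("From", "Date", "Subject")
            )
            target = out / f"message_{i:06d}.txt"
            target.write_text(header + "\n" + plain_text_of(msg), encoding="utf-8")
            count += 1
    finally:
        box.close()
    return count
```

A call such as `export_mbox("mail.mbox", "exported_mail")` (both paths hypothetical) leaves a folder of plain text files that any filesystem indexer can crawl. Attachments and HTML-only bodies are ignored in this sketch.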

Update: Claus Dahl pointed me to a better solution. Ken Schutte's Slogger is a Firefox extension that saves your viewed pages -- and comes with a toolbar button so you can turn the faucet on or off at will. Nice! Of course you can use any indexer on the output, not just Google's.

Update 2: Jacques Surveyer has an excellent summary of alternatives (including several free ones) to Google's desktop search.


Former URL: http://weblog.infoworld.com/udell/2004/10/15.html#a1096