A few weeks back I mentioned Peter Wayner's new book on Translucent Databases. Simson Garfinkel writes about it at greater length in this oreillynet.com article:
For example, what if a police department needs to build a database of sexual-assault victims that lets them identify trends but hides personal information? You could use a translucent database where the first column is the hash of the victim's name, and the second column is a hash of their full address, and the third column is a hash of their block and street. You can now group incidents together by grouping entries with identical block hashes; you can see if the incidents refer to the same person by checking to see if those hashes are different.
What's great about Peter's approach is that it's really quite low-tech. Just straightforward Java code that generates SQL statements that make judicious use of MySQL's MD5 function. It's the kind of thing that's hard to think of, but easy to do.
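To make that concrete, here is a minimal sketch of the three-column scheme from the quote above. It is not code from Peter's book: the table name `incidents`, the column names, and the `normalize` helper are all assumptions invented for illustration. The SQL strings lean on MySQL's MD5 function, as described; the `md5Hex` method computes the same kind of 32-character hex digest on the Java side, which is handy for checking that two inputs would land in the same group.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Locale;

final class TranslucentSketch {

    // Hypothetical table and column names. The database stores only hashes,
    // so a reader of the table never sees a name or an address.
    static final String INSERT_SQL =
        "INSERT INTO incidents (name_hash, address_hash, block_hash) "
        + "VALUES (MD5(?), MD5(?), MD5(?))";

    // Incidents on the same block share a block_hash, so they group together.
    static final String GROUP_BY_BLOCK_SQL =
        "SELECT block_hash, COUNT(*) AS incident_count "
        + "FROM incidents GROUP BY block_hash";

    // Assumed helper: hashes only match if the input text matches exactly,
    // so collapse whitespace and case differences before hashing.
    static String normalize(String s) {
        return s.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", " ");
    }

    // MD5 of the UTF-8 bytes, rendered as lowercase hex.
    static String md5Hex(String input) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Two spellings of the same block normalize to the same hash.
        String a = md5Hex(normalize("100 block of Main St"));
        String b = md5Hex(normalize("  100 Block of  MAIN St "));
        System.out.println(a.equals(b));
        System.out.println(INSERT_SQL);
        System.out.println(GROUP_BY_BLOCK_SQL);
    }
}
```

In real use the `?` placeholders would be bound through a JDBC `PreparedStatement`, with each value normalized first so that trivially different spellings of the same block or name still hash to the same value.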
Former URL: http://weblog.infoworld.com/udell/2002/08/03.html#a363