Tangled in the Threads
Jon Udell, Sept 1, 1999
The O'Reilly Open Source Convention
It used to be called simply the Perl Conference. The first was in 1997, and the second was last year, both in San Jose, California. This year it morphed into something much bigger: a multi-track Open Source Convention featuring parallel programs for:
- open source business issues
The regular conferences on Monday and Tuesday were preceded by a weekend of tutorials. I got there in time to catch Mark-Jason Dominus' "Tricks of the Wizards". It was all about globs, ties, closures, and other stuff that -- frighteningly -- I more or less understand after years of Perl apprenticeship. What I liked most, though, was the state-machine example. The machine in question was an NNTP server, and the states modelled were those involved in the NNTP client/server command protocol. The state machine was represented as -- what else? -- a Perl hashtable. A very clean, elegant demonstration of how Perl likes to blur the boundary between code and data, and why that's useful.
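The table-driven idea can be sketched in a few lines. This is a minimal illustration in Python rather than Perl, and it is not the code from the tutorial: the state names, the handful of commands, and the helpers `step` and `run` are all assumptions made up for the example.

```python
# Hypothetical, heavily simplified NNTP-like session. The state and command
# names are illustrative only and are not taken from the tutorial.
TRANSITIONS = {
    "connected": {"GROUP": "group_selected", "QUIT": "closed"},
    "group_selected": {
        "GROUP": "group_selected",
        "ARTICLE": "group_selected",
        "QUIT": "closed",
    },
    "closed": {},
}


def step(state, command):
    """Return the next state, or raise if the command is not allowed here."""
    try:
        return TRANSITIONS[state][command]
    except KeyError:
        raise ValueError(f"{command!r} not valid in state {state!r}") from None


def run(commands, state="connected"):
    """Feed a sequence of commands through the table; return the final state."""
    for command in commands:
        state = step(state, command)
    return state
```

The protocol lives entirely in the `TRANSITIONS` table, so adding a command means editing data rather than control flow, which is the blurring of code and data that the tutorial example showed off.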
On Monday I started out in the Linux keynote, given by Michael Tiemann of Cygnus Solutions. Cygnus was the first open source software company; Tiemann tells the story in a chapter of the book Open Sources: Voices from the Open Source Revolution. In his keynote, Tiemann brought his perspective to bear on the Linux phenomenon. He started, in a roundabout way, by discussing Guile, a Scheme-based scripting language that Cygnus once planned to commercialize, but that -- when the Java steamroller came along -- it instead decided to release as open source.
Nowadays Guile is used to extend GIMP, and is also the official scripting language of the GNU project. How does this relate to Linux? Tiemann pointed out that tools such as emacs and gdb, each with their own special-purpose script engines, failed to exploit "Metcalfe's law," which states that the value of a network grows roughly as the square of the number of people connected to it. The availability of Guile in the open-source realm, he argued, catalyzed the development of GIMP as a modular, scriptable tool.
In a similar manner, Tiemann argued that Linux will catalyze the next wave of computing, by providing a standard framework for modularization and innovation. What is that next wave? In his view, it's the post-PC world of handheld and embedded devices. Cygnus is betting heavily that its ability to move the GNU tools into this space will pay off. Today, says Tiemann, embedded processors outsell processors in conventional computers by 3 to 1, and that will soon be 10 to 1. The vision that Novell's Bob Frankenberg had, 5 years ago, of "pervasive computing" is -- Tiemann thinks -- now about to become real. He thinks Linux is far likelier to scale down effectively than Windows. There's evidence that he's right.
I skipped out of Tiemann's keynote to catch the end of Larry Wall's "State of the Onion" address. A Larry Wall talk is a bit like a Grateful Dead concert: there's nothing else like it, but it's hard to describe why. Larry's speeches at these conferences are always thematic. The first year's talk was woven around a selection of sound effects, including some antique bits from "F Troop" (like I said, it's hard to describe). The second year's theme was simple geometric shapes -- circles, squares, trapezoids. This year's theme was chemistry, so the shapes were much more complex -- inorganic and organic molecules. (It turns out that Larry was, among many other things, a serious student of chemistry.) What's this got to do with Perl? Well, some molecules turn out to have the kind of lumpy irregularity and gluelike properties that (some of us) cherish in Perl. Some are bound together by opposing forces, in a barely-stable configuration that might remind you of the relationship between the open-source and commercial software worlds. I dunno. It all makes sense when Larry says it...
Xanadu Released as Open Source Project
The most remarkable and strange event at the conference was the appearance of Ted Nelson and his team, who demonstrated and released Xanadu, the fabled hypertext system that (indirectly) inspired the World Wide Web.
What is Xanadu?
It's a more complex and sophisticated notion of hypertext than we see on the Web today. It specifies its own globally-unique addressing system, in which addresses look like dotted IP addresses but are really a notation called "tumblers." The address space is infinitely growable. And a storage system based on this system works, in effect, like a giant write-once disk. It records all edits, and can backtrack through all changes.
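To get a feel for why a dotted address space can keep growing, here is a rough sketch. It is not real tumbler arithmetic: it assumes only that an address is a sequence of integers, that ordering is component by component, and that new addresses can be created beneath any existing one. The helper names `parse`, `child`, and `is_under` are invented for the example.

```python
def parse(text):
    """Turn a dotted string such as '1.2.5' into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def child(address, n):
    """Address of the n-th item created beneath an existing address."""
    return address + (n,)


def is_under(address, prefix):
    """True if address lies in the subtree rooted at prefix."""
    return address[:len(prefix)] == prefix


# Tuples compare component by component, so 1.10 sorts after 1.9
# (a plain string comparison would get this wrong).
assert parse("1.9") < parse("1.10")

# There is always room for more addresses beneath any existing address.
doc = parse("1.2")
assert doc < child(doc, 1) < child(doc, 2) < parse("1.3")
```

Unlike a fixed-width IP address, nothing here limits the number of components or the size of any component, which is one simple reading of "infinitely growable."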
Links go two ways in Xanadu, unlike the Web. The system knows about, and can report, inbound links to a region of a document. Outbound links can go to more than one place. Links can overlap. If you split a region that is a link into parts, and separate the parts, each part retains the property of linking to whatever the link points at.
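The link properties just described can be modelled in miniature. The sketch below is an assumption-laden toy, not Xanadu's data model: the `Span`, `Link`, and `LinkStore` names are made up, documents are identified by plain strings, regions are character offsets, and lookups are a linear scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    doc: str
    start: int  # inclusive offset
    end: int    # exclusive offset

    def overlaps(self, other):
        return (self.doc == other.doc
                and self.start < other.end and other.start < self.end)


@dataclass(frozen=True)
class Link:
    source: Span
    targets: tuple  # one link may point at several spans


class LinkStore:
    def __init__(self):
        self.links = []

    def add(self, source, *targets):
        link = Link(source, tuple(targets))
        self.links.append(link)
        return link

    def outbound(self, region):
        """Links whose source overlaps the region."""
        return [link for link in self.links if link.source.overlaps(region)]

    def inbound(self, region):
        """Links with at least one target overlapping the region."""
        return [link for link in self.links
                if any(t.overlaps(region) for t in link.targets)]

    def split(self, link, offset):
        """Cut a link's source at offset; both halves keep every target."""
        s = link.source
        if not s.start < offset < s.end:
            raise ValueError("offset must fall inside the link's source span")
        self.links.remove(link)
        left = self.add(Span(s.doc, s.start, offset), *link.targets)
        right = self.add(Span(s.doc, offset, s.end), *link.targets)
        return left, right
```

Because the store indexes links rather than embedding them in documents, asking "what points here?" is as natural as asking "where does this point?", and two links may cover overlapping source regions without conflict.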
Xanadu supports transclusion, which is inclusion by reference, such that the included stuff appears in a containing context but really is the same stuff -- with all its inbound and outbound links -- as exists in the original (that is, canonical) location.
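Inclusion by reference is easier to see in code than in prose. In this minimal sketch, which assumes a single shared content pool and invents the names `Pool`, `render`, and `documents_sharing`, a document is just a list of spans, and a transcluded passage is the same span appearing in more than one list.

```python
class Pool:
    """Append-only content pool; text is stored once and never rewritten."""

    def __init__(self):
        self.text = ""

    def append(self, s):
        start = len(self.text)
        self.text += s
        return (start, len(s))  # a span: (offset, length)


def render(pool, spans):
    """Assemble a document from the spans it refers to."""
    return "".join(pool.text[start:start + length] for start, length in spans)


def documents_sharing(span, documents):
    """Names of the documents that include this exact span."""
    return sorted(name for name, spans in documents.items() if span in spans)


pool = Pool()
quote = pool.append("stately pleasure-dome")
original = [pool.append("A "), quote]
essay = [pool.append("He wrote of a "), quote, pool.append(".")]

assert render(pool, original) == "A stately pleasure-dome"
assert render(pool, essay) == "He wrote of a stately pleasure-dome."
```

The essay never copies the quoted words. It holds the same span as the original, so anything attached to that span, such as links, is visible from both contexts.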
The Xanadu architecture
It's straightforward and familiar. The server talks TCP to clients; the protocol is ASCII and typeable though a bit more cryptic than typical HTTP requests. The client can create versions, insert stuff, create links, follow links, and run a compare operation described in the FeBe (front-end, back-end) spec.
The demo client, Pyxi (for Python Xanadu interface), uses this command set to talk to the "Green" implementation of the server. There are actually two servers released -- "Green" (circa 1979-88), written in C, and "Gold," written in a mixture of ParcPlace Smalltalk and C++ (actually, it's C++ mechanically generated from Smalltalk). "Gold" (circa 1988-95) is said to be the more advanced implementation, but not really usable for development due to the Smalltalk entanglement. So it's Green that Xanadu wants people to try out.
The backend handles all heavy lifting -- it handles transclusion, linking, and storage. The server is small and simple -- the whole Green distribution is under 500K -- and you might wonder how such a thing can possibly be claimed to handle huge quantities of data. I haven't had a chance to try -- and even the Xanadu guys say there's been no serious testing for about a decade (!) -- but there are two key points that support the notion that it could work:
- The write-once architecture. The data store is just a big file, but the system only ever needs to append because -- remember -- it is obsessively recording every change to every document. (Not coincidentally, Ted Nelson himself obsessively records everything happening around him on videotape.) So this simplifies the transactional requirements a lot.
- Tumbler arithmetic. The operation "find me all the links that point to this region" is, Nelson claims, highly efficient because this kind of search is accomplished by mathematical operations on the tumbler addresses. I'm not competent to evaluate that claim, but now that the once-guarded "secrets" are public, we'll presumably soon know for sure. In principle, this means that these kinds of search operations will not be memory-bound even for large data sets.
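The first of these points, the append-only store, is simple enough to sketch. The toy below is not how Green stores data: it assumes a single document, an in-memory log instead of a file, and the invented class name `AppendOnlyDocument`. It shows only how recording every change makes backtracking a matter of replaying a prefix of the log.

```python
class AppendOnlyDocument:
    """Keeps every edit in a log; nothing is overwritten or deleted."""

    def __init__(self):
        # Entries are ("insert", pos, text) or ("delete", pos, length).
        self.log = []

    def insert(self, pos, text):
        self.log.append(("insert", pos, text))
        return len(self.log)  # version number after this edit

    def delete(self, pos, length):
        self.log.append(("delete", pos, length))
        return len(self.log)

    def text_at(self, version=None):
        """Replay the first `version` edits (all of them by default)."""
        edits = self.log if version is None else self.log[:version]
        text = ""
        for op, pos, arg in edits:
            if op == "insert":
                text = text[:pos] + arg + text[pos:]
            else:
                text = text[:pos] + text[pos + arg:]
        return text


doc = AppendOnlyDocument()
v1 = doc.insert(0, "hello world")
v2 = doc.delete(5, 6)
v3 = doc.insert(5, ", Xanadu")

assert doc.text_at() == "hello, Xanadu"
assert doc.text_at(v1) == "hello world"  # any earlier version is recoverable
```

Since writers only ever append, there is no in-place update to protect, which is the sense in which the transactional requirements get simpler.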
Does it work?
Sort of. In the demo, some things went wrong -- links that were broken and rearranged did the right thing, more or less, but were slightly damaged in ways that the Xanadu team described as being "back-end" bugs rather than problems with the demonstration front-end. Nonetheless the demo was compelling: transclusion, bidirectional linking, and extremely rich difference analysis are things that -- as long claimed -- it actually does appear to do.
How do you federate Xanadu systems? Not by qualifying tumblers with IP addresses. The whole Xanadu address space is intended to be an alternative -- and vastly larger -- space than that of IP. Once "properly installed," two Xanadu servers cooperatively manage their respective regions of that space. Does this actually work now? Well no, the team admits, the code to do that "isn't written yet."
What's it good for?
Nelson's vision is of a world in which we all own our content, and license it for transclusion into other contexts. Documents can acquire annotations by way of inbound linking. This might, or might not, be what the Web really ought to become.
In a more practical vein, Xanadu will -- I hope -- spur the development of real solutions to vexing and mundane problems like version control and collaborative editing. Nothing short of a radically different storage model and editing model will suffice, and Xanadu is those things. We all need to move beyond version control technologies that depend on filenames, directory structures, and text pattern matching. Xanadu knows about each element of a document, uniquely, down to the granularity of one byte, and tracks the operations on those elements forever. "Think of it as a disk with very unusual properties," says Ted Nelson. Indeed. I think I could use one of those.
Responses to the Xanadu report
Here are some of the many responses to this report, the original version of which I posted to my newsgroup from the conference:
David Durand: I liked your review, because you saw through to the most important thing, which is a vision of many important needs that current version control and hypertext don't meet. The ultimate question of how we implement these systems really matters less than the complete set of functional capabilities, and Xanadu has long presented a clear vision of those needs.
Amy Wohl: Ted Nelson and Xanadu is an important part of the web's heritage and of the history of how we manage documents and document collaboration. This event (making the Xanadu code Open Source) deserved trumpets and heralds. At least we can try to insure that it receives the full discussion it deserves.
Anonymous Coward: Let's hope that it takes less than 20 more years to get to the point where the servers are networkable.
Todd Blume: I think that it is best to think of Xanadu as a body of ideas expressed partially in a body of code. The best way to preserve the legacy of these ideas is to study them in all of their forms, and incorporate them in the process of evolving Internet standards where they fit in. In many ways Xanadu has already captured the hearts and minds of the next generation of standards creators.
Ted Nelson wrote an article which appeared in the January 1988 issue of Byte magazine: "Managing Immense Storage -- Project Xanadu provides a model for the future of mass storage" (Page 225).
Reading this should put you in a better position to evaluate the claim. The article has a detailed description of tumblers and tumbler arithmetic.
Believe it or not, I have that issue in hardcopy, and I unboxed it recently when Lawrence Lee mentioned Xanadu on Tomalak's Realm.
As I said to Mr. Lee, it has always bugged me that people associate Nelson with hypertext, but no attention is paid to the tumbler, which I consider to be one of the slickest pieces of code created by man. Maybe Xanadu's time has come at last.
Harvey Hahn: Probably the authoritative source of information on tumblers in particular and Xanadu in general is Ted Nelson's "hypertext" book "Literary Machines" (which went through several editions from around 1979 to 1992, called "Edition 93.1") -- the BYTE article is basically just a summary of some of the ideas in there. Ted Nelson has been working on the Xanadu project for some 30-40 years now. I've been following it (and have read all the published material on it) for the past 25 years or so. Nelson's "Computer Lib/Dream Machines" (published 1974, republished by Microsoft 1987) was a sort of manifesto for a lot of his ideas, many of which have subsequently come to pass.
C. Scott Ananian: It looks like extracting the features of interest from the supplied code will be a nightmare. This is what killed Netscape/Mozilla: crufty old hard-to-understand code base.
Chuck the old code and write up a comprehensive description of what the system was supposed to do, the datastructures to enable it, and the protocols for communication. Then reimplement in (your choice here of modern OO language).
Have several reimplementations. Share protocol, not implementation. See what works and what doesn't.
That's the only way to get these ideas out there, I think. Writing up an RFC-style document would be a good start.
I enjoyed your article on Xanadu, I hadn't thought about it in years. One thing you might want to mention if you do any followup is that the storage system you describe is actually implemented and in widespread use: you're basically describing the POSTGRES no-overwrite storage manager, developed at Berkeley by Mike Stonebraker, and now in widespread use in open-source land as PostgreSQL.
Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He's now an independent Web/Internet consultant, and is the author of Practical Internet Groupware, forthcoming from O'Reilly and Associates.
This work is licensed under a Creative Commons License.