Jon Udell: Remaking the P2P Meme

Tangled in the Threads
Jon Udell, September 21, 2000
Remaking the P2P Meme

It's not just about Napster and piracy

Jon returns from the O'Reilly P2P Summit Conference with more thoughts on why peer-to-peer technology has everyone so excited, and what it's going to help us achieve.

Tim O'Reilly used to be known as an author, editor, publisher, and CEO. A couple of years ago, he added a new job title to the list: meme makeover artist. At that time, free software was an emerging meme. Insiders knew that free software powered the Internet, but nobody had articulated just how and why that was so. Outsiders, meanwhile, were mostly clueless. The notion that the Internet works because of what we now call the "open source process," rather than despite it, was counter-intuitive. It didn't have the kind of viral quality that enables a meme to propagate. It needed a makeover. In particular, the "free" in "free software" needed to be reinterpreted. The rhetoric of free software advocates had, until then, focused on political, moral, and ideological themes. Freedom meant the ability to freely explore ideas, express them in software, and share that software -- not the ability to get something for nothing. That was fine for the core community, but it allowed outsiders and the press to position free software as unreliable, and the free software movement as radically countercultural. That was exactly the wrong message for mainstream businesspeople and IT people. They needed to know how free software made economic and logistical sense.

So Tim convened a summit conference, and helped reshape the "free software" meme into a new "open source software" meme. The word "free" disappeared from the new formulation, because it was just too overloaded. But the idea of freedom remained, albeit defined in different and more pragmatic ways. Open source meant freedom from buggy software, vendor lock-in, and traditional distribution channels. And it meant freedom to collaborate across corporate boundaries, share the costs of infrastructure development, and focus on the delivery of value-added services.

The open-source meme also reconnected with the roots of the Internet. Openness wasn't a strange new idea. It was fundamental to the architecture of the Internet. It was the way things always had been. The Internet, just by virtue of existing, proved that openness works.

Time for another meme makeover

Now there's a new meme forming around the idea of peer computing. Once again, the first version is misleading and incomplete. But that hasn't stopped it from spreading this time around. Napster has put the power of peer computing onto everyone's radar screen. And it's linked peer computing to piracy. Everybody "knows" that peer computing enables superdistribution of intellectual property, and spells doom for the record industry, for Hollywood, and for publishers of all kinds.

Once again, it was time for a meme makeover. So last week, Tim convened another summit conference. He invited the developers of the headline-making projects that use peer technology to superdistribute content: Napster, Gnutella, and Freenet. But he also invited people from companies that use peer computing in quite different ways. Popular Power, for example, is a commercial system which, like SETI@home and distributed.net, regards peer computing as a way to assemble massive computing resources to solve big problems. IBM is working with Microsoft and Ariba on a project called UDDI (Universal Description, Discovery, and Integration) which aims to standardize business-to-business web services. Eazel is building a next-generation Linux desktop that connects users to a cloud of services. Jabber has created an open and standard instant-messaging system that's becoming a general-purpose way to distribute XML-formatted metadata. Groove Networks is a company founded by the creator of Lotus Notes to "help people communicate in new ways." Intel has spearheaded its own effort to assess, and help to develop, P2P technologies that it believes are essential both to the Net economy and to its own internal operations.

It's very clear that none of these projects has anything to do with piracy. It's not so clear what they have to do with one another. The meme makeover that transformed "free software" into "open source software" mainly involved the repackaging of existing, well-understood software projects into a new conceptual framework. Many of them, including bind, sendmail, Apache, and Perl, fit comfortably into a single category: legacy Internet infrastructure.

This time around, it's not so easy. P2P projects are only now emerging in the wake of Napster. They come in many different flavors, and they embody all sorts of peer relationships. Some, like Napster, look like peer-to-peer technology to the user, but really depend on centralized servers to coordinate the interactions among peers. Others, like the IBM/Microsoft/Ariba effort, define a middleware realm in which servers interact in a peer-to-peer fashion. Still others, notably Freenet, aim for a "pure" P2P model with no central point of control.

More lessons from the past

One of the first things we concluded at the summit is that the open-source meme makeover didn't exhaust all of the lessons that we can learn from the Net's legacy infrastructure. P2P is, after all, the fundamental architecture of Internet communication. TCP/IP doesn't have a concept of clients and servers; it only thinks in terms of communicating endpoints. It's true that the Internet applications layered on top of TCP/IP tend to have a client/server flavor. They come in pairs: a browser talks to a web server, a mailreader talks to a mail server, and so on. But it's also true that peering among servers is one of the oldest strategies the Internet has used to achieve its spectacular scalability and reliability.

Consider routing. In a very small TCP/IP network, any node can directly contact any other node. But at a certain scale, this becomes administratively infeasible. Instead, a node sends packets to its gateway, which is a router that belongs to a peer network. The gateway consults its network of peers to find out how to move the packets closer to the destination.
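To make the hop-by-hop idea concrete, here is a minimal sketch in Python. It is a toy, not a model of any real routing protocol: the router names, network names, and tables are all hypothetical. Each router knows only the next hop toward a destination network, and a packet reaches its target because peers hand it along.

```python
# Hypothetical topology: every name below is invented for illustration.
# Each router knows only the next hop toward a destination network.
ROUTING_TABLES = {
    "gateway-a": {"net-b": "router-x", "net-c": "router-x"},
    "router-x": {"net-b": "gateway-b", "net-c": "gateway-c"},
    "gateway-b": {},
    "gateway-c": {},
}

# Which network each gateway serves directly.
LOCAL_NETWORKS = {"gateway-a": "net-a", "gateway-b": "net-b", "gateway-c": "net-c"}


def route(start, destination_net, max_hops=8):
    """Return the list of routers a packet visits on its way to destination_net."""
    path = [start]
    current = start
    for _ in range(max_hops):
        if LOCAL_NETWORKS.get(current) == destination_net:
            return path  # delivered: this gateway serves the destination
        next_hop = ROUTING_TABLES[current].get(destination_net)
        if next_hop is None:
            raise LookupError(f"{current} has no route to {destination_net}")
        path.append(next_hop)
        current = next_hop
    raise RuntimeError("hop limit exceeded")


print(route("gateway-a", "net-c"))  # ['gateway-a', 'router-x', 'gateway-c']
```

No single router in this sketch holds a map of the whole network; each one only moves the packet one step closer.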

The same is true of email. There's no single all-powerful mail server that knows how to reach everyone. Rather, there's a peer-to-peer network of mail servers that cooperatively route emails to their destinations.

From this perspective, P2P doesn't look like a strange new phenomenon. It looks like an old familiar friend. Or rather, a forgotten friend. The reason we've forgotten it is that in its original state of grace, the Internet didn't just have peer networks of routers or mail servers. Every endpoint had full P2P status as well. Any pair of endpoints could open a socket and talk directly, back and forth, just between themselves.

Clay Shirky, a writer for FEED and Business 2.0 who attended the summit, points out that the Internet has fallen from that original state of grace. The vast majority of nodes that have appeared on the Internet in the last five years are not able to talk directly with other nodes. Firewalls and NATs stand in the way of P2P connections. So does dynamic IP addressing. For one or another of these reasons, sometimes both, many new nodes -- in particular, most nodes that correspond to desktop PCs -- can't be contacted directly by other nodes.

From this perspective, the breakout popularity of instant messaging and Napster is only a rediscovery of a latent capability of the Internet, which naturally enables TCP/IP applications -- and their users -- to speak directly to one another. The solution is just another kind of routing. And ironically, in this case, the routing function is centralized, and doesn't itself form a peer network. Both instant messaging (IM) and Napster rely, at least so far, on single gateways, not on networks of gateways. These gateways broker connections among nodes.
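The brokered pattern can be sketched in a few lines of Python. This is an illustration of the general idea only, not Napster's or any IM service's actual protocol; the Broker and Peer classes, the peer names, and the file name are assumptions made up for the example. The central service knows who has what, but the content itself moves between peers.

```python
class Broker:
    """Hypothetical central directory: knows who has what, holds no files itself."""

    def __init__(self):
        self.index = {}  # title -> set of peer names

    def register(self, peer_name, titles):
        for title in titles:
            self.index.setdefault(title, set()).add(peer_name)

    def lookup(self, title):
        return sorted(self.index.get(title, ()))


class Peer:
    """Hypothetical endpoint that stores files and serves them to other peers."""

    def __init__(self, name, files):
        self.name = name
        self.files = dict(files)  # title -> content

    def serve(self, title):
        return self.files[title]


broker = Broker()
peers = {p.name: p for p in (Peer("alice", {"song.mp3": b"AAA"}), Peer("bob", {}))}
for p in peers.values():
    broker.register(p.name, p.files)  # iterating a dict yields its titles

holders = broker.lookup("song.mp3")          # step 1: ask the central broker
data = peers[holders[0]].serve("song.mp3")   # step 2: fetch directly from that peer
peers["bob"].files["song.mp3"] = data
print(holders, data)  # ['alice'] b'AAA'
```

The broker is a single point of coordination, which is exactly why this arrangement is a hybrid rather than a peer network of gateways.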

Presence management, instant intranets

The Napster connection is device-specific. It's not so much me, the user, that you want to connect to, but rather the PC that has all those juicy MP3 files on it. In the case of IM, something different is going on. You don't care if it's my notebook computer that you contact, or my desktop computer. It's me that you want. The fact that I am present, on one or another of those devices, is the interesting thing.

This capability, which has come to be called "presence management," is one of the genuinely new developments that make P2P exciting. It's so handy that businesses are finding their users flocking to IM services. That's problematic for a couple of reasons, though. Businesses don't like sending users to external services to do internal, and often confidential, communication. What's more, they're seeing "presence management" as a core competency. It's a fundamental business need to be able to find and contact people in realtime. Email doesn't do that, and increasingly, neither does the telephone. So IM is starting to look like more than just an interesting application. It's starting to look like a platform for a variety of applications -- things like customer service, supply-chain management -- that depend on effective presence management. Jabber, which creates an open-source platform for IM-enabled applications, is poised to exploit this opportunity.
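What presence management tracks can be sketched as a small registry keyed by person rather than by machine. The Python below is a hypothetical illustration, not Jabber's or any IM vendor's API; the class and method names are invented. The caller asks for a person and is told which device, if any, that person is currently on.

```python
class PresenceService:
    """Hypothetical registry mapping a person to the devices they are signed in on."""

    def __init__(self):
        self.online = {}  # user -> list of devices, most recent sign-in last

    def sign_in(self, user, device):
        devices = self.online.setdefault(user, [])
        if device in devices:
            devices.remove(device)  # re-signing in moves the device to the end
        devices.append(device)

    def sign_out(self, user, device):
        devices = self.online.get(user, [])
        if device in devices:
            devices.remove(device)

    def is_present(self, user):
        return bool(self.online.get(user))

    def locate(self, user):
        """Return the most recently used device, or None if the user is away."""
        devices = self.online.get(user)
        return devices[-1] if devices else None


presence = PresenceService()
presence.sign_in("jon", "desktop")
presence.sign_in("jon", "notebook")
print(presence.locate("jon"))      # notebook
presence.sign_out("jon", "notebook")
print(presence.locate("jon"))      # desktop
```

An application such as customer service would call something like locate() before deciding whether to open a chat or fall back to email.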

In this case, the peers in the P2P relationship are people. Technically, the architecture is a hybrid -- it brokers connections through a centralized routing service. It's possible that this centralized architecture might turn into another kind of P2P network, for the same reason that the Internet itself had to evolve a P2P routing architecture. That could happen because IM, and the applications layered on top of it, are going to want to be able to model more complex social interactions. The distinction between networking, in the computer sense, and networking, in the social sense, is going to erode. The key point here isn't the technical architecture of P2P. It's that P2P connects people to people, in the right ways, to solve specific -- and often business-relevant -- communication problems. How P2P does that is something people shouldn't care about. In many cases, it's pointless to talk about a "pure" P2P architecture that has no central point of control. Centralization and decentralization are just tools. We'll typically use them both, in combination, to create applications that help people and groups work together more effectively.

There are, however, situations in which the "pure" architecture with no central point of control is useful. We all know how IT can become a bottleneck in an organization. You want to create an ad-hoc network to facilitate some project, but nobody in IT has the time to deploy and configure the centralized support. In these cases, it's handy to be able to use P2P in its architecturally pure mode. Freenet, for example, has raised a lot of eyebrows because it looks like an anonymous data haven built in the service of an anti-authoritarian ideological agenda. And indeed, that's just what it is. But Freenet is also just an open-source tool, and like any tool, it can be used for many different purposes. Imagine an instance of Freenet that lives only on your corporate intranet. There is, in Freenet, a developing notion of ad-hoc subspaces in which teams can collaborate on project-specific work. These subspaces will form purely by consensus, requiring no central coordination. Whether it emerges on Freenet or on other P2P platforms, the ability to form this kind of ad-hoc, project-specific intranet is, like IM, one of the killer applications for P2P.

Device peering and distributed caching

So far, we've talked about cases where the peers really are people. But sometimes, the P in P2P really does stand for peer device. Of course, that can mean many different things. Sometimes, it means the network of peer devices that are all mine. Synchronization between a PC and a PDA is the well-known case of device peering. But the trend is for every device to become an active partner on the network. Wholesale synchronization might not always be necessary or appropriate. If my PDA knows something that my PC doesn't, and the two devices can contact one another, then the PC should be able to ask the PDA for that piece of information, just as a Napster user asks another user for a song.
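The difference between wholesale synchronization and asking a peer on demand can be shown with a short Python sketch. The Device class, the device names, and the sample record are all hypothetical. The PC copies nothing in advance; it simply queries its peer when its own records come up empty.

```python
class Device:
    """Hypothetical personal device holding some records and knowing its peers."""

    def __init__(self, name, records):
        self.name = name
        self.records = dict(records)
        self.peers = []

    def query(self, key):
        """Answer from local data only; peers call this on one another."""
        return self.records.get(key)

    def lookup(self, key):
        """Check local records first, then ask each known peer device."""
        value = self.query(key)
        if value is not None:
            return value, self.name
        for peer in self.peers:
            value = peer.query(key)
            if value is not None:
                return value, peer.name
        return None, None


pc = Device("pc", {})
pda = Device("pda", {"dentist": "555-0100"})
pc.peers.append(pda)

print(pc.lookup("dentist"))  # ('555-0100', 'pda')
```

After the lookup the PC's own records are still empty, which is the point: the information stays where it lives until somebody asks for it.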

Sometimes, the network of peer devices is a distributed cache. It's true that the major P2P file-sharing applications -- Napster, Gnutella, and Freenet -- have triggered a huge debate about theft of intellectual property. But it's important to note that the underlying technology of distributed caching has other, and distinctly business-relevant, uses. Consider movies. Certainly the studios don't look forward to the imminent Napsterization of movies. On the other hand, they are really interested in a better way to distribute the trailers that advertise movies. Currently, these are distributed by way of web sites, and studios absorb huge bandwidth costs in order to do that. Napster-like technology might enable studios to shift that bandwidth cost to the users who want to see those trailers. This raises an important point. As we regain our status as first-class citizens of the Net, we'll find that P2P brings new responsibilities and burdens along with empowerment.
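A rough sketch of how the bandwidth cost shifts is below, in Python. It is a deliberately simplified, hypothetical model (the Swarm class and the byte counts are invented, and it does not describe how Napster, Gnutella, or Freenet actually work): the studio's site serves the first viewer, and every later viewer is served by someone who already holds a copy.

```python
class Swarm:
    """Hypothetical distributed cache for one trailer."""

    def __init__(self, origin_content):
        self.origin_content = origin_content
        self.holders = []       # viewers that already cached the trailer
        self.origin_bytes = 0   # bytes served by the studio's own site
        self.peer_bytes = 0     # bytes served by other viewers

    def fetch(self, peer_name):
        size = len(self.origin_content)
        if self.holders:
            self.peer_bytes += size    # an earlier viewer supplies the copy
        else:
            self.origin_bytes += size  # nobody has it yet: the origin pays
        self.holders.append(peer_name)
        return self.origin_content


swarm = Swarm(b"x" * 1000)  # stand-in for a 1000-byte trailer
for viewer in ["v1", "v2", "v3", "v4", "v5"]:
    swarm.fetch(viewer)

print(swarm.origin_bytes, swarm.peer_bytes)  # 1000 4000
```

In this toy run the origin pays for one copy and the viewers collectively pay for the other four, which is the burden-shifting the paragraph above describes.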

What really matters, though, is that empowerment. It's not just the number of nodes in a network that makes the network powerful. It's the degree to which those nodes actively participate in the network. How can billions of people on the Net, and billions of devices, all be active partners? It's going to be a huge challenge. P2P is a set of architectures, technologies, and strategies that will help us meet that challenge.

Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He's now an independent Web/Internet consultant, and is the author of Practical Internet Groupware, from O'Reilly and Associates. His recent BYTE.com columns are archived at http://www.byte.com/index/threads

This work is licensed under a Creative Commons License.