The Web Is the Cloud's API

In 1997 I attended, and spoke at, the first Perl Conference, which two years later evolved into the Open Source Convention (OSCON). For me the most memorable part of the conference was a keynote given by my friend Andrew Schulman, entitled The Web is the API. At one point Andrew showed a slide with nothing on it but a UPS tracking URL. "This is amazing," Andrew said, "every UPS package has its own home page on the web!"

As Andrew and I both discussed in our respective talks, that home page wasn't just a document that interested people could view. It was also a chunk of data that interested machines could process. In 1997 it was often possible, but never easy, to disentangle that data from its surrounding HTML markup. Fifteen years later we've made a lot of progress. Many cloud services are now switch-hitters. When a person using a browser asks them to do something, they respond with HTML for humans to read and interact with. When a computer asks -- and by computer I mean a smartphone, a desktop PC, or a virtual machine running elsewhere in the cloud -- they respond with data for that computer to process. When services switch-hit on the data side, we say that they are providing and consuming APIs (application programming interfaces). Many services provide APIs for smartphones, PCs, and cloud VMs (virtual machines) to consume. This is a very good thing. But it still isn't quite what Andrew and I had in mind, nor what Andrew meant by The Web is the API.
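The switch-hitting described above can be sketched in a few lines of Python. This is only an illustration, not how any particular service works: it assumes the service decides between HTML and data by inspecting the HTTP Accept header, and the `render` function and the tracking record's field names are hypothetical.

```python
import json
from html import escape
from typing import Tuple


def render(resource: dict, accept: str) -> Tuple[str, str]:
    """Return (content_type, body) for one resource, chosen by the Accept header."""
    # Naive check for illustration; a real server would parse media ranges and q-values.
    if "application/json" in accept:
        return "application/json", json.dumps(resource, sort_keys=True)
    # Otherwise assume a person with a browser is asking, and answer with HTML.
    items = "".join(
        "<li>{}: {}</li>".format(escape(str(k)), escape(str(v)))
        for k, v in sorted(resource.items())
    )
    return "text/html", "<ul>{}</ul>".format(items)


# Hypothetical tracking record; the field names are assumptions made for this sketch.
package = {"id": "PKG-001", "status": "in transit"}

html_type, html_body = render(package, "text/html,application/xhtml+xml")
data_type, data_body = render(package, "application/json")
```

The same resource at the same address answers both kinds of caller, which is the sense in which a page can be a document for people and a chunk of data for machines.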

Today, APIs are the API. There are APIs for Amazon, Bing, Facebook, Flickr, Google, Netflix, TripIt, Twitter, and most of the other major web services you use. Given that bits and pieces of your personal cloud are stored in various services, a developer may want to create an app that interacts with more than one of them on your behalf. To do that, he or she must learn how to use each relevant API. Nowadays these APIs tend to share a common set of conventions, but each one is still its own protocol. So while there's a lot less data friction than there used to be, there's still a lot more than there needs to be.

The data behind all those APIs, of course, is sitting in databases of one kind or another. Can we imagine browsing those databases in the same standard way we browse websites? Tim Berners-Lee did. In the original proposal for the World Wide Web, back in 1990, he listed a number of "future paths" including:

A server automatically providing a hypertext view of a (for example Oracle) database, from a description of the database and a description (for example in SQL) of the view required.

That future is real today in some places. At Netflix's OData service, for example, you can browse the catalog as a switch-hitter. Starting with Genres, you can proceed to B-Horror Movies, one of which is Curse of the Voodoo, and then proceed to Curse of the Voodoo's directors, the only one of which is Lindsay Shonteff. What you see when you click these links will depend on how your browser displays documents written in the Atom feed format. By default most browsers display such documents in a friendly way. When you turn off that friendly display, you'll see the raw data document instead.

If you drill all the way down to an individual item of data, though, you'll get the same result either way. Lindsay Shonteff's name, for example, has "its own home page" in Netflix's web of data. Here is its URL:$value (try it!)

In this case the mechanism for "automatically providing a hypertext view" of the Netflix catalog is the Open Data Protocol (OData). It's one way, and I think a very good way, to make The Web is the API a true statement. Since OData is a Microsoft invention, and I work for Microsoft, you are entitled to be skeptical. But if you've followed my work and interests for the past 20+ years you'll know that OData is the sort of technology I've always championed: fundamental, non-proprietary, game-changing.

Consider these two "home pages" for Curse of the Voodoo:

Netflix: Curse of the Voodoo ['5vSc') ]

eBay: Curse of the Voodoo ['190556034358') ]

The first is an item in the Netflix catalog. The second is an item on eBay. These two webs of data are navigable in the same way. So here's the synopsis from Netflix:

Synopsis: Director Lindsay Shonteff's campy B-movie thriller centers on big-game hunter Mike Stacey (Bryant Haliday), who finds himself on the wrong side of a tribe of vengeful lion worshippers after he kills a sacred lion during an African hunting expedition. When the effects of the tribe's curse begin to surface after his return to his London home, Mike realizes that his only recourse is to return to Africa and confront the witch doctor who hexed him.

['5vSc')/Synopsis/$value ]

And here's the price of the eBay item:

Price: 1.95.

['190556034358')/CurrentPrice/$value ]

In order to correlate these two pieces of information I've had to learn nothing about the Netflix or eBay APIs. I have simply navigated two webs of data, and done so in a uniform way. Whether by means of OData or some other mechanism, I'll want the webs of data that constitute my personal cloud to work the same way.
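To make the uniformity concrete, here is a minimal Python sketch of the addressing pattern used above: an entity set, a quoted key, a property name, and `$value` for the raw value. The service roots and the entity set names (`Titles`, `Items`) are placeholders I have assumed for illustration; only the keys and property names come from the examples above. The sketch builds URLs and does not fetch anything.

```python
from urllib.parse import quote


def odata_value_url(service_root: str, entity_set: str, key: str, prop: str) -> str:
    """Build the URL of one property's raw value, following OData URL conventions."""
    # String keys are wrapped in single quotes; an embedded quote is doubled.
    escaped_key = quote(key.replace("'", "''"), safe="'")
    segment = "{}('{}')".format(entity_set, escaped_key)
    return "{}/{}/{}/$value".format(service_root.rstrip("/"), segment, prop)


# Hypothetical service roots and entity set names, assumed for this sketch.
NETFLIX_ROOT = "https://netflix.example/catalog"
EBAY_ROOT = "https://ebay.example/odata"

synopsis_url = odata_value_url(NETFLIX_ROOT, "Titles", "5vSc", "Synopsis")
price_url = odata_value_url(EBAY_ROOT, "Items", "190556034358", "CurrentPrice")
```

One function serves both services because nothing in it is specific to Netflix or eBay; only the root, the collection, and the key change.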