Heads, decks, and leads: revisited

In his essay Birth of the NewsMaster, Robin Good writes:

"I have seen and heard of people subscribing to hundreds if not to thousands of feeds inside their RSS aggregators.
Is that manageable? Do these people get better and more information than everyone else?
It is not. They don't."

Information architecture is one of my abiding passions. Designing an information display that can be efficiently scanned is something I've thought a whole lot about. So I'm particularly keen to understand why some people report being overwhelmed by too much RSS input, while others say they're able to process lots of it effectively.

Yesterday, for example, Steve Gillmor told me that he's feeling overwhelmed by thousands of unread items in NetNewsWire. Yet I never feel that way. I suspect that's because I'm reading in batches of 100 (in the Radio UserLand feedreader). I scan each batch quickly. Although opinions differ as to whether or not a feed should be truncated, my stance (which I'm reversing today) has been that truncation is a useful way to achieve the effect you get when scanning the left column of the Wall Street Journal's front page. Of the 100 items, I'll typically only want to read several. I open them into new Mozilla tabs, then go back and read them. Everybody's different, but for me -- and given how newspapers work, I suspect for many others too -- it's useful to separate the acts of scanning and reading. When I'm done with the batch, I click once to delete all 100 items.

As a user of NetNewsWire Lite, I don't have access to the combined view that enables items to be processed in batches rather than individually. The example screenshot suggests that a per-channel interaction is still required. However, I suspect that when Combined View is used in conjunction with Show Aggregated New Items, you can see -- and process -- everything at once. (If I've got that wrong, I'm sure Brent will clarify.)

If Steve and I have the same batch-processing capability, why do we feel so differently about the overload problem? Maybe because the capability is not really the same. If I'm right about NNW's Combined View / Show Aggregated New Items, the difference may boil down to this: my aggregated view delivers batches of 100, whereas Steve's delivers either small per-channel batches, or very large all-channel batches. In other words, I'm seeing what roughly corresponds to a Wall Street Journal news summary, whereas Steve is seeing what roughly corresponds to a version of that page that is 5x or 10x bigger. (If I've got that wrong, I'm sure Steve will clarify.)

Either way, the content is an awkward mixture of truncated and full items. Both modes are useful, but they serve different purposes and they mix badly. Truncation is necessary for the Wall Street Journal effect, though where and how to truncate is a tricky question that I've just now changed my mind about. And of course you need the full view at some point, so you can actually read stuff.

Currently I provide two versions of my feed: truncated and full. And the truncated feed is intelligently truncated. Using a callback that Dave Winer added to Radio UserLand a couple of years ago, I select the first HTML paragraph (<p>) element. Knowing that this will happen, I put some thought into what that element will contain when I'm writing an item. In effect, the first paragraph element is the lead, or blurb. Sometimes it's just a plain paragraph. But sometimes it will contain an image, or a quotation, when these are appropriate and useful hooks. This query, which shows the first paragraphs from all my January items, illustrates some of the variation. The fact that I can issue this query against my untruncated feed shows that my truncated feed is really not necessary. What is necessary, or at any rate useful, is the extra bit of preparation, i.e. thinking about what goes into that first HTML paragraph.
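To make the first-paragraph idea concrete, here is a minimal sketch in Python. It is not the Radio UserLand callback. The function name lead_of is hypothetical, and the sketch assumes the item body is already well-formed XHTML (so an entity like &nbsp; or an unclosed tag would make the parse fail).

```python
import xml.etree.ElementTree as ET


def lead_of(item_xhtml: str) -> str:
    """Return the markup of the first <p> element in an item, or "" if none.

    Assumption: item_xhtml is a well-formed XHTML fragment.
    """
    # Wrap the fragment so it has a single root element and can be parsed.
    root = ET.fromstring(f"<div>{item_xhtml}</div>")
    # find() returns the first match in document order.
    first_p = root.find(".//p")
    if first_p is None:
        return ""
    # Drop any text that trails the element so only the paragraph is kept.
    first_p.tail = None
    return ET.tostring(first_p, encoding="unicode")


if __name__ == "__main__":
    item = "<p>Lead with <em>emphasis</em>.</p><p>The rest of the item.</p>"
    print(lead_of(item))
```

Because the whole element is kept, an image or quotation placed inside that first paragraph survives intact, which is the point of preparing it as the lead.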

Unfortunately, all my careful preparation has mostly been wasted so far. When you process large batches of feeds, some of which use intelligent truncation, some of which use dumb truncation (i.e., just grab the first 250 characters and slap on an ellipsis), and some of which use no truncation, the result is kind of a mess.
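For contrast, dumb truncation as described above can be sketched in a few lines of Python. The function name dumb_truncate and the limit parameter are hypothetical, and real feed generators may differ in the details. The sketch shows why the result is messy: the cut ignores both sentence boundaries and markup.

```python
def dumb_truncate(text: str, limit: int = 250) -> str:
    """Keep the first `limit` characters and slap on an ellipsis."""
    if len(text) <= limit:
        return text
    return text[:limit] + "..."


if __name__ == "__main__":
    html = '<p>See <a href="http://example.com/">this</a></p>'
    # A small limit makes the problem easy to see: the cut lands inside a tag.
    print(dumb_truncate(html, 12))
```

With a limit of 12, the example prints `<p>See <a hr...`, a fragment that is no longer valid markup.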

All along, I've had the idea that feedreaders should be able to smooth out these differences. If you wanted a Wall Street Journal view across all your feeds, you could get one. And if you wanted a full-content view across all your feeds, you could get that too.

Playing around with my queryable feed database today, I realized we're within shouting distance of making that happen. And I'm reversing my former stance on truncation. Here is a Wall Street Journal view of all of my feeds so far today. And here is a full-content view of all of my feeds so far today. It includes this long item I'm now writing, which shows how a mixture of truncated and untruncated content is optimal for neither scanning nor reading.

Here are my conclusions:

Nobody needs to truncate feeds in order to enable front-page views (although some will still want to in order to drive traffic to websites).
Everybody's content should be HTML (if not XHTML).
Authors should think of the first HTML element (normally a paragraph, but it could be a list or a blockquote or something else) as special: the lead, or deck, that will appear in a front-page view.
Feedreaders should XHTML-ize what they read.
Feedreaders should then offer a front-page view (e.g., just the first HTML element found in each item) as well as a full-content view.
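As a rough illustration of the last two conclusions, a feedreader could derive both views from the same untruncated items. This Python sketch rests on several assumptions: the names first_element and render and the title/body item shape are hypothetical, and the XHTML-izing step is taken as already done, so each body is expected to parse as well-formed XHTML.

```python
import xml.etree.ElementTree as ET


def first_element(item_xhtml: str) -> str:
    """Return the first top-level element of an item, whatever its type."""
    root = ET.fromstring(f"<div>{item_xhtml}</div>")
    children = list(root)
    if not children:
        # No markup at all: the whole body is the lead.
        return item_xhtml
    lead = children[0]
    lead.tail = None  # do not carry along text that follows the element
    return ET.tostring(lead, encoding="unicode")


def render(items, view="front"):
    """Return (title, content) pairs for a front-page or full-content view."""
    if view == "full":
        return [(item["title"], item["body"]) for item in items]
    return [(item["title"], first_element(item["body"])) for item in items]


if __name__ == "__main__":
    items = [
        {"title": "A", "body": "<blockquote>Quoted lead</blockquote><p>More.</p>"},
        {"title": "B", "body": "<p>Only paragraph.</p>"},
    ]
    for title, content in render(items, view="front"):
        print(title, content)
```

The same list of items feeds both views, so no separate truncated feed is needed. Only the rendering choice differs.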

By the way, in case it isn't obvious, the RSS/Atom controversy is irrelevant to this discussion. In both environments, the same principles could be applied in exactly the same ways, for exactly the same reasons.

Former URL: http://weblog.infoworld.com/udell/2004/02/21.html#a924