Tangled in the Threads
Jon Udell, January 12, 2000
Simple charting for the Web
Alternate approaches to serving up the numbers
This week, Randy Switt posed a question about charting data for delivery on the Web:
What I have is a set of data (text only, could be formatted in any reasonable way) that is updated about once every 15 minutes. What I need to do is take this data, convert it into a chart (not sure what form right now, but I'm guessing some sort of bar chart or x-y graph) and post it to a web page in such a way that the web page is always up to date. What would be the simplest approach to do this? I figure when XML becomes more widespread this would be trivial, but right now I'm not sure where to start with it. My main web server is IIS4 on NT4, but I can put it on a secondary Linux/Apache server if need be.
Have a look at:
This library is powerful and comes with Perl and Python interfaces. Runs fine on Linux but has not been ported to NT.
Like many open-source and freeware graphics tools, this one -- Bruce Verderaime's GDCHART -- is based on Thomas Boutell's gd, an image-generating library written in C. I hadn't previously heard of GDCHART -- which is evidently very powerful! -- but had used GIFgraph, a Perl module by Martien Verbruggen which relies on GD.pm, Lincoln Stein's Perl interface to gd.
Since Randy's primary Web server is NT-based, I mentioned that GD.pm and GIFgraph.pm are available as pre-built Net-installable packages from the ActiveState site. That means if you have ActiveState Perl installed, you can add these modules over the Net like this:

    c:\perl> ppm
    ppm> install gd
    ppm> install gifgraph
What about LZW?
As soon as I said that, I got to wondering about the PNG (Portable Network Graphics) issue. Since the Unisys patent on LZW compression remains in effect until 2003 (according to the Free Software Foundation's "Why no GIFs?" page), is it kosher to use these image-making tools?
For GDCHART the answer is yes, but the reason is that it uses an older version 1.3 of gd, which employs a patent-free Run Length Encoding algorithm rather than LZW. Says the author of GDCHART in a readme file:
GDCHART uses gd1.3, which is supplied in its entirety. See accompanying text. gd1.3 does NOT use LZW. The result is larger GIF file sizes :-(
GDCHART also works with gd1.2, which employs LZW - small GIF sizes. If you have a LZW license, feel free to use GDCHART with gd1.2. It's sure to be found on the net.
If you don't have a LZW license, use gd1.2 at your own risk!
What about GD.pm and GIFgraph? There's good news on this front. The current version of GD.pm is built on gd 1.6.7, which produces PNG files, not GIF files. And the GIFgraph charting module has been ported, by Steve Bonds, to a PNG-oriented alternate version called PNGgraph.
I found PNGgraph to be a drop-in replacement for GIFgraph. Here's a snippet of code that charts activity on my website:

    use GD;               # font constants such as GD::gdMediumBoldFont
    use PNGgraph::lines;  # line-chart class from PNGgraph

    # @days, @files, and @hosts are filled in elsewhere. The first array
    # holds the x-axis labels; each later array is one data set.
    my @chart_data = ( \@days, \@files, \@hosts, );

    my $chart = new PNGgraph::lines(800,600);

    # Legend entries are matched to the data sets in order.
    $chart->set_legend( 'daily files', 'daily hosts', );

    $chart->set_legend_font(GD::gdMediumBoldFont);
    $chart->set_x_axis_font(GD::gdMediumBoldFont);
    $chart->set_y_axis_font(GD::gdMediumBoldFont);
    $chart->set_x_label_font(GD::gdMediumBoldFont);
    $chart->set_y_label_font(GD::gdMediumBoldFont);
    $chart->set_title_font(GD::gdMediumBoldFont);

    # One line color per data set.
    $chart->set( dclrs => [ qw(blue red) ] );

    $chart->set(
        x_label           => 'days',
        y_label           => 'files, hosts',
        title             => 'udell.roninhouse.com',
        y_max_value       => 1500,
        y_tick_number     => 10,
        x_label_skip      => 1,
        x_labels_vertical => 1,
        x_label_position  => .5,
        line_width        => 3,
        long_ticks        => 1,
        x_ticks           => 0,
        legend_placement  => 'RT',
    );

    $chart->plot_to_png( "stats.png", \@chart_data );
The result is nothing fancy, just a simple (and legal!) line chart.
Finally, Alan Shutko suggested GNUplot, one of whose output formats -- PBM -- can be converted to many formats (including PNG) using Jef Poskanzer's pbmplus.
Fresh data: evaluating the tradeoffs
The discussion then turned to ways of minimizing the work required to keep Randy's chart up-to-date.
Ricardo Banffy suggests two approaches:
1) An applet on the page would request, via HTTP, a URL that would return the data (maybe in XML form or something convenient for parsing). The applet would then generate the graph.
2) A daemon or cron job running on the server would check for updates on the data file and generate a static image from the data. This image would then be part of the page. If the program is called frequently enough, you can consider the data file up-to-date. This technique is used in a couple portals where news articles are pulled from a database and TOCs are generated via the cron job and embedded in the HTML via server-side-includes.
I wouldn't consider generating the image on-the-fly (per request) unless you don't expect anyone to visit the site.
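Ricardo's second approach can be sketched in a few lines. This is only an illustration, written in Python rather than the Perl used elsewhere in this column, and the names `needs_refresh`, `refresh_chart`, and `render` are hypothetical: `render` stands in for whatever charting code (GDCHART, PNGgraph, gnuplot) actually turns the data into image bytes. The sketch assumes the job is run from cron every few minutes and that the data file's modification time changes whenever the data does.

```python
import os
import tempfile


def needs_refresh(data_path, image_path):
    """Return True when the image is missing or older than the data file."""
    if not os.path.exists(image_path):
        return True
    return os.path.getmtime(data_path) > os.path.getmtime(image_path)


def refresh_chart(data_path, image_path, render):
    """Rebuild image_path from data_path if it is stale.

    `render` is a hypothetical callable: it receives the data file's text
    and returns the finished image as bytes. Returns True if the image
    was rewritten, False if it was already up to date.
    """
    if not needs_refresh(data_path, image_path):
        return False

    with open(data_path, "r") as source:
        image_bytes = render(source.read())

    # Write to a temporary file in the same directory and then rename it,
    # so the Web server never hands out a half-written image.
    directory = os.path.dirname(os.path.abspath(image_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(image_bytes)
    os.chmod(tmp_path, 0o644)  # mkstemp files are private by default
    os.replace(tmp_path, image_path)
    return True
```

Because the job does nothing when the data file hasn't changed, it is cheap to schedule more often than the 15-minute update cycle Randy describes.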
Alain Touissant likes the second approach, but not the first:
You're trading computing power for bandwidth. I don't think that's a good solution, since computing power is less expensive than bandwidth and not everyone enables Java in their browser -- including me, since Java-enabled pages often crash Netscape.
There are, of course, caching issues to contend with, as Dave Caplinger points out:
Will the browser actually fetch the new image or display the one it's already got in local cache? What happens if a caching proxy is between the web server and the user viewing the chart?
To solve this problem, Gavin Brelstaff recommends using the NO-CACHE header, like this:

    <HTML>
    <HEAD>
    <META HTTP-EQUIV="Pragma" CONTENT="no-cache">
    <META HTTP-EQUIV="Refresh" CONTENT="900; URL=http://www.myServer.com/graphPage.html">
    </HEAD>
    <BODY>
    <IMG SRC="graph.gif">
    </BODY>
    </HTML>
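One caveat: META tags live inside the HTML page, so they say nothing about the image itself, and a caching proxy generally looks at HTTP response headers rather than parsing HTML. If you control how the image is served, a complementary option is to send real caching headers with it. The following Python sketch is an assumption on my part, not something from the thread; `chart_cache_headers` is a hypothetical helper, and the 900-second lifetime simply mirrors the refresh interval in Gavin's example.

```python
import time
from email.utils import formatdate


def chart_cache_headers(last_modified, max_age=900, now=None):
    """Build HTTP response headers for a chart that changes every max_age seconds.

    last_modified and now are Unix timestamps; now defaults to the
    current time. Dates are formatted in the GMT form HTTP expects.
    """
    now = time.time() if now is None else now
    return {
        # Lets browsers and proxies revalidate with If-Modified-Since.
        "Last-Modified": formatdate(last_modified, usegmt=True),
        # Understood by older HTTP/1.0 caches.
        "Expires": formatdate(now + max_age, usegmt=True),
        # The HTTP/1.1 equivalent; stale copies must be rechecked.
        "Cache-Control": "max-age=%d, must-revalidate" % max_age,
    }
```

How you attach these headers depends on the server: a CGI script can print them directly, while static files usually need server configuration.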
But do you really need GIFs or PNGs?
An alternative approach, as both Gavin Brelstaff and Dave Caplinger pointed out, is to dispense entirely with the delivery of raster graphics. SVG (scalable vector graphics) is the forward-looking solution. This exciting new technology combines XML, CSS (Cascading Style Sheets), and a scriptable DOM (Document Object Model) with a vocabulary of 2D graphic commands. It'll be wonderful, I'm sure, but SVG is, at the moment, highly experimental, and it'll be a while before you can expect to deliver SVG-based graphics into a typical browser.
Dave Caplinger suggests, instead, a simple, here-and-now, HTML-only approach:
There's no reason you couldn't make two single-pixel GIFs (one for the bar color and one for the "blank" color) and piece together an HTML table using awk, or if you wanted to get fancy, perl. If sorting is important or if the data needs to be massaged more (or even plucked directly from a database) then perl's probably your best bet, but the idea would still be the same:
- Get the data somehow.
- Sort the data the way you want it.
- Loop through the sorted data, building the HTML table.
Dave attached a nice example of this technique to his posting.
Current disk quota usage by team
(updated every hour)
Production Volume (/dev/vx/rdsk/production/vol01:)
MB Max       MB Used      % Used   Team
175104.00    101141.60    57.8%    retail
 97282.64     77483.49    79.6%    pub
 56320.00     43893.14    77.9%    meroma
 20480.00     18341.16    89.6%    comm
 76800.00     11460.41    14.9%    book
  5120.00       794.23
Total: 431106.64 MBytes (58.7% in use)
The bars aren't three-dimensional, to be sure, but what would that add here? This is an attractive and useful information display which is easy to produce and deliver. Since it's HTML, the datapoints can trivially be linked to drill-down reports to create powerful effects that would be challenging to achieve in raster space. In fact, you don't even need to use spacer GIFs -- the row template for this example looks like this:

    <tr>
      <td align=right> 175104.00</td>
      <td align=right> 101141.60</td>
      <td>
        <table cellpadding=0 cellspacing=0 border=0>
          <tr>
            <td bgcolor='#003366' width=115>&nbsp;</td>
            <td bgcolor='#CCCCCC' width=84>&nbsp;</td>
          </tr>
        </table>
      </td>
      <td align=right> 57.8%</td>
      <td>retail</td>
    </tr>
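Dave's three steps (get the data, sort it, loop to build the table) can be sketched briefly. This is Python rather than the awk or Perl he mentions, and it is not his script. The names `bar_row` and `build_table` are hypothetical, the 199-pixel bar width is inferred from the 115 + 84 split in the row template, and sorting by MB used is a guess based on the order of the sample rows.

```python
from html import escape

BAR_WIDTH = 199  # assumption: 115 + 84 pixels, as in the row template


def bar_row(max_mb, used_mb, team, bar_width=BAR_WIDTH):
    """Return one <tr> whose nested table draws a two-color bar."""
    fraction = used_mb / max_mb if max_mb else 0.0
    # Clamp so an over-quota team cannot produce a negative blank cell.
    filled = min(bar_width, max(0, round(fraction * bar_width)))
    empty = bar_width - filled
    return (
        "<tr>"
        "<td align=right>%.2f</td>"
        "<td align=right>%.2f</td>"
        "<td><table cellpadding=0 cellspacing=0 border=0><tr>"
        "<td bgcolor='#003366' width=%d>&nbsp;</td>"
        "<td bgcolor='#CCCCCC' width=%d>&nbsp;</td>"
        "</tr></table></td>"
        "<td align=right>%.1f%%</td>"
        "<td>%s</td>"
        "</tr>"
    ) % (max_mb, used_mb, filled, empty, fraction * 100, escape(team))


def build_table(rows):
    """rows is a list of (max_mb, used_mb, team) tuples."""
    # Step 2: sort the data, here by MB used, largest first.
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    # Step 3: loop through the sorted data, building the HTML table.
    body = "\n".join(bar_row(*row) for row in ordered)
    return "<table>\n%s\n</table>" % body


if __name__ == "__main__":
    # Step 1: get the data somehow; two rows from the example above.
    print(build_table([
        (76800.00, 11460.41, "book"),
        (175104.00, 101141.60, "retail"),
    ]))
```

Making each team name a link to a drill-down report only takes wrapping the last cell's contents in an anchor tag.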
To skinny up the bars, you can wrap <font size="-n"> tags around the non-breaking spaces. Of course this pales in comparison to what we all hope SVG will soon deliver. But with Perl, HTML, and some imagination, you don't have to wait for that day to arrive to do simple but effective data visualization on the Web.
Jon Udell (http://udell.roninhouse.com/) was BYTE Magazine's executive editor for new media, the architect of the original www.byte.com, and author of BYTE's Web Project column. He's now an independent Web/Internet consultant, and is the author of Practical Internet Groupware, from O'Reilly and Associates. His recent BYTE.com columns are archived at http://www.byte.com/index/threads
This work is licensed under a Creative Commons License.