It's been a while since I took a look at my own browser stats. So long, in fact, that the term is really obsolete, given the rise of the RSS newsreader. We might as well just call the things that fetch web pages what they technically are: user agents. Anyway, I started by looking for a comprehensive list of user-agent signatures, and found a promising candidate at PGTS. (Got a better one? Let me know.) Their compilation of about 6600 user-agent strings seemed reasonably current. I ran yesterday's roughly 55000 log entries for this blog through it and got this (user agent, number of log entries, percent of total):
unclassified | 30085 | 54.350 |
MSIE | 17554 | 31.712 |
Mozilla | 3852 | 6.959 |
Safari | 1380 | 2.493 |
Netscape | 842 | 1.521 |
Opera | 611 | 1.104 |
Galeon | 433 | 0.782 |
Konqueror | 170 | 0.307 |
Python-urllib | 170 | 0.307 |
Java | 82 | 0.148 |
Powermarks | 52 | 0.094 |
Lynx | 38 | 0.069 |
Crazy Browser | 18 | 0.033 |
iCab | 15 | 0.027 |
OmniWeb | 14 | 0.025 |
PHP | 14 | 0.025 |
lwp-trivial | 13 | 0.023 |
Wget | 8 | 0.014 |
CFNetwork | 2 | 0.004 |
Download Ninja | 1 | 0.002 |
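A tally like this one can be reproduced with a short script. What follows is a minimal sketch, not the script behind the table. It assumes the log is in the common "combined" format, where the user-agent is the last double-quoted field, and it uses a tiny hypothetical `SIGNATURES` list in place of the real PGTS compilation, whose format isn't shown here.

```python
from collections import Counter

# Hypothetical signature list: (substring to look for, label to report).
# Order matters: Opera and Safari agents also contain "MSIE" or "Gecko".
SIGNATURES = [
    ("Opera", "Opera"),
    ("Safari", "Safari"),
    ("MSIE", "MSIE"),
    ("Gecko", "Mozilla"),
]


def user_agent(log_line):
    """Return the last double-quoted field of a combined-format log line."""
    parts = log_line.rstrip().rsplit('"', 2)
    return parts[1] if len(parts) == 3 else ""


def classify(agent, signatures=SIGNATURES):
    """Return the label of the first signature found in the agent string."""
    for needle, label in signatures:
        if needle in agent:
            return label
    return "unclassified"


def tally(log_lines):
    """Count log entries per label; return (label, count, percent) rows."""
    counts = Counter(classify(user_agent(line)) for line in log_lines)
    total = sum(counts.values())
    return [(label, n, round(100.0 * n / total, 3))
            for label, n in counts.most_common()]
```

Calling `tally(open("access.log"))` (the filename is a placeholder) yields rows sorted by count, largest first. Anything the signature list doesn't recognize lands in "unclassified".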
Clearly that unclassified category wants to be unpacked. So I scanned the log for user-agent names, producing a list like this:
amaya/5.1 aolbrowser/1.0 curl/7.7.1 curl/7.9.8 gazz/2.1 gnome-vfs/1.0.1 iCab/2.8 iCab/2.9 Mozilla/4.5 iCab/2.9.1
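A scan of this kind can be sketched with a regular expression that picks out `name/version` tokens. The pattern below is an illustrative guess at what counts as a token, not the expression actually used for this post, and `name_tokens` is a made-up helper name.

```python
import re

# A product token looks like name/version, e.g. curl/7.9.8 or gnome-vfs/1.0.1.
# Illustrative pattern only: a letter, then name characters, a slash, a digit,
# then the rest of the version.
TOKEN = re.compile(r"[A-Za-z][\w.:-]*/\d[\w.]*")


def name_tokens(agents):
    """Collect the distinct name/version tokens found in agent strings."""
    found = set()
    for agent in agents:
        found.update(TOKEN.findall(agent))
    # Sort without regard to case so iCab and curl interleave sensibly.
    return sorted(found, key=str.lower)
```

Fed the user-agent strings from a log, this returns a sorted, duplicate-free list of tokens in the same shape as the sample above.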
I threw away the versions, deduped, and scanned my log entries again, giving preference to the PGTS list (bolded in the tables) but then falling back to my secondary names (italicized in the tables).
Of the many interesting points that could be drawn from this data, I'll just focus on one for now. Browsers whose names begin with "Mozilla" make up more than a third of what was the unclassified category (11052 of 30085 entries). Those plus the Mozillas recognized by the PGTS list add up to about 27%, versus MSIE's 32%. Meanwhile, as I showed yesterday, Mozilla has become a platform that can support a rather interesting XML application -- a specialized information viewer, with its own built-in structured search engine -- on Windows, Mac, and Linux.
Having reached this point after long struggle, will the Mozilla project now find a sponsor worthy of its ambition? I hope so.
Here's the revised table:
MSIE | 17554 | 31.712 |
Mozilla | 11052 | 19.966 |
NetNewsWire | 4339 | 7.839 |
Mozilla | 3852 | 6.959 |
SharpReader | 2998 | 5.416 |
Radio | 2364 | 4.271 |
Safari | 1380 | 2.493 |
Feedreader | 1123 | 2.029 |
NewsGator | 1114 | 2.013 |
Wildgrape | 924 | 1.669 |
Netscape | 842 | 1.521 |
Syndirella | 673 | 1.216 |
Opera | 611 | 1.104 |
Web | 581 | 1.050 |
RssBandit | 554 | 1.001 |
Java | 479 | 0.865 |
Galeon | 433 | 0.782 |
unclassified | 377 | 0.681 |
nntp | 340 | 0.614 |
AmphetaDesk | 287 | 0.518 |
curl | 220 | 0.397 |
LWP::Simple | 218 | 0.394 |
Konqueror | 170 | 0.307 |
Python-urllib | 170 | 0.307 |
clevercactus | 150 | 0.271 |
Hep | 133 | 0.240 |
Soup | 130 | 0.235 |
gnome-vfs | 129 | 0.233 |
PHP | 107 | 0.193 |
Wget | 106 | 0.191 |
Python-urllib | 100 | 0.181 |
SwitchCrawler | 94 | 0.170 |
Genecast | 86 | 0.155 |
Java | 82 | 0.148 |
Hapax | 78 | 0.141 |
Broked | 72 | 0.130 |
Straw | 59 | 0.107 |
http://www.almaden.ibm.com/cs/crawler | 55 | 0.099 |
blagg | 54 | 0.098 |
libwww-perl | 53 | 0.096 |
Powermarks | 52 | 0.094 |
PostNuke: | 49 | 0.089 |
Syndic8 | 48 | 0.087 |
Hatena | 41 | 0.074 |
Googlebot | 39 | 0.070 |
Lynx | 38 | 0.069 |
NIF | 37 | 0.067 |
Awasu | 36 | 0.065 |
Scooter | 34 | 0.061 |
rssSearch | 33 | 0.060 |
Frontier | 31 | 0.056 |
MagpieRSS | 30 | 0.054 |
MovableType | 30 | 0.054 |
Opera | 30 | 0.054 |
Channel | 30 | 0.054 |
Aggie | 28 | 0.051 |
Zao | 28 | 0.051 |
CFMX | 24 | 0.043 |
ia_archiver | 24 | 0.043 |
spnlib | 24 | 0.043 |
KNewsTicker | 24 | 0.043 |
Edu_RSS | 24 | 0.043 |
XSA | 24 | 0.043 |
servalBlagg.py | 23 | 0.042 |
mt-rssfeed | 21 | 0.038 |
Twisted | 21 | 0.038 |
OpenTextSiteCrawler | 19 | 0.034 |
Dual | 19 | 0.034 |
Crazy Browser | 18 | 0.033 |
ScoopRDF | 16 | 0.029 |
timboBot | 16 | 0.029 |
iCab | 15 | 0.027 |
OmniWeb | 14 | 0.025 |
PHP | 14 | 0.025 |
ActiveRefresh | 14 | 0.025 |
lwp-trivial | 13 | 0.023 |
Popdexter | 12 | 0.022 |
larbin_2.6.2 | 12 | 0.022 |
QuepasaCreep | 11 | 0.020 |
FeedDemon | 11 | 0.020 |
MyHeadlines | 11 | 0.020 |
IdeaLibHttp | 10 | 0.018 |
Fresh | 9 | 0.016 |
ovidiubot | 8 | 0.014 |
RSSMirandaPlugin | 8 | 0.014 |
Browser | 8 | 0.014 |
lwp-trivial | 8 | 0.014 |
Wget | 8 | 0.014 |
effnews | 8 | 0.014 |
janes-blogosphere | 7 | 0.013 |
FAST-WebCrawler | 6 | 0.011 |
RPT-HTTPClient | 6 | 0.011 |
Microsoft | 5 | 0.009 |
FeedOnFeeds | 5 | 0.009 |
vw-http | 4 | 0.007 |
Gazette | 4 | 0.007 |
vspider | 4 | 0.007 |
eCatch | 4 | 0.007 |
synerge | 4 | 0.007 |
httpSocket | 3 | 0.005 |
(no name) | 3 | 0.005 |
Feedster | 3 | 0.005 |
Plucker | 3 | 0.005 |
DMonitor | 3 | 0.005 |
MobiPocket | 2 | 0.004 |
grimp: | 2 | 0.004 |
NPBot | 2 | 0.004 |
The | 2 | 0.004 |
ColdFusion | 2 | 0.004 |
MnogoSearch | 2 | 0.004 |
ASPseek | 2 | 0.004 |
iSiloX | 2 | 0.004 |
EbiNess | 2 | 0.004 |
linkhype.com | 2 | 0.004 |
MiracleAlphaTest | 2 | 0.004 |
LinkWalker | 2 | 0.004 |
CFNetwork | 2 | 0.004 |
SURF | 1 | 0.002 |
InfoMinder | 1 | 0.002 |
PocketFeed | 1 | 0.002 |
Watchfire | 1 | 0.002 |
daypopbot | 1 | 0.002 |
htdig | 1 | 0.002 |
Blogosphere | 1 | 0.002 |
Internet | 1 | 0.002 |
Download Ninja | 1 | 0.002 |
lachesis | 1 | 0.002 |
Calzilla | 1 | 0.002 |
Openbot | 1 | 0.002 |
LinkScan | 1 | 0.002 |
FlickBot | 1 | 0.002 |
BlogBot | 1 | 0.002 |
MSProxy | 1 | 0.002 |
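The two-pass matching behind the revised table (primary list first, harvested names second) could look something like the sketch below. `PRIMARY` and `FALLBACK` are small hypothetical stand-ins for the PGTS list and the secondary names, and matching a fallback name against the start of the agent string is an assumption, not a documented detail of the original scan.

```python
from collections import Counter

# Hypothetical stand-ins for the two lists: PRIMARY plays the role of the
# PGTS compilation, FALLBACK the names harvested from the log itself.
PRIMARY = ["MSIE", "Safari", "Opera"]
FALLBACK = ["NetNewsWire", "SharpReader", "curl", "Mozilla"]


def strip_version(token):
    """Drop everything from the first slash: 'curl/7.9.8' -> 'curl'."""
    return token.split("/", 1)[0]


def dedupe(names):
    """Remove repeats while keeping first-seen order."""
    return list(dict.fromkeys(names))


def classify(agent, primary=PRIMARY, fallback=FALLBACK):
    """Try the primary list first, then the fallback names."""
    for name in primary:
        if name in agent:
            return name, "primary"
    for name in fallback:
        # Assumption: a fallback name must appear at the start of the agent.
        if agent.startswith(name):
            return name, "fallback"
    return "unclassified", None


def tally(agents):
    """Count agents per (label, source) pair, most frequent first."""
    return Counter(classify(a) for a in agents).most_common()
```

Keeping the source alongside the label means one name can be counted under both lists, which would explain why Mozilla, Opera, Java, PHP, and a few others show up twice in the table.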
Former URL: http://weblog.infoworld.com/udell/2003/06/04.html#a712