Gmail lockdown in sector 4

Note: Hallelujah and mea culpa, things aren't as gloomy as the picture I painted here. I've appended an update below.

The other day I showed my old pal Rob Mitchell how I use Gmail. He was intrigued by its search, tagging, and desktop independence, but couldn't see how I'd allow myself to get locked into an unsupported service. I explained that I'm not locked in, and we went through the setup. My home server is the New Hampshire ISP I've been using for many years. My InfoWorld address redirects there, and now my Gmail address does too. Messages sent from Gmail bear my home email address, so replies go through my home server, which then forwards them on to Gmail. Although I don't use Outlook any more, it continues to run on my desktop PC, fetching backup copies of all the messages that hit my primary server.

At this point in the explanation, though, I remembered the fly in the ointment, and Rob spotted it right away. Messages that I send from Gmail aren't routed through my home server, and aren't archived locally. I've always realized that, of course. One solution would have been to cc my home server on outbound messages, but that's a drag. So instead I looked for, and found, a way to archive my Gmail.

Early pioneers in the Gmail API realm included Johnvey Hwang and Adrian Holovaty. Following their trail I found libgmail, an elegant Python library that makes it a snap to do all kinds of stuff with your Gmail account, including adding contacts and retrieving messages by folder or label or query.

Once I proved to my own satisfaction that I could use libgmail to mirror my Sent Mail folder, I started using Gmail in a serious way. But, though I intended to automate that mirroring process, I never got around to it. Oops.

By the time Rob reminded me that I'd screwed up, there were a lot of messages to download. And I had a pretty good idea what would happen when I tried. Sure enough, about a tenth of the way through the process, I triggered the dreaded lockdown in sector 4. As those who have been there know, the message you receive (on your primary email account) reads in part:

Our system has detected abnormal usage of your Gmail account. As a result, we have temporarily disabled access to this account.

It will take between one minute and 24 hours for you to regain access, depending on the behavior our system detected.

As infuriating as this is, I know where they're coming from. When I helped design an online book service we wrestled with the same issue: how do you distinguish acceptable interactive use from unacceptable robotic use? There's no good solution. You measure quantity, you measure rate, you look at patterns, and you draw the line somewhere, but it's arbitrary.
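To make "measure rate and draw the line somewhere" concrete, here is a minimal sketch of one common approach, a sliding-window counter. It is purely illustrative: the class name, the limit, and the window length are all made up, and nothing here reflects how Google (or the book service) actually decides.

```python
from collections import deque


class UsageMonitor:
    """Hypothetical sliding-window check: too many requests in a window looks robotic."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests      # the arbitrary line
        self.window_seconds = window_seconds  # how far back we look
        self.times = deque()                  # timestamps of recent requests

    def allow(self, now):
        """Record a request at time `now` (in seconds); return False if over the line."""
        # Forget requests that have aged out of the window.
        while self.times and now - self.times[0] >= self.window_seconds:
            self.times.popleft()
        self.times.append(now)
        return len(self.times) <= self.max_requests
```

A person clicking through a mailbox rarely trips a check like this, while a script pulling hundreds of messages back to back trips it quickly. Wherever you set the two numbers, somebody legitimate lands on the wrong side.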

In my case, access was restored in about two hours. I've throttled back my archiver, and it looks like a slow scan will do the job, but before I proceed I thought I'd pose two questions. First, if somebody out there has already worked out the lockdown algorithm, can you share the parameters?
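For the curious, a throttled archiver can be sketched roughly like this. It is not my actual script, and it deliberately doesn't use libgmail's API: fetch_ids and fetch_raw are hypothetical callables standing in for whatever library talks to the service, and the delay values are guesses, not known-safe parameters.

```python
import mailbox
import time


def archive_slowly(fetch_ids, fetch_raw, mbox_path, delay_seconds=5.0,
                   batch_size=20, batch_pause=60.0, sleep=time.sleep):
    """Append messages to an mbox file, pausing after every request.

    fetch_ids() yields message identifiers and fetch_raw(msg_id) returns the
    raw RFC 2822 text of one message. Both are hypothetical placeholders.
    """
    box = mailbox.mbox(mbox_path)
    saved = 0
    try:
        for count, msg_id in enumerate(fetch_ids(), start=1):
            raw = fetch_raw(msg_id)            # one request to the service
            box.add(mailbox.mboxMessage(raw))
            box.flush()                        # keep what we have if we get locked out
            saved += 1
            # Longer pause after each full batch, a short pause otherwise.
            sleep(batch_pause if count % batch_size == 0 else delay_seconds)
    finally:
        box.close()
    return saved
```

Flushing after every message means a lockdown partway through costs you nothing already downloaded. The sleep function is a parameter only so the pacing can be checked without actually waiting.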

Second, and more importantly, why can't Google just let me archive my mail without all these shenanigans? To tell you the truth, I'd pay something for that feature. I've used lots of email software over the years, and for me Gmail is by far the most productive. But as Rob rightly reminded me, you can't trust any service with the only copy of your data.

Today, while looking for a way to view the mbox-format archives I've been downloading, I finally got around to trying Thunderbird, the latest incarnation of the Netscape mailer. It's a beautiful piece of work! But the vintage fat-client three-pane mailreader feels so over. I want to live in the cloud. And when I can't contact the cloud, or when Zeus decides to smite sector 4, I need a workable fallback. I'll solve this for myself, but it's silly -- not to mention inefficient -- to have to do it the hard way.

Remember rule #1 for next-generation infoware: don't lock in my data.


Update: Kenneth Bowen wrote to say, in effect, "Dude, what's the problem, POP3 via fetchmail is working fine for me." No kidding? I was always under the impression that POP clients retrieved inbound mail only, not sent mail. And that impression was reinforced when, after Google turned on POP, I tried downloading that way. No outbound messages in the download. I tried again today; same deal.

But those were attempts to download the whole archive. If I switch to the "Enable POP only for mail that arrives from now on" setting, both newly-received and newly-sent messages appear in the POP stream. Kenneth Bowen pointed me to a help page that suggests outbound messages download only once via POP. Did earlier POP runs mark some subset of earlier sent messages such that I can't fetch them again via POP?

My conclusion so far is that I'll still need to use the libgmail method to gather the old stuff, but I can use POP to fetch all new messages, albeit once only in the case of sent messages. I could then move the fetched archive around if I needed it in multiple places, or else use the libgmail method to directly refresh a secondary archive.
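Kenneth uses fetchmail for the POP side. For anyone who would rather script it, here is a rough equivalent using Python's standard poplib and mailbox modules. It's a sketch under assumptions: the host and port defaults in connect() are my assumption about Gmail's POP endpoint, both function names are invented for illustration, and skipping duplicates by Message-ID is my own precaution, not something the help page prescribes.

```python
import email
import mailbox
import poplib


def connect(user, password, host="pop.gmail.com", port=995):
    """Open an authenticated POP3-over-SSL session (host and port are assumptions)."""
    conn = poplib.POP3_SSL(host, port)
    conn.user(user)
    conn.pass_(password)
    return conn


def pop_to_mbox(conn, mbox_path):
    """Append every message the server offers to an mbox, skipping known Message-IDs."""
    box = mailbox.mbox(mbox_path)
    # Message-IDs already in the archive, so a re-run doesn't store duplicates.
    seen = {m["Message-ID"] for m in box if m["Message-ID"]}
    added = 0
    try:
        _resp, listing, _octets = conn.list()
        for entry in listing:
            number = int(entry.split()[0])     # entries look like b"1 2048"
            _resp, lines, _octets = conn.retr(number)
            msg = email.message_from_bytes(b"\n".join(lines))
            msg_id = msg["Message-ID"]
            if msg_id and msg_id in seen:
                continue
            box.add(msg)
            if msg_id:
                seen.add(msg_id)
            added += 1
        box.flush()
    finally:
        box.close()
    return added
```

Since sent messages apparently come down the POP stream only once, the important thing is to write each one to disk on the run that receives it. Calling pop_to_mbox(connect(user, password), "gmail.mbox") followed by conn.quit() would be the whole job.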

So it's not as bad as I first thought. From my perspective, with a backlog of stuff from before POP was turned on, things are a bit gnarly. But from Rob's perspective, as a prospective new Gmail user who may not need multiple archives, the POP solution should be OK.


Former URL: http://weblog.infoworld.com/udell/2006/01/13.html#a1370