With TransparencyCamp last weekend, news of the mandated use of feed syndication by Federal Agencies receiving funds from the Recovery Act, recent blog posts by Tim O’Reilly and the Special Libraries Association, an article in Newsweek, news of Carl Malamud’s bid to become the Public Printer of the United States (aka head of the GPO), and the W3C eGov meeting coming up next week, it looks like issues related to public access to government data (specifically Library of Congress bibliographic and legislative data) are hitting the mainstream media and getting political mind-share. Exciting times.
One thing that bubbled up at code4lib2009 last week was the notion that APIs Suck. Not that web2.0 APIs are wrong or bad…they’re actually great, especially when compared to a world where no machine access to the data existed before. The point is that sometimes just having access to the raw data in the ‘lowest level format’ is the ideal. Rather than service providers trying to guess what you are trying to do with their data, and absorbing the computational responsibility of delivering it, why not make the data readily available using a protocol like HTTP? Put the data in a directory, turn on Indexes, do some sensible caching and maybe gzip compression, and let people grab it and robots crawl it. Or maybe use something like Amazon Public Datasets. It seems like a relatively easy first step that involves very little custom software development, and one with the ability to make a huge impact.
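To give a feel for how little custom software the "directory plus Indexes" approach needs, here is a minimal sketch using only Python's standard library. It is a stand-in for a real web server with directory indexes turned on, not a recommendation for production use. The names CachingHandler and serve_in_background, the one-day max-age value, and the /srv/public-data path in the usage note are all assumptions made up for illustration, and gzip compression is left out.

```python
import functools
import http.server
import threading


class CachingHandler(http.server.SimpleHTTPRequestHandler):
    """Serve files and auto-generated directory listings with a cache header."""

    def end_headers(self):
        # Assumed policy: let clients and proxies cache responses for one day.
        self.send_header("Cache-Control", "public, max-age=86400")
        super().end_headers()


def serve_in_background(data_dir, port=0):
    """Start serving data_dir on localhost in a daemon thread.

    port=0 asks the operating system for any free port; the port actually
    chosen is available as server.server_address[1].
    """
    handler = functools.partial(CachingHandler, directory=data_dir)
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server
```

Calling serve_in_background("/srv/public-data", 8000) would expose that (hypothetical) directory at http://127.0.0.1:8000/, with a browsable listing for any directory that has no index.html, which is roughly what a crawler or a person with wget needs.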
I’m a federal employee, so I really can’t come out and formally advocate directly for political appointments. But I have to say it would be great to see someone like Malamud at the helm of the GPO, since he’s been doing just this kind of work for 20 years. Exciting times.