APIs Suck

With TransparencyCamp last weekend, news of the mandated use of feed syndication by Federal Agencies receiving funds from the Recovery Act, recent blog posts by Tim O’Reilly and the Special Libraries Association, an article in Newsweek, news of Carl Malamud’s bid to become the Public Printer of the United States (aka head of the GPO), and the W3C eGov meeting coming up next week, it looks like issues related to public access to government data (specifically Library of Congress bibliographic and legislative data) are hitting the mainstream media and getting political mind-share. Exciting times.

One thing that bubbled up at code4lib2009 last week was the notion that APIs Suck. Not that web2.0 APIs are wrong or bad…they’re actually great, especially when compared to a world where no machine access to the data existed before. The point is that sometimes just having access to the raw data in the ‘lowest level format’ is the ideal. Rather than service providers trying to guess what you are trying to do with their data, and absorbing the computational responsibility of delivering it, why not make the data readily available using a protocol like HTTP? Put the data in a directory, turn on Indexes, do some sensible caching and maybe gzip compression, and let people grab it and robots crawl it. Or maybe use something like Amazon Public Datasets. It seems like a relatively easy first step, one that involves very little custom software development and has the ability to make a huge impact.
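To make the "directory plus HTTP" idea concrete, here is a rough sketch using only Python's standard library. It is an illustration, not a recommendation for production (a real deployment would more likely use an existing web server with directory indexes enabled). The function names `serve_directory` and `fetch` are made up for this example, and gzip compression is left out. It shows two things: a directory served with auto-generated index pages, and a client that uses `If-Modified-Since` so unchanged files are not downloaded again.

```python
import functools
import http.server
import threading
import urllib.error
import urllib.request


def serve_directory(directory, port=0):
    """Serve `directory` over HTTP with auto-generated index pages.

    Hypothetical helper for illustration. port=0 lets the OS pick a free
    port; the running server object is returned so it can be shut down.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(url, last_modified=None):
    """Fetch `url`, skipping the download if it has not changed.

    Hypothetical helper for illustration. Returns (body, last_modified);
    body is None when the server answers 304 Not Modified.
    """
    request = urllib.request.Request(url)
    if last_modified:
        # Ask the server to send the file only if it changed since then.
        request.add_header("If-Modified-Since", last_modified)
    try:
        with urllib.request.urlopen(request) as response:
            return response.read(), response.headers.get("Last-Modified")
    except urllib.error.HTTPError as error:
        if error.code == 304:
            return None, last_modified
        raise
```

A crawler would call `fetch` on the index page to discover files, keep the `Last-Modified` value returned for each file, and pass it back on the next visit. The publisher writes no custom code beyond dropping files into the directory.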

I’m a federal employee, so I really can’t come out and formally advocate directly for political appointments. But I have to say it would be great to see someone like Malamud at the helm of the GPO, since he’s been doing just this kind of work for 20 years. Exciting times.

2 Comments

  1. I know Anders was being kind of flip when he made this remark (since immediately after it he said that Libris had APIs and that’s what they used), but I also think the sentiment is a little disingenuous. APIs that suck suck. On the flipside, if all we had to work with from any organization were big dumps of data, that would probably suck even more. The notion of having to bootstrap a heap of data into some usable form, just to even see if there’s anything useful in it, seems inefficient and counterproductive.

    I guess my point is, this isn’t an either/or proposition. There are plenty of organizations that provide both an API and their entire data set. Govtrack.us does this nicely, I think. You can pick and choose between entire datasets and a la carte access as your needs determine.

    I readily admit that I’m biased on this. For the last year, the majority of my time has been spent designing an API intended primarily to make all of your data available for reuse. Still, it’s an API. There will be data (personal, borrower data, for example) that people will want to control. At least, I hope people will want to control it.

    But just because it’s an API, doesn’t mean that it’s the same thing as Facebook’s or Worldcat’s API. An API doesn’t inherently have to be restrictive. There are just some organizations that feel less restriction on the use of their data is more disruptive than they feel comfortable with, with regard to their business model.

    That’s a totally different blog post comment, however.

    Saturday, March 7, 2009 at 10:24 am | Permalink
  2. ed wrote:

    Yeah, agreed on it not being either/or. But my main point is that it’s probably easiest to make the data available first in some raw form, and then make the shiny API later–after it’s easier to see how people are using the data.

    I guess the title of my blog post sucks, much more than APIs :-)

    Saturday, March 7, 2009 at 11:34 am | Permalink
