So code4lib2009 was a whole lot of fun. The amazing thing about the conference isn’t really reflected in the program of talks. I feel like I can say that since I gave one of them.

The real value is the social space and the time to talk to people you’ve seen online, throw around ideas, get background/contextual information on projects, etc. Hats off to Jean Rainwater and Birkin Diana for picking a beautifully casual and intimate hotel to hold the conference in.

It’s taken me a few days to get some perspective on all that happened. In the meantime I’ve read a few accounts that capture important aspects of the event from: Terry Reese, Jon Phipps, Jay Luker, Declan Fleming, Richard Wallis (1,2,3), Dan Chudnov, Gabe Farrell.

The Linked Data Pre-conference was quite valuable. For one, it gave attendees some experience in what it means to publish data in a distributed way, and to write code to aggregate it, using an attendees/FOAF experiment. Mike Giarlo aptly surmised from this that the key points for teaching beginners about linked data are that:

  1. Validators are essential
  2. You are not your FOAF

In other words:

  1. Am I doing this rdf/xml, turtle, rdfa right?
  2. ZOMG, httpRange-14!
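
To give a flavor of the aggregation side of that experiment, here is a rough sketch. It is not the crawler we actually ran at the pre-conf: the `aggregate_foaf` function, the two sample documents, and the `example.org` URIs are all made up for illustration. It also assumes each FOAF file is plain RDF/XML that uses `foaf:Person` elements with `rdf:about`, and it leans on Python's standard XML parser rather than a real RDF library, so it will miss the many other ways RDF/XML can say the same thing. Note that each person is keyed by a `#me` URI rather than by the URL of the document describing them, which is the "you are not your FOAF" point in miniature.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"

# Hypothetical FOAF documents, standing in for files fetched from attendees' sites.
ALICE_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://example.org/alice#me">
    <foaf:name>Alice Example</foaf:name>
    <foaf:knows rdf:resource="http://example.org/bob#me"/>
  </foaf:Person>
</rdf:RDF>"""

BOB_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://example.org/bob#me">
    <foaf:name>Bob Example</foaf:name>
    <foaf:knows rdf:resource="http://example.org/alice#me"/>
  </foaf:Person>
</rdf:RDF>"""


def aggregate_foaf(documents):
    """Merge foaf:Person descriptions from several RDF/XML strings.

    Returns a dict mapping each person's URI to their names and the
    URIs of the people they say they know.
    """
    people = {}
    for xml_text in documents:
        root = ET.fromstring(xml_text)
        for person in root.iter(f"{{{FOAF}}}Person"):
            uri = person.get(f"{{{RDF}}}about")
            if uri is None:
                continue  # skip people described without a URI (blank nodes)
            entry = people.setdefault(uri, {"names": set(), "knows": set()})
            for name in person.findall(f"{{{FOAF}}}name"):
                if name.text:
                    entry["names"].add(name.text.strip())
            # Only foaf:knows links given as rdf:resource are followed here;
            # nested descriptions inside foaf:knows are not recorded as links.
            for knows in person.findall(f"{{{FOAF}}}knows"):
                target = knows.get(f"{{{RDF}}}resource")
                if target:
                    entry["knows"].add(target)
    return people


people = aggregate_foaf([ALICE_DOC, BOB_DOC])
for uri in sorted(people):
    print(uri, sorted(people[uri]["names"]), sorted(people[uri]["knows"]))
```

Because statements are merged by person URI, it doesn't matter which document a fact came from, which is most of what "distributed" buys you here. For anything beyond a toy, a proper RDF parser would be the safer choice.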

Ian Davis presented the basics of RDF for people who are already familiar with traditional data management. Apparently Ian’s slides hit #1 for the day on SlideShare, which highlights the interest in linked data that is percolating through the Web. The pre-conf was very well attended as well.

Some folks like Jonathan Brinley and Michael Klein were able to hack on a Supybot Plugin to work with the FOAF data generated by the crawler. I also got chatting with William Denton about the potential of linked data for FRBR/RDA efforts. Unfortunately I didn’t hear about Alistair Miles’ new project on google-code for exploring the translation of traditional MARC/MODS into RDA/FRBR until after the event. Most of the other slides from presenters at the pre-conf are available from the wiki page.

I was really struck by some of the issues that Dan Chudnov raised in his talk about Caching and Proxying Linked Data right before lunch. In particular his comparison of the Linking Open Data Cloud to what libraries understand as their ready reference collection:

See p.9 of Dan’s slides

Dan explored how we need to think about the technical and administrative details of managing linked-data if linked-data is to be taken seriously by the library community. Relatedly, the pre-conf gave me an opportunity to publicly apologize to Anders Söderbäck for yanking offline in such an abrupt manner, and disturbing his links from subject authority records. Dan’s ideas for consuming library linked data and Anders’ and my experience publishing library linked data gelled nicely in my brain. Similar ideas from Jon Phipps (one of the authors of Best Practice Recipes for Publishing RDF Vocabularies) have led me to believe this could be a nice little area for some research.

Prepping for the pre-conference itself was good fun, since it led me to discover a series of connections between the early development of the www and Brown University (where the conference was being held) and the history of hyperdata/text: in a nutshell it was Tim Berners-Lee’s proposal for the web -> Dynatext -> Steve DeRose -> Andy van Dam -> Hypertext Editing System -> Ted Nelson -> Doug Engelbart -> Vannevar Bush. Yeah, I guess you had to be there … or maybe that didn’t help. At any rate the slides, complete with breakdancing instructions, are available.

I haven’t even started talking about the main event yet. The things I took away from the 3 days of presentations and talks, in no particular order were:

  • I want to learn more about the Author-ID effort that Geoffrey Bilder talked about
  • Stefano Mazzocchi’s keynote and Sean Hannan’s presentation convinced me that I need to understand and play with Freebase’s JavaScript application development environment Acre and the sparql-ish, query by example Metaweb Query Language (MQL). It seems like Freebase is exploring some really interesting territory in building a shared knowledge base of machine readable, human editable data, which can sit behind a seemingly infinite number of web presentation layers.
  • Terence Ingram’s presentation, Ross Singer’s presentation about Jangle, Mike’s and my SWORD presentation, a chat with Fedora/REST proponent Matt Zumwalt, and hearing about the Talis Platform have convinced me that real REST has got mind-share and traction in the library technology world.
  • Ian Davis’ keynote on the second day captured for me the constant challenge of staying true to the roots of the web, and how important it is to do so. It was really interesting to hear how he emphasized the importance of data over code, and the necessity of decentralization over centralization.
  • Chatting with Jodi Schneider and William Denton and listening to their presentation made me want to understand RDA and FRBR at a practical level. This includes getting into the vocabularies that are being developed, and trying to convert some data. The history of FRBR in particular, as told by Bill, is also a gateway into a really fascinating history of cataloging. Also the work that Diane Hillmann and Jon Phipps have been doing to enable vocabulary development like RDA/FRBR seems really important to keep abreast of.

More tidbits will probably float into my blog or into my tweets over the coming weeks, as the beer wears off, and the ideas sink in. But for now I’ll leave you with some of my favorite photos from the conference. It’s the people that make code4lib what it is. It was great to connect up, and meet new folks in the field.


Oh and in case you missed it, the tweetstream and the other fine photos.