oclc registry

So OCLC’s WorldCat Registry is a nice new addition to OCLC’s growing list of web services. Do a search for your library and take a look at the URL: aye, that’s right, it’s SRU. In fact, do a view source on the results page and you’ll see an SRU response in XML; the HTML is being rendered with client-side XSLT.

If you drill into a particular institution you’ll see a pleasantly cool uri:

http://worldcat.org/registry/Institutions/89073

…which would serve nicely as an identifier for the Browne Popular Culture Library. The institution pages are HTML instead of XML; however, there is a link to an XML representation:

http://worldcat.org/webservices/registry/content/Institutions/89073

This URL isn’t bad but it would be rather nice if the former could return XML if the Accept: header had text/xml slotted before text/html. Yeah, I did check:

  curl -I -H "Accept: text/xml" http://worldcat.org/registry/Institutions/89073
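
The same check can be sketched in Python. This is only a rough sketch: the function name xml_first_request and the q=0.9 weighting are my own choices for illustration, and nothing here says what the registry will actually send back.

```python
from urllib.request import Request


def xml_first_request(url):
    """Build a HEAD request that lists text/xml ahead of text/html.

    The q=0.9 weight on text/html is an illustrative assumption; it just
    tells the server that XML is preferred when both are available.
    """
    return Request(
        url,
        headers={"Accept": "text/xml, text/html;q=0.9"},
        method="HEAD",
    )


req = xml_first_request("http://worldcat.org/registry/Institutions/89073")
# To actually send it: urllib.request.urlopen(req), then look at the
# Content-Type header on the response.
```

Building the request separately from sending it makes it easy to inspect the headers before anything goes over the wire.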

It’s inspiring to see OCLC going the extra mile to make their new services have web friendly machine APIs.

Update: for deeper analysis check out Pete Johnston’s WorldCat Institution Registry and Identifiers. He has some great points on the use of identifiers in the xml responses.


exhibit

If you haven’t tried Exhibit out yet, the simile folks have created a truly wonderful data publishing framework that runs entirely in your browser with a bit of JavaScript, HTML and CSS.

The remarkable part is that it requires no backend database, but simply operates on a stream of JSON. If you have a couple of minutes take a look at their Getting Started Tutorial, which shows you how to create an exhibit of MIT-related Nobel laureates with a tiny bit of HTML, CSS and JavaScript.

Just as an experiment I tried pointing it at my delicious json feed for metadata. It turns out that exhibit wants json data to be a hash with a key ‘items’ that points to a list of items. In addition it also wants each item to have a ‘label’ key. I quickly reformatted the delicious json with simplejson, and got this.
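
The reshaping amounts to something like the sketch below. It uses the standard library json module, which has the same dumps/loads interface as simplejson. The field names 'd' (description) and 'u' (url) are my assumption about what the delicious feed items look like, and to_exhibit is just a name made up for this example.

```python
import json


def to_exhibit(posts):
    """Wrap a list of bookmark dicts in the shape exhibit expects.

    Assumes each post is a dict with a description under 'd' and a url
    under 'u' (an assumption about the feed, not something exhibit needs).
    """
    items = []
    for post in posts:
        item = dict(post)  # copy, so the caller's data is left alone
        # exhibit wants a 'label' on every item: use the description,
        # falling back to the url when there is no description
        item["label"] = post.get("d") or post.get("u", "")
        items.append(item)
    return {"items": items}


def to_exhibit_json(posts):
    """Serialize the exhibit-shaped data as a JSON string."""
    return json.dumps(to_exhibit(posts), indent=2)
```

Calling to_exhibit_json on the decoded feed gives a string that can be saved to a file and referenced from the exhibit page.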

A few minutes later I prodded the simile folks to see if there is a way of filtering json data on the way into exhibit so that it can be normalized…time passes (like maybe an hour) and then I hear from Johan Sundström that the latest/greatest exhibit code has this sort of filtering built in!

Tangential to the exhibit code, there has been an interesting discussion recently about how to expose exhibit content to indexing services like google. Since exhibit content is generated with pure javascript, and google (as far as we know) primarily indexes html content–the exhibit content is rendered invisible. This is a problem that digital library applications and repositories have to deal with as well, so it may be of interest.


75 minutes

The worst news so far in 2007 after the surge. Can anyone else recommend a good podcast for independent music? I’m going to suffer…


uri-templates

I’ve been playing with uri-templates a little bit at $work to help formulate clean urls for a newspaper application. The goal is to provide urls such as:

  • http://example.gov/issn/0362-4331
  • http://example.gov/issn/0362-4331/1969-05-28
  • http://example.gov/issn/0362-4331/1969-05-28/1
  • http://example.gov/issn/0362-4331/1969-05-28/1/31

I was hoping something like this would work:

  • http://example.gov/issn/{issn}/{date}/{edition}/{page}

But I’d like to indicate that the date, edition and page parameters are optional. After reading the spec and some discussion it becomes clear that there is no way to indicate that part of the path is optional. OpenSearch addresses the issue to some extent by making parameters optional with ‘?’:

  • http://example.gov/issn/{issn}/{date?}/{edition?}/{page?}

That seems to be what I want, but there are some wrinkles, such as when a page is included without a date. Perhaps these details could be application specific?
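
One application-specific reading of the wrinkle could be sketched like this: the optional parts are only allowed as a trailing run, so a page without a date is rejected. The function name issue_url and the error message are made up for illustration; none of this comes from the URI Template draft or OpenSearch.

```python
def issue_url(issn, date=None, edition=None, page=None,
              base="http://example.gov/issn"):
    """Build a newspaper URL, treating date/edition/page as optional.

    A later part may only be given when every earlier part is present,
    e.g. a page without a date raises ValueError.
    """
    segments = [issn]
    first_missing = None
    for name, value in (("date", date), ("edition", edition), ("page", page)):
        if value is None:
            if first_missing is None:
                first_missing = name
        elif first_missing is not None:
            raise ValueError("%s given without %s" % (name, first_missing))
        else:
            segments.append(str(value))
    return base + "/" + "/".join(segments)
```

So issue_url("0362-4331", "1969-05-28", 1, 31) builds the full page URL, while issue_url("0362-4331", page=31) is refused rather than producing an ambiguous path.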

The discussion seemed to indicate that the template could be bundled with a written description of how the parameters are to be used. Or instead an additional template specification for optionality could be created which references the URI Template spec. There were also some nods towards WADL, which apparently has some richer conventions for this sort of thing.

I guess for the moment using

  • http://example.gov/issn/{issn}/{date}/{edition}/{page}

with some descriptive text will work well enough. But I think it would be useful if the uri-template draft commented on the issue somehow…since it’s bound to come up again.


oxford dictionary of national biography

It’s interesting to see that the Oxford Dictionary of National Biography has created Cool URIs for their index of notable people. So for example if you want an identifier for JRR Tolkien you can use:

http://www.oxforddnb.com/index/101031766

Alas, the full content of the biography isn’t available (unless you subscribe), but I guess some publishers still have business models to hold on to. To see all the entries you have to browse them.

I think it’s a nice simple example of how authority files can be integrated into the web as we know it. Thanks to Caroline Arms for forwarding this on to me…


March on Washington

MARCH ON WASHINGTON TO END THE WAR

Begins: Sat, 27 Jan 2007 at 11:00 AM

Ends: Sat, 27 Jan 2007 at 2:00 PM

Location:

Mall between 3rd and 7th Streets

Washington, DC 20002

USA

Link: more info

Mark your calendars, and let me know if you need a place to stay...


identifiers and authority records

Authority files are rather important for unambiguously talking about a person, place or thing. In database lingo they essentially amount to a primary key for a table. Given the time and effort libraries spend in maintaining authority records and assigning control numbers to individuals it makes sense that a URI could be assigned to an individual in such authority files. I realize this idea is nothing new, but until recently I hadn’t seen it put into practice particularly well.

I imagine this has been there all along but I just noticed that OCLC’s Linked Authority File includes PURLs for authors now. For example the following URL contains an LCCN:

http://errol.oclc.org/laf/n79-7035

When you GET this your browser is automatically redirected with an HTTP 302 to:

http://alcme.oclc.org/laf/servlet/OAIHandler?
verb=GetRecord&metadataPrefix=oai_dc&identifier=n79-7035

which you’ll notice is an OAI-PMH request to fetch a Dublin Core record with the identifier n79-7035:

<oai_dc:dc 
  xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/" 
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
  xmlns:dc="http://purl.org/dc/elements/1.1/" 
  xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/oai_dc/ 
    http://openarchives.org/OAI/2.0/oai_dc.xsd">
  <dc:creator xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    Borges, Jorge Luis,--1899-
  </dc:creator>
  <dc:description xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    Suárez Lynch, B.--nnnc
  </dc:description>
</oai_dc:dc>
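
Pulling the heading out of a record like this takes only a few lines. The sketch below works on a copy of the record trimmed down to the creator element; the helper name dc_values is made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes used in the record above
NS = {
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# The record from above, trimmed to the creator element
RECORD = """<oai_dc:dc
  xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:creator>
    Borges, Jorge Luis,--1899-
  </dc:creator>
</oai_dc:dc>"""


def dc_values(xml_text, element):
    """Return the stripped text of every dc:<element> child of the record."""
    root = ET.fromstring(xml_text)
    return [el.text.strip()
            for el in root.findall("dc:" + element, NS)
            if el.text]


creators = dc_values(RECORD, "creator")
```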

So now we know who this identifier is for, and the established heading for the individual. But it gets better (or worse depending on your perspective). Since this is an OAI-PMH server you can issue a ListMetadataFormats request to see what other flavors this record might be available in. If you do you’ll find out that this record is also available as marcxml in all its unholy glory (if you follow that link your browser will use a stylesheet to turn the raw xml into something a bit more presentable). Putting aside my snideness about MARC for a moment, this is a lot of useful data being made available.
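
The request URLs themselves are easy to put together. A small sketch: the base URL is the one from the redirect above, the helper name oai_url is made up, and the verb and argument names are standard OAI-PMH.

```python
from urllib.parse import urlencode

# Base URL taken from the redirect target shown above
BASE = "http://alcme.oclc.org/laf/servlet/OAIHandler"


def oai_url(verb, **params):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(params)
    return BASE + "?" + urlencode(query)


# The request the PURL redirects to
get_record = oai_url("GetRecord", metadataPrefix="oai_dc",
                     identifier="n79-7035")

# Ask which metadata formats are available for the same record
list_formats = oai_url("ListMetadataFormats", identifier="n79-7035")
```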

You can also search the name authority file and get relevant PURLs via a SOAP/REST service. For example the IRC bot panizzi in #code4lib actually has a bit of logic that allows it to do lookups in the linked authority file:

06:56 < edsu> @naf borges, jorge
06:56 < panizzi> edsu: [20 matches] [~1] Borges, Jorge Luis, 1899- 
                 <http://errol.oclc.org/laf/n79-7035>; [~2] Macedo, Jorge 
                 Borges de. <http://errol.oclc.org/laf/n82-149895>; [~3] 
                 Borges, Jorge G. (Jorge Guillermo), 1874-1938 
                 <http://errol.oclc.org/laf/n90-681877>; [~4] Sua?rez Lynch, B.                  
                 <http://errol.oclc.org/laf/n82-21644>; [~5] Borges, Jorge 
                 Wheliton Miranda <http://errol.oclc.org/laf/n92-76758>; [~6] 
                 Canido Borges, Jorge Oscar (3 more messages)

All in all it’s an impressive mix of technology, standards and practice. It is not entirely clear to me how this work relates to the Virtual International Authority File. Perhaps LAF wasn’t considered a good acronym? If you are interested in such things Thom Hickey had a really interesting talk at Access2006 which has audio available.


DemoCampDC

DemoCampDC is an adaptation of BarCamp to provide an informal mechanism for sharing technology shtuff in the DC area. If you are interested and in the DC area please add your name to the list of attendees and stay tuned.


#9

22:01 < edsu> i would try to separate them now before it's 
      too late :)
22:02 < erikhatcher> it's never too late, but i certainly want 
      to keep this clean from the start

New Year’s Resolution #9 - never underestimate the power of a positive attitude…


mirror

Yes it’s almost as though consumers have moved on because mainstream media has abdicated its responsibility…

hahahahaha