gems…on ice


When developing and deploying RubyOnRails applications you often have to think about your project’s gem dependencies. It’s particularly useful to freeze a version of Rails in your vendor directory so that your app uses that version rather than a globally installed (or not installed) one. It’s easy to do this by invoking:

  rake freeze_gems

This unpacks all the Rails gems into vendor, and your application will magically use these instead of the globally installed Rails gems.
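The "magic" boils down to a directory check at boot time. Here is a minimal sketch of the idea, assuming the frozen copy ends up in vendor/rails. The `rails_source` helper is made up for illustration and is not the actual Rails boot code:

```ruby
# Hypothetical helper (not part of Rails): decide where Rails would be
# loaded from. Assumes a frozen copy lives in <app_root>/vendor/rails.
def rails_source(app_root)
  vendored = File.join(app_root, 'vendor', 'rails')
  # Prefer the frozen copy; otherwise fall back to the installed gems.
  File.directory?(vendored) ? vendored : :rubygems
end
```

An app with a vendor/rails directory gets that path back, while any other app gets `:rubygems`, meaning "use whatever is installed globally".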

The cool thing is that with a little bit of plugin help you can freeze your other gems in vendor as well. Simply install Rick Olson’s elegantly simple gem plugin into vendor/plugins. Then, assuming you are using, say, my oai-pmh gem, you can simply run:

  rake gems:freeze GEM=oai

and the gem will be unpacked in vendor, and the $LOAD_PATH for your application will automatically include the library path for the new gem. Very useful, thanks Rick!
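For the curious, the load path trick can be sketched in a few lines. This is not Rick’s plugin code, just an illustration that assumes unpacked gems sit in directories shaped like `<gem>-<version>/lib` under some vendor directory. The helper name and the layout are both assumptions:

```ruby
# Hypothetical helper: put the lib directory of every unpacked gem on
# the load path. Assumed layout: vendor_dir/<gem>-<version>/lib
def add_frozen_gems_to_load_path(vendor_dir)
  lib_dirs = Dir.glob(File.join(vendor_dir, '*', 'lib')).sort
  lib_dirs.each do |lib|
    # unshift so the frozen copy wins over a globally installed gem
    $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
  end
  lib_dirs
end
```

After a call like this, a plain `require` of the gem’s library would find the copy in vendor first.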

building and ingesting

I prefer using an XML generating mini-language (elementtree, XML::Writer, REXML, Stan, etc) to actually writing raw XML. It’s just too easy for me to forget or mistype an end tag, or forget to encode strings properly–and I find all those inline strings or even here-docs make a mess of an otherwise pretty program.

Recently I wanted some code to write FOXML for ingesting digital objects into my Fedora test instance. I’m working in Ruby so REXML seemed like the best place to start…but after I finished I ran across Builder. The Builder code turned out to be somewhat shorter, much more expressive and consequently a bit easier to read (for my eyes). Here’s a quick example of how Builder’s API improves on REXML when writing this little chunk of XML:

<dc xmlns='http://purl.org/dc/elements/1.1/'>
  <title>Communication in the Presence of Noise</title>
</dc>

So here’s the REXML code:

require 'rexml/document'

dc = REXML::Element.new 'dc'
dc.add_attributes 'xmlns' => 'http://purl.org/dc/elements/1.1/'
title = REXML::Element.new 'title', dc
# text= sets the element's content (plain text is only a getter)
title.text = 'Communication in the Presence of Noise'

and the Builder code:

require 'builder'

x = Builder::XmlMarkup.new
x.dc 'xmlns' => 'http://purl.org/dc/elements/1.1/' do
  x.title 'Communication in the Presence of Noise'
end

Both snippets are about the same length, but look at how the Builder::XmlMarkup object infers the name of the element from the message that is passed to it. Element attributes and content can be set when the element is created–something I wasn’t able to do with REXML. My favorite, though, is Builder’s use of blocks, so that the hierarchical structure of the code directly mirrors that of the XML content!
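If you’re wondering how an object can turn arbitrary messages into element names, Ruby’s method_missing hook is the usual way to do it. Here is a toy sketch of the idea. TinyMarkup is a made-up class for illustration only: it is not Builder’s implementation, and it does no escaping or indentation:

```ruby
# TinyMarkup is a hypothetical stand-in, not Builder's real code.
class TinyMarkup
  attr_reader :out

  def initialize
    @out = String.new
  end

  # Any message the object doesn't understand becomes an element
  # named after that message.
  def method_missing(name, content = nil, attrs = {})
    # Allow t.dc('xmlns' => '...') with no text content.
    content, attrs = nil, content if content.is_a?(Hash)
    attr_text = attrs.map { |k, v| " #{k}='#{v}'" }.join
    @out << "<#{name}#{attr_text}>"
    if block_given?
      yield # nested calls append child elements
    elsif content
      @out << content.to_s
    end
    @out << "</#{name}>"
    self
  end
end

t = TinyMarkup.new
t.dc 'xmlns' => 'http://purl.org/dc/elements/1.1/' do
  t.title 'Communication in the Presence of Noise'
end
puts t.out
```

Because the block runs between writing the opening and closing tags, nesting in the code becomes nesting in the output, which is the same property that makes the Builder version so readable.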

So anyway, if you read this far you might actually like to see how a FOXML document can be built and ingested into Fedora–so here goes building the document:

x = Builder::XmlMarkup.new :indent => 2
 
x.digitalObject 'xmlns' => 'info:fedora/fedora-system:def/foxml#' do
 
  x.objectProperties do
    x.property 'NAME' => 'http://www.w3.org/1999/02/22-rdf-syntax-ns#type',
      'VALUE' => 'FedoraObject'
    x.property 'NAME' => 'info:fedora/fedora-system:def/model#state',
      'VALUE' => 'A'
  end
 
  x.datastream 'ID' => 'DC', 'STATE' => 'A', 'CONTROL_GROUP' => 'X' do
    x.datastreamVersion 'ID' => 'DC.0', 'MIMETYPE' => 'text/xml' do
      x.xmlContent do
        x.tag! 'oai_dc:dc',
          'xmlns:oai_dc' => 'http://www.openarchives.org/OAI/2.0/oai_dc/',
          'xmlns:dc' => 'http://purl.org/dc/elements/1.1/' do
          x.tag! 'dc:title', 'Communication in the Presence of Noise'
          x.tag! 'dc:creator', 'Claude E Shannon'
          x.tag! 'dc:subject', 'Information Science'
        end
      end
    end
  end
 
end

And here’s some code to fire the foxml at Fedora in a SOAP call:

require 'Fedora-API-M-WSDLDriver'
 
# configure the API-M SOAP client with basic auth credentials
host = 'http://localhost:8080/fedora/services/management'
user = 'fedoraAdmin'
pass = 'fedoraAdmin'
fedora = FedoraAPIM.new
fedora.options['protocol.http.basic_auth'] << [host, user, pass]
 
# target! returns the XML string accumulated by the builder
fedora.ingest SOAP::SOAPBase64.new(x.target!), 'foxml1.0', 'added test object'

Cataloging at the BBC with RubyOnRails

It’s nice to see that the BBC Programme Catalogue (built with RubyOnRails and MySQL) has gone live. Here is some historical background from the about page:

The BBC has been cataloguing and indexing its programmes since the 1920s. The development of the programme catalogue has reflected the changes in the BBC and in broadcasting over the last seventy five years. For example, in the early days of broadcasting, for both Radio and TV, the majority of programmes were broadcast live and were never recorded. There was therefore little point at the time to do extensive cataloguing and indexing of material that did not exist. As you will see, the number of catalogue entries for a day in the 1990s, far exceeds the entries for a day from the 1950s.

As recording technology developed in both mediums, the requirement to keep material for re-use also grew. If material was going to be re-used, it had to be catalogued and indexed. The original records of radio programmes were handwritten into books; over time, card catalogues were developed, and from the mid-1980s onwards there have been computer based catalogues.

This experimental catalogue database holds over 900,000 entries. It is a sub-set of the data from the internal BBC database created and maintained by the BBC’s Information and Archives department. This public version is updated daily as new records are added and updated in the main catalogue. This figure is so high because, for example, each TV news story now has an individual entry in the catalogue.

Talk about sexy retrospective conversion, eh? Hats off to Matt Biddulph and his colleagues. I wish I was going to RailsConf to hear more of the technical details. Actually, if you haven’t already, take a look at the RailsConf program–it looks like it’s going to be a great event.