% oai2rdf.sh http://cogprints.ecs.soton.ac.uk/perl/oai2 cogprints
This will harvest the entire cogprints eprint archive and convert it on the fly to RDF, which is saved in a directory called cogprints. In case you are wondering: yes, it handles resumption tokens. You can also give it date ranges to harvest, and tell it to harvest only particular metadata formats. By default it grabs every metadata format the repository offers.
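For anyone who hasn't met resumption tokens before, the paging loop that any OAI-PMH harvester has to run can be sketched in a few lines of Python. This is not oai2rdf's own code; it is only a minimal sketch of the protocol's ListRecords request, and the names build_url, http_get and harvest are made up for illustration. It assumes the repository supports the oai_dc format, which the protocol requires of every repository.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Namespace used by every OAI-PMH 2.0 response envelope.
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"


def build_url(base_url, metadata_prefix="oai_dc", from_date=None,
              until_date=None, token=None):
    """Build a ListRecords request URL (hypothetical helper)."""
    if token:
        # resumptionToken is an exclusive argument: only the verb goes with it.
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
        if until_date:
            params["until"] = until_date
    return base_url + "?" + urllib.parse.urlencode(params)


def http_get(url):
    """Fetch a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def harvest(base_url, fetch=http_get, **kwargs):
    """Yield each <record> element, following resumption tokens to the end."""
    token = None
    while True:
        root = ET.fromstring(fetch(build_url(base_url, token=token, **kwargs)))
        for record in root.findall(f"{OAI_NS}ListRecords/{OAI_NS}record"):
            yield record
        token_el = root.find(f"{OAI_NS}ListRecords/{OAI_NS}resumptionToken")
        # A missing or empty token means the list is complete.
        token = (token_el.text or "").strip() if token_el is not None else ""
        if not token:
            break
```

Something like `for record in harvest("http://cogprints.ecs.soton.ac.uk/perl/oai2", from_date="2006-01-01")` would then walk the archive one page at a time; the date is just an example value. Passing `fetch` as a parameter keeps the loop easy to try out against canned responses without touching the network.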
As part of my day job I’ve been looking at some RDF technologies like Jena, and while there are lots of chunks of RDF around on the web to play with, oai2rdf suddenly opens up the possibilities quite a bit.
Getting oai2rdf up and running is pretty easy. First get the oai2rdf code:
svn co http://simile.mit.edu/repository/RDFizers/oai2rdf/ oai2rdf
The magic of Maven will pull down the dependencies and compile the code. Then you should be able to run oai2rdf.
Art Rhyno has been talking about the work the Simile folks are doing for quite a while now, and only recently have I started to see what a rich set of tools they are developing.