Posts Tagged ‘rdfa’

provide and enable

Wednesday, June 18th, 2008

I got a chance to meet Jennifer Rigby of the National Archives UK at the LinkedDataPlanet Conference in New York City (thanks Ian). Jennifer is the Head of IT Strategy, and told me lots of interesting stuff related to a profound shift they’ve had in their online strategies to:

Provide and Enable

So rather than pouring all their energy into making applications to visualize archival resources, the National Archives have recognized that making machine readable resources available to the public (in formats like RDF and RDFa) is really important to their core mission. In addition to providing services and data, they are trying to enable an ecosystem of innovation around their assets–or in their words:

• We will allow others to harness the power of our information, leading to a far wider range of products and services than we could provide ourselves.
• We will continue to work with commercial partners to provide online access to millions of records.

Jennifer said we can look forward to an announcement around OpenTech2008 (July 5th) about a set of important publications that are going to made available by the Archives as RDF and RDFa. In addition I heard about how they work with website data harvested by Internet Archive to create a resolver service for transient publications on the web.

Hearing how a big organization like the National Archives can come to this realization of “Provide and Enable”, and then start to execute on it was really encouraging–and inspiring. It is also refreshing to see people recognize, in writing the importance of semantic web technologies:

We have started exploring new ideas and technologies, including using RDFa for publishing the Gazettes. The way we now publish legislation has a key role to play in the further development of the semantic web.

Cyganiak on linked data, microformats and the semweb

Friday, March 14th, 2008

In case you missed it Danny Ayers has a fun interview with Richard Cyganiak who is one of the prime movers behind the Linking Open Data Project of Semantic Web Education and Outreach Group at the W3C, and authors of Cool URIs for the Semantic Web and How to Publish Linked Data on the Web. Among other things you’ll learn some details about sindice (the semantic web search engine at DERI) which indexes (using Solr!) structured data like rdf/xml, microformats (I never noticed last.fm had microformat content) and (soon) rdfa from the world wild web. More details about Sindice can be found in an earlier podcast Paul Miller did with Eyal Oren (also at DERI).

Richard’s perspective on the past and future of the semantic web is particularly refreshing. Rather than hard selling SPARQL or even RDF his attitude seems to be to try what works now, while recognizing that the technologies that make the semantic web work may very well be different in a few years. Also there’s an interesting discussion of microformats and RDF, highlighting the strengths and weaknesses of both. Plus there is a fun side story to the LOD diagram that shows the links between various open data sets.

If you’ve ever wanted to hear more about linked-data from someone in the know now is your chance. Nice questions danja!

oai-ore and the shadow web

Friday, February 22nd, 2008

The OAI-ORE meeting is coming up, and in general I’ve been really impressed with the alpha specs that have come out. It’s not clear that there’s an established vocabulary for talking about aggregated resources on the web, so the Data Model and Vocabulary documents were of particular interest to me.

One thing I didn’t quite understand, and which I think may have some significance for implementors, is some language in the Discovery document on the subject of URI conflation:

The Data Model document [ORE Model] explicitly prohibits a URI of a ReM (URI-R) ever returning anything other than a ReM. This allows multiple representations to be associated with URI-R, such as using content negotiation to return ReMs in different languages, character sets, or compression encodings. But it does not allow URI-R to return a human readable “splash page”, either by HTTP content negotiation or redirection. For example, clients MUST NOT merge with content negotiation the following URI pair that would correspond to a ReM and a “splash page” for an object:

If I’m understanding right this would prohibit using technologies like microformats, eRDF, RDFa and GRDDL in a “splash page” to represent the resource map. It seems odd to me that you can represent a resource map in Atom, but not in HTML.

To illustrate what this might look like I took a splash page off of arXiv (hope that was ok!) and marked it up with oai-ore RDFa.

Take a look. So all I did is modify the existing XHTML at arxiv.org, and I’ve been able to represent an ORE Resource Map. This seems like a relatively simple, and powerful way for existing repositories to make their aggregated resources available.

RDFa just entered Last Call, but there are already multiple implementations. Try out the GetN3 bookmarklet on the splash page, and you should see some triples come back. I ran them through the validator at w3c and got the following graph (kinda too big to include here inline).

This kind of issue seem to be at the heart of what Ian Davis refers to when he asks “Is the Semantic Web Destined to be a Shadow?“. Andy Powell and Pete Johnston have also been strong voices for integrating digital library repositories and the web–and they are also involved with the oai-ore effort. It feels like some of the oia-ore language could be loosened a bit to allow machine readable and human readable information to commingle a bit more.