If you are interested in practical ways to garden in the emerging web of data, take a look at this draft finding that folks in the W3C Technical Architecture Group are considering. Or, for a different expression of the same idea, look at Cool URIs for the Semantic Web.

These two documents describe a simple use of HTTP and URLs to identify resources that are outside of the information space of the web. Yes, you read that right: resources that are outside the information space of the web. Why would I want to use URLs to address resources that aren’t on the web!? The finding illustrates this subtlety using Angela’s dilemma:

Angela is creating an OWL ontology that defines specific characteristics of devices used to access the Web. Some of these characteristics represent physical properties of the device, such as its length, width and weight. As a result, the ontology includes concepts such as unit of measure, and specific instances, such as meter and kilogram. Angela uses URIs to identify these concepts.

Having chosen a URI for the concept of the meter, Angela faces the question of what should be returned if that URI is ever dereferenced. There is general advice that owners of URIs should provide representations [AWWW] and Angela is keen to comply. However, the choices of possible representations appear legion. Given that the URI is being used in the context of an OWL ontology, Angela first considers a representation that consists of some RDF triples that allow suitable computer systems to discover more information about the meter. She then worries that these might be less useful to a human user, who might prefer the appropriate Wikipedia entry. Perhaps, she reasons, a better approach would be to create a representation which itself contains a set of URIs to a range of resources that provide related representations. Perhaps content negotiation can help? She could return different representations based on the content type specified in the request.

Angela’s dilemma is, of course, based on the fact that none of the representations she is considering are actually representations of the units of measure themselves. Even if the Web could deliver a platinum-iridium bar with two marks a meter apart at zero degrees celsius, or 1,650,763.73 wavelengths of the orange-red emission line in the electromagnetic spectrum of the krypton-86 atom in a vacuum, or even two marks, a meter apart on a screen, such representations are probably less than completely useful in the context of an information space. The representations that Angela is considering are not representations of the meter itself. Instead, they are representations of information resources related to the meter.

It is not appropriate for any of the individual representations that Angela is considering to be returned by dereferencing the URI that identifies the concept of the meter. Not only do the representations she is considering fail to represent the concept of the meter, they each have a different essence and so they should each have their own URI. As a consequence, it would also be inappropriate to use content negotiation as a way to provide them as alternate representations when the URI for the concept of the meter is dereferenced.

So, assuming we agree about the problem, what’s the solution? Basically, you can combine content negotiation with a 303 See Other HTTP status code to redirect to the appropriate resource. For an example of the basic idea in action, fire up curl and take a look at how this instance of the SemanticMediaWiki responds to a HEAD request (curl’s --head flag asks for the response headers only):

% curl --head http://ontoworld.org/wiki/Special:URIResolver/Ruby
HTTP/1.1 303 See Other
Date: Thu, 31 May 2007 20:03:12 GMT
Server: Apache/2.2.3 (Debian) ...
Location: http://ontoworld.org/wiki/Ruby
Content-Type: text/html; charset=UTF-8

Nothing too surprising there: we just got redirected to another URL that serves up some friendly HTML describing the Ruby programming language. But send along an extra Accept header:

% curl --head --header 'Accept: application/rdf+xml' http://ontoworld.org/wiki/Special:URIResolver/Ruby
HTTP/1.1 303 See Other
Date: Thu, 31 May 2007 20:04:36 GMT
Server: Apache/2.2.3 (Debian) ...
Location: http://ontoworld.org/wiki/Special:ExportRDF/Ruby
Content-Type: text/html; charset=UTF-8

Notice how you are redirected to a different URL this time, one that sends RDF/XML describing Ruby down the pipe? Ruby on Rails and other frameworks have good REST support built in for doing content negotiation to provide multiple representations of a single information resource. But the use of the 303 See Other here is a new, subtle twist to accommodate the fact that the resource in question isn’t really a canonical set of bits on disk somewhere. The good news is that your browser will still display the human-readable resource when you visit http://ontoworld.org/wiki/Special:URIResolver/Ruby
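To make the server side of this concrete, here is a rough sketch in Python of the decision the resolver has to make. It is not how SemanticMediaWiki actually implements it; the function names `parse_accept` and `resolve`, the `BASE` constant, and the `Vary` header are my own assumptions, and the URL layout is simply copied from the two responses above.

```python
# Hypothetical URL layout, copied from the ontoworld.org responses shown above.
BASE = "http://ontoworld.org/wiki"


def parse_accept(header):
    """Return the media types in an Accept header, most preferred first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a media type without an explicit q parameter has weight 1
        for field in fields[1:]:
            name, _, value = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        if q > 0:
            entries.append((-q, position, media_type))
    # Sort by descending q, then by the order the client listed them in.
    return [media_type for _, _, media_type in sorted(entries)]


def resolve(name, accept="*/*"):
    """Answer a request for the concept URI of `name` with a 303 redirect.

    The concept itself has no representation, so the response never carries
    a body; it only points at an information resource *about* the concept.
    """
    location = f"{BASE}/{name}"  # default: the human-readable HTML page
    for media_type in parse_accept(accept):
        if media_type == "application/rdf+xml":
            location = f"{BASE}/Special:ExportRDF/{name}"
            break
        if media_type in ("text/html", "text/*", "*/*"):
            break
    # Vary tells caches that the redirect target depends on the Accept header.
    return 303, {"Location": location, "Vary": "Accept"}


if __name__ == "__main__":
    print(resolve("Ruby"))
    print(resolve("Ruby", "application/rdf+xml"))
```

The important design point is that the HTML page and the RDF export each keep their own URL, and the concept URI only ever redirects to one of them, which is exactly the separation the finding asks for.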

Some folks would argue that resources that are outside the web don’t deserve URLs and should instead be identified with URIs like info-uris that are not required to resolve. My personal feeling is that info-uris do have a great deal of use in the enterprise (where they are most likely resolvable). But in situations like Angela’s, where she is creating a public RDF document that needs to refer to concepts like “length” and “meter”, I think it makes sense for the URIs of these concepts to resolve to representations that will guide appropriate usage. Or as the Architecture of the World Wide Web puts it:

A URI owner may supply zero or more authoritative representations of the resource identified by that URI. There is a benefit to the community in providing representations. A URI owner SHOULD provide representations of the resource it identifies.

It’ll be interesting to see how these issues shake out as more and more structured data is made available on the web.