I’m just now running across lingvoj.org, a linked-data application for languages created by Bernard Vatant. lingvoj basically mints URIs for languages (using the ISO 639-1 code), and when one is resolved (yay HTTP) it returns nice human- and machine-readable descriptions of the language. So for example the URI for Chinese is:

  http://www.lingvoj.org/lang/zh

If you click on that link, your browser will display some HTML that describes the Chinese language, and if a client asks for “application/rdf+xml” it’ll get back a nice chunk of RDF – all via a 303 redirect, as it should be.
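
If you'd like to watch that redirect happen rather than take it on faith, here is a rough Python sketch using only the standard library. It sends a single GET with whatever Accept header you choose and deliberately does not follow the redirect, so the status code and Location header stay visible. The function name `negotiate` is made up for this illustration, it assumes a plain http:// URI, and what the server actually sends back is something to check for yourself.

```python
import http.client
from urllib.parse import urlsplit


def negotiate(uri, accept, timeout=10):
    """Send one GET with the given Accept header, without following redirects.

    Returns (status_code, location_header). The name `negotiate` is
    hypothetical, and the sketch assumes a plain http:// URI.
    """
    parts = urlsplit(uri)
    conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("GET", parts.path or "/", headers={"Accept": accept})
        resp = conn.getresponse()
        # For a 303, the Location header names the document to fetch next.
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


# Example call (needs network access):
#   negotiate("http://www.lingvoj.org/lang/zh", "application/rdf+xml")
```

Calling it once with "text/html" and once with "application/rdf+xml" should make it easy to compare where each kind of client gets sent.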

lingvoj is interesting for a few reasons:

  • I work at the Library of Congress, which maintains ISO 639-2, and I know someone experimenting with a linked-data application for delivering it.
  • I know software developers at LC and elsewhere who need access to this data in a predictable and explicit machine-readable format, one that lends itself to being kept up to date (by re-harvesting the language URIs).
  • lingvoj follows the “303 URIs forwarding to One Generic Document” pattern, which is nice to see in practice. I also learned about the use of rdfs:isDefinedBy to assert (in this case) that a language is defined by the HTML representation for the language. Not sure how I missed that in the Cool URIs document before.
  • There are owl:sameAs links between lingvoj and DBpedia and OpenCyc, which in turn are linked data, and allow an agent to walk outwards and discover more about a language. Maybe one day lingvoj could link to our ISO 639-2 codelist at LC?
  • lingvoj defines a vocabulary that includes a new OWL class, Lingvo, for languages, which happens to extend dcterms:LinguisticSystem.
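
To make the rdfs:isDefinedBy and owl:sameAs links mentioned above a bit more concrete, here's a small Python sketch that pulls such links out of RDF/XML with nothing but the standard library. Everything in it is illustrative: the SAMPLE document is hand-written and is not the actual lingvoj response (the example.org URL is a placeholder and the DBpedia target is my guess at what such a link might look like), and `linked_resources` is a made-up helper name. It only understands the simple rdf:about / rdf:resource shape, so for real data a proper RDF parser is the safer bet.

```python
import xml.etree.ElementTree as ET

# Namespace prefixes in ElementTree's "{namespace}local" notation.
RDF_NS = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
RDFS_NS = "{http://www.w3.org/2000/01/rdf-schema#}"
OWL_NS = "{http://www.w3.org/2002/07/owl#}"

# Hand-written illustration, NOT the actual lingvoj response.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#">
  <rdf:Description rdf:about="http://www.lingvoj.org/lang/zh">
    <rdfs:isDefinedBy rdf:resource="http://example.org/zh.html"/>
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/Chinese_language"/>
  </rdf:Description>
</rdf:RDF>"""


def linked_resources(rdf_xml, subject, predicate):
    """Return the rdf:resource targets of `predicate` for `subject`.

    Hypothetical helper: it only handles top-level nodes identified by
    rdf:about whose properties point at other resources via rdf:resource.
    """
    root = ET.fromstring(rdf_xml)
    targets = []
    for node in root:
        if node.get(RDF_NS + "about") != subject:
            continue
        for prop in node.findall(predicate):
            target = prop.get(RDF_NS + "resource")
            if target is not None:
                targets.append(target)
    return targets


zh = "http://www.lingvoj.org/lang/zh"
same_as = linked_resources(SAMPLE, zh, OWL_NS + "sameAs")
defined_by = linked_resources(SAMPLE, zh, RDFS_NS + "isDefinedBy")
```

An agent "walking outwards" would, roughly speaking, take each URI in `same_as`, resolve it, and repeat the same extraction on whatever comes back.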

It’s a lot o’ fun discovering this emerging, rich data-universe on the web. If you are the least bit curious, take a look for yourself:

  curl --location --header "Accept: application/rdf+xml" http://www.lingvoj.org/lang/zh

Or better yet:

  rapper -o turtle http://lingvoj.org/lang/zh

Or, if you are really adventurous, grab the whole data set and put it into your triple-store-du-jour.