lcsh.info SPARQL endpoint

Disclaimer: lcsh.info was a prototype and is no longer available. See id.loc.gov for the service from the Library of Congress.

I’ve set up a SPARQL endpoint for lcsh.info at sparql.lcsh.info. If you are new to SPARQL endpoints, they are essentially REST web services that allow you to query a pool of RDF data using a query language that combines features of pattern matching, set logic and the web, and then get back results in a variety of formats. If you are a regular expression and/or SQL junkie, and like data, then SPARQL is definitely worth taking a look at.
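
To make the "REST web service" part a bit more concrete, here is a minimal Python sketch of how a client might build a request for an endpoint like this one. It assumes the endpoint follows the standard SPARQL protocol, where the query text travels in a `query` URL parameter and the result format is chosen with an `Accept` header. The `build_request` helper and the exact endpoint URL are illustrative assumptions, and since lcsh.info is gone the request is only constructed, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint URL; lcsh.info is no longer online, so nothing is sent.
ENDPOINT = "http://sparql.lcsh.info/"

def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a GET request carrying a SPARQL query."""
    url = endpoint + "?" + urlencode({"query": query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={"Accept": "application/sparql-results+json"})

req = build_request("SELECT ?s ?p ?o WHERE {?s ?p ?o} LIMIT 10")
print(req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the actual HTTP call against a live endpoint.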

If you are new to SPARQL and/or LCSH as SKOS you can try the default query and you’ll get back the first 10 triples in the triple store:

SELECT ?s ?p ?o
WHERE {?s ?p ?o}
LIMIT 10

As a first tweak try increasing the limit to 100. If you are feeling more adventurous perhaps you’d like to look up all the triples for a concept like Buddhism:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?s ?p ?o 
WHERE {
  ?s ?p ?o .
  ?s skos:prefLabel "Buddhism"@en .
}

Or, perhaps you are interested in seeing what narrower terms there are for Buddhism:

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?uri ?label 
WHERE {
  <http://lcsh.info/sh85017454#concept> skos:narrower ?uri .
  ?uri skos:prefLabel ?label
}
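
If you ask for results as JSON, a SELECT query like this one comes back in the W3C SPARQL JSON results format: a `head` listing the variables and a `results.bindings` list with one object per row. Here is a rough Python sketch of unpacking that. The sample response is hand-written for illustration (the URI and label are placeholders, not real LCSH data), and `rows` is a hypothetical helper name.

```python
import json

# Hand-written sample in the SPARQL JSON results format.
# The URI and label below are placeholders, not real LCSH data.
SAMPLE = """
{
  "head": {"vars": ["uri", "label"]},
  "results": {"bindings": [
    {"uri": {"type": "uri", "value": "http://example.org/concept/1"},
     "label": {"type": "literal", "xml:lang": "en", "value": "Example heading"}}
  ]}
}
"""

def rows(response_text):
    """Yield one {variable: value} dict per result row."""
    doc = json.loads(response_text)
    for binding in doc["results"]["bindings"]:
        # Unbound variables are simply absent from a binding.
        yield {var: binding[var]["value"]
               for var in doc["head"]["vars"] if var in binding}

for row in rows(SAMPLE):
    print(row["uri"], row["label"])
```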

Or maybe you don’t know the exact skos:prefLabel (aka authorized heading), so you look for all the LCSH headings that start with "Independence":

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?s ?label  
WHERE {
  ?s skos:prefLabel ?label.
  FILTER regex(?label, '^independence', 'i')
}
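
For the regular expression junkies: the FILTER above behaves much like a case-insensitive search anchored at the start of the label. Here is a small Python analogy, run over a few invented labels rather than real headings. SPARQL's regex flavor comes from XPath/XQuery rather than Python, so treat this as an approximation that holds for simple patterns like this one.

```python
import re

# Invented labels for illustration only, not taken from LCSH.
labels = [
    "Independence Day",
    "independence movements",
    "Declaration of independence",
]

# '^independence' with the 'i' flag: match at the start, ignoring case.
pattern = re.compile(r"^independence", re.IGNORECASE)

matches = [label for label in labels if pattern.search(label)]
print(matches)
```

The third label is skipped because the `^` anchor only allows a match at the very beginning of the string.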

Feel free to use the service however you want. I’m interested in seeing what its limitations are.

Benjamin Nowack’s ARC made it extremely easy to load up the 2,441,494 LCSH triples in a few hours with a script like:

<?php
include_once('arc/ARC2.php');
 
$config = array(
    'db_name'               => 'arc',
    'db_user'               => 'arc',
    'db_pwd'                => 'notapassword',
    'store_name'            => 'lcsh',
    'store_log_inserts'     => 1,
);
 
$store = ARC2::getStore($config);
 
if (!$store->isSetup()) {
    $store->setUp();
}
 
$store->reset();
$rs = $store->query('LOAD <http://lcsh.info/static/lcsh.nt>');
 
print_r($rs);

Then it’s just a simple matter of putting up a PHP script like:

<?php
/* ARC2 static class inclusion */
include_once('arc/ARC2.php');
 
/* MySQL and endpoint configuration */
$config = array(
  /* db */
  'db_host' => 'localhost', /* optional, default is localhost */
  'db_name' => 'arc',
  'db_user' => 'arc',
  'db_pwd' => 'fakepassword',
 
  /* store name */
  'store_name' => 'lcsh',
 
 
  /* endpoint */
  'endpoint_features' => array(
    'select', 'construct', 'ask', 'describe'
  ),
  'endpoint_timeout' => 60, /* not implemented in ARC2 preview */
  'endpoint_read_key' => '', /* optional */
  'endpoint_write_key' => 'fakekey', /* optional */
  'endpoint_max_limit' => 1000, /* optional */
);
 
/* instantiation */
$ep = ARC2::getStoreEndpoint($config);
 
/* request handling */
$ep->go();

Ideally I would’ve been able to quickly bring up a SPARQL endpoint on top of the rdflib Sleepycat triple store that is being used to serve up the linked data at lcsh.info. But rather than pursuing elegance (this is kinda side work after all), I wanted to quickly put the SPARQL service out there for experimentation, and this was the quickest way for me to do that. If the service proves useful I’ll look more at what it takes to create an rdflib SPARQL service, or at porting over the little Python code I have to PHP (gasp).

Creative Commons License
lcsh.info SPARQL endpoint by Ed Summers, unless otherwise expressly stated, is licensed under a Creative Commons Attribution 4.0 International License.

3 thoughts on “lcsh.info SPARQL endpoint”

  1. Would be nice if it was almost as easy to set up an endpoint with rdflib for those of us not exactly thrilled with using PHP. Will be curious to see what you come up with.

  2. Ed, As part of testing a Flex app I have developed which queries SPARQL endpoints I have pointed it at your endpoint.

    You can see a demo of it here:

    http://ccgi.arutherford.plus.com/website/flex/dbPedia/fb3/sparqlQueryViewer/#

    Anyway, I am stuck on the first simple query. The format I request back is XML. The XML comes back but the parser throws an error because there appear to be some strange characters tacked onto the result. Specifically the number 0 and a bunch of spaces.

    You can test this yourself by pasting this proxy request into IE:

    http://ccgi.arutherford.plus.com/cgi-bin/sparql-proxy-al.cgi?format=application%2Fsparql%2Dresults%2Bxml&query=PREFIX%20skos%3A%20%3Chttp%3A%2F%2Fwww%2Ew3%2Eorg%2F2004%2F02%2Fskos%2Fcore%23%3E%20SELECT%20%3Fs%20%3Fp%20%3Fo%20WHERE%20%7B%20%20%20%3Fs%20%3Fp%20%3Fo%20%2E%20%20%20%3Fs%20skos%3AprefLabel%20%27Buddhism%27%40en%20%2E%20%7D%20%20LIMIT%2040&url=http%3A%2F%2Fsparql%2Elcsh%2Einfo%2F

    Just paste it in as is.

    All the best,
    Al.
