Peter Norvig of Google mentioned Linked Data in his Reddit Ask Me Anything interview (thanks, Gunnar):
So right from the start researchers are writing code that use our main APIs that
are using the data that everyone else uses. If you want some web pages you use
the full copy of the web. If you want some [...]
As the last post indicated, I’m part of a team at loc.gov working on an application that serves up page views like this for historic newspapers, almost a million of them in fact. For each page view there is another URL for a view of the OCR text gleaned from that image, such as this. Yeah, [...]
Thursday, January 29, 2009
While lcsh.info was up and running, harvesters actively crawled it. At its core, all lcsh.info did was mint a URI for every Library of Congress Subject Heading. This is similar in spirit to Brewster Kahle’s more ambitious OpenLibrary project to mint a URI for every book, or in his words:
One web page for every book
Aside: [...]
Thursday, January 22, 2009
Today’s Guardian article “Why you can’t find a library book in your search engine” prompted me to look at Worldcat’s robots.txt file for the first time. Part of the beauty of the web is that it’s an open information space where anyone (people and robots) can start with a single URL and follow their nose [...]
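For readers who have not looked at a robots.txt file before, here is a rough sketch of how a well-behaved robot might consult one before following a link, using Python's standard `urllib.robotparser`. The rules, the `examplebot` user agent, and the example.org URLs are invented for illustration; they are not Worldcat's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only (not any real site's file).
HYPOTHETICAL_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("http://example.org/book/1", "http://example.org/search?q=melville"):
        print(url, can_crawl(HYPOTHETICAL_ROBOTS_TXT, "examplebot", url))
```

A robot that finds the item pages disallowed has nowhere to follow its nose to, which is why the contents of that one small file matter so much.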
Andy reminds me that a relatively simple idea (I think it was David’s at RepoCamp) for the OAI-ORE Challenge would be to create a tool that transforms OAI-ORE resource maps expressed as Atom into Google Sitemaps. This would allow “repositories” that expose their “objects” as resource maps to be crawled easily by Google and [...]
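A minimal sketch of what such a tool might look like, assuming the resource map is an Atom entry whose aggregated resources appear as `atom:link` elements with the ORE `aggregates` relation. The function name `resource_map_to_sitemap` and the example.org URLs are hypothetical, and a real tool would likely also want `lastmod` values and sitemap size limits handled.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: aggregated resources are linked with this rel value.
ORE_AGGREGATES = "http://www.openarchives.org/ore/terms/aggregates"

# Hypothetical resource map, made up for illustration.
EXAMPLE_RESOURCE_MAP = """\
<entry xmlns="http://www.w3.org/2005/Atom">
  <id>http://example.org/object/1</id>
  <title>Example object</title>
  <link rel="self" href="http://example.org/object/1.atom"/>
  <link rel="http://www.openarchives.org/ore/terms/aggregates"
        href="http://example.org/object/1/page.html"/>
  <link rel="http://www.openarchives.org/ore/terms/aggregates"
        href="http://example.org/object/1/scan.pdf"/>
</entry>
"""


def resource_map_to_sitemap(atom_xml: str) -> str:
    """Build a sitemap <urlset> listing every aggregated resource in the Atom document."""
    root = ET.fromstring(atom_xml)
    urlset = ET.Element("urlset", {"xmlns": SITEMAP_NS})
    seen = set()
    for link in root.iter(f"{{{ATOM_NS}}}link"):
        href = link.get("href")
        # Keep only ORE "aggregates" links, skipping duplicates.
        if link.get("rel") == ORE_AGGREGATES and href and href not in seen:
            seen.add(href)
            url = ET.SubElement(urlset, "url")
            ET.SubElement(url, "loc").text = href
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    print(resource_map_to_sitemap(EXAMPLE_RESOURCE_MAP))
```

The sketch only emits `loc` elements, which is the one required child of each `url` in the Sitemaps protocol; everything else is optional.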