<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>inkdroid &#187; philosophy</title>
	<atom:link href="http://inkdroid.org/journal/category/philosophy/feed/" rel="self" type="application/rss+xml" />
	<link>http://inkdroid.org/journal</link>
	<description>$pithy_personal_mission_statement</description>
	<lastBuildDate>Wed, 28 Jul 2010 13:48:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>Documents</title>
		<link>http://inkdroid.org/journal/2009/09/10/documents/</link>
		<comments>http://inkdroid.org/journal/2009/09/10/documents/#comments</comments>
		<pubDate>Fri, 11 Sep 2009 03:32:16 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[libraries]]></category>
		<category><![CDATA[life]]></category>
		<category><![CDATA[philosophy]]></category>
		<category><![CDATA[documents]]></category>
		<category><![CDATA[history]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[semweb]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=1172</guid>
		<description><![CDATA[I&#8217;ve struggled in the past with what constitutes an Information Resource in the context of Web Architecture, Linked Data and practical digital library applications such as the National Digital Newspaper Project I work on at the Library of Congress. So it was reassuring to see the issue come up a few months ago during a [...]]]></description>
			<content:encoded><![CDATA[<div xmlns:dct="http://purl.org/dc/terms/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:bibo="http://purl.org/ontology/bibo/">
<a about="/images/otlet.jpg" rel="foaf:depicts" href="http://chroniclingamerica.loc.gov/lccn/sn84026749/1908-04-09/ed-1/seq-11#page"><br />
<img src="/images/otlet.jpg" style="float: right; margin-left: 15px; width: 200px;" /><br />
</a><br />
I&#8217;ve <a href="http://inkdroid.org/journal/2009/05/14/rest-the-semantic-web-and-my-feeble-brain/">struggled</a> in the past with what constitutes an <em>Information Resource</em> in the context of <a href="http://www.w3.org/TR/webarch/#id-resources">Web Architecture</a>, <a href="http://www.w3.org/DesignIssues/LinkedData.html">Linked Data</a> and practical digital library applications such as the <a href="http://chroniclingamerica.loc.gov">National Digital Newspaper Project</a> I work on at the Library of Congress. So it was reassuring to see the issue come up a few months ago during a <a href="http://lists.w3.org/Archives/Public/www-tag/2009Jun/0056.html">review</a> of the effort to <a href="http://www.ietf.org/dyn/wg/charter/httpbis-charter.html">revise</a> the HTTP specification (<a href="http://www.w3.org/Protocols/rfc2616/rfc2616.html">RFC 2616</a>). It would be a major effort to summarize the entire conversation here. However, an interesting sub-discussion circled around the idea of normalizing the language in the Architecture of the World Wide Web and RFC 2616 with respect to <em>Resources</em>.</p>
<p>Well into the multi-month thread, Tim Berners-Lee offered up a very helpful historical <a href="http://lists.w3.org/Archives/Public/www-tag/2009Aug/0000.html">recap</a> of the &#8220;what is a resource&#8221; issue, in which he said:</p>
<blockquote about="#quote1" typeof="bibo:Quote">
<p id="quote1" property="bibo:content">I would like to see what the documents [AWWW and RFC 2616] all look like if edited to use the words Document and Thing, and eliminate Resource.</p>
<p><cite><a rel="dct:source" href="http://www.w3.org/DesignIssues/TermResource.html">A Short History of &#8220;Resource&#8221;</a></cite></p>
</blockquote>
<p>Which, somewhat predictably, started a discussion of what a <em>Document</em> is. However, this conversation seemed more tangible and earthy, and culminated in <a href="http://larry.masinter.net/">Larry Masinter</a> <a href="http://lists.w3.org/Archives/Public/www-tag/2009Aug/0010.html">recommending</a> David M. Levy&#8217;s book <a href="http://openlibrary.org/b/OL3947422M/Scrolling_forward">Scrolling Forward</a>:</p>
<blockquote about="#quote2" typeof="bibo:Quote">
<p id="quote2" property="bibo:content">&#8230; since much of the thought behind it informs a lot of my own thinking about the nature of &#8220;Document&#8221;, &#8220;representation&#8221;,  &#8220;Resource&#8221; and the like.</p>
<p><cite><a rel="dct:source" href="http://lists.w3.org/Archives/Public/www-tag/2009Aug/0010.html">www-tag email message</a></cite>
</p></blockquote>
<p>Now Larry is a scientist at <a href="http://adobe.com">Adobe</a>, a company that knows a thing or two about electronic documents. He also works closely with the W3C and IETF on web architectural issues. So when he suggested reading a book to learn what he means by <em>Document</em> my ears perked up. The interjection of a book reference into this rapid-fire email exchange was like a magic spell that made me pause and consider that a working definition of <em>Document</em> was nuanced enough to be the subject matter of an entire book.</p>
<p><a href="http://openlibrary.org/b/OL3947422M"><br />
<img src="http://covers.openlibrary.org/b/olid/OL3947422M-M.jpg" rel="foaf:depicts" resource="http://openlibrary.org/b/OL3947422M" style="float: left; margin-right: 15px; border: none;" /></a></p>
<p>I&#8217;ve come to expect references to Michael Buckland&#8217;s classic <a href="http://people.ischool.berkeley.edu/~buckland/whatdoc.html">What is a Document?</a> in discussions of documents. I hadn&#8217;t run across David Levy&#8217;s name before, so Larry&#8217;s recommendation was enough for me to request it from the stacks and give it a read. I wasn&#8217;t disappointed. Scrolling Forward is an ode to documents of all shapes and sizes, from all time periods. It&#8217;s a joyful, mind-expanding work that explores the entire landscape of our documents: cash register receipts, the multi-editioned Leaves of Grass, email messages, letters, books, photographs, papyrus scrolls, greeting cards and web pages. Since this takes place in 212 pages, it is not surprising that the analysis <span id="more-1172"></span>is a synthesis rather than an exhaustive treatment. Having received a doctorate in computer science from Stanford, obtained a diploma in calligraphy and bookbinding from the Roehampton Institute, and then worked at Xerox PARC studying the nature of documents for 15 years, Levy has had a professional career marked by a bringing together of scientific and humanistic disciplines.</p>
<p>One of the key messages of the book is a working definition of the Document. Levy draws out his definition largely in contrast to a statement made by David Weinberger in his 1996 Wired piece <a href="http://www.wired.com/wired/archive/4.08/document.html">What&#8217;s a Document?</a> where he says:</p>
<blockquote about="#quote3" typeof="bibo:Quote">
<p id="quote3" property="bibo:content">The fact that we can&#8217;t even say what a document is anymore indicates the profundity of the change we are undergoing in how we interact with information and, ultimately, our world.</p>
<p><cite><a rel="dct:source" href="http://www.wired.com/wired/archive/4.08/document.html">What&#8217;s a Document?</a></cite>
</p></blockquote>
<p>To which Levy responds:</p>
<blockquote about="#quote4" typeof="bibo:Quote">
<p id="quote4" property="bibo:content">We <strong>can</strong> say what a document is. Doing this, however, requires a somewhat different approach from that which dictionaries take. It requires going beyond word usage. It does require looking at the relevant technologies, but in such a way that we aren&#8217;t fixated on them, that we don&#8217;t fetishize them. Most of all, it requires immersing ourselves in the social roles these technologies play.</p>
<p><cite><a href="http://openlibrary.org/b/OL3947422M" rel="dct:source">Scrolling Forward</a> p. <span property="bibo:pages">23</span></cite>
</p></blockquote>
<p>So Scrolling Forward is a survey of sorts: a survey of document types that are inextricably linked to the social contexts in which they were created. This approach of <em>describing</em> documents rather than positing a theory of them dovetailed nicely with some reading of Wittgenstein I&#8217;ve been doing recently. In his later period Wittgenstein eschewed positing philosophical theories, and instead attempted to resolve philosophical problems by exploring the richness of language and its use in social settings, or <em>language games</em>, to lay bare the problem in a therapeutic way. Levy takes a similar approach in simply laying out the complex, sometimes contradictory history of documents before us, instead of carving out a logical argument and selecting facts to support it.</p>
<p>Some parts of the book that were of particular interest to me (as a software developer working in the area of digital preservation) were the sections discussing document fixity:</p>
<blockquote about="#quote5" typeof="bibo:Quote">
<p property="bibo:content">&#8230; paper documents, and indeed all documents are static <em>and</em> changing, fixed <em>and</em> fluid. There is a reason why text and graphics editors have a Save button, after all.</p>
<p><cite><a href="http://openlibrary.org/b/OL3947422M" rel="dct:source">Scrolling Forward</a> p. <span property="bibo:pages">36</span></cite>
</p></blockquote>
<p>Also of interest was Levy&#8217;s analysis of why the idea of &#8220;digital libraries&#8221; is such a lightning rod of opinion (which perhaps applies to its sister concept &#8220;repositories&#8221;).</p>
<blockquote about="#quote6" typeof="bibo:Quote">
<p id="quote6" property="bibo:content">[The] ambiguity between institution and collection is carried through in the phrase &#8220;digital library&#8221;. For some groups, most notably librarians, the phrase refers most directly to institutions that oversee digital collections, while for other professionals, primarily computer and information scientists, it refers to digital collections, without regard to the institutional settings (if any) in which they might be managed &#8230; Digital library, it seems to me, draws much of its power from this ambiguity: it provides a name for collections of digital materials that invokes the aura of the modern library and its social mission (library as social institution). But it does so without actually making any commitments to the public good (library as collection).</p>
<p><cite><a href="http://openlibrary.org/b/OL3947422M" rel="dct:source">Scrolling Forward</a> p. <span property="bibo:pages">135</span></cite>
</p></blockquote>
<p>And finally, Levy doesn&#8217;t shy away from the big questions of how our psychological and religious impulses influence our notions of what documents are. </p>
<blockquote about="#quote7" typeof="bibo:Quote">
<p id="quote7" property="bibo:content">The human search for and construction of order [...] is our response to the profound mystery, and accompanying anxiety, of existence. Emerging into an unfathomable universe and fearing we are nothing within it, we strive to create a meaningful and ultimately immortal place for ourselves [...] Culture creates the conditions for a meaningful existence, for us to play out our games of physical and symbolic survival. But it is an ongoing performance, a play we can never stop performing, lest we see the back-stage gears and levers and be reminded of the mysterious and terrifying backdrop against which we are performing it. [Documents] are death-transcending, lack-filling artifacts of major proportions. Perhaps they can&#8217;t literally prevent our physical demise or fill our deepest sense of lack. But they are the central participants in our attempts to do so. Every one of them &#8212; each cash register receipt, each greeting card, each Post-it note &#8212; makes a contribution to the collaborative edifice we call human culture. Although few carry the weight of the Bible or the Constitution, all of them inform us of &#8220;what is and what we should do&#8221;. And in concert they help us create and sustain an orderly, and meaningful human lifeworld.</p>
<p><cite><a href="http://openlibrary.org/b/OL3947422M" rel="dct:source">Scrolling Forward</a> pp. <span property="bibo:pages">187-188</span></cite>
</p></blockquote>
<p>Heady stuff to be sure. And now I feel like I&#8217;ve traveled far from the beginning of this blog post, and the definition of information resources and the semantic web. Scrolling Forward has given me a very personal perspective on what documents are, and have been&#8211;and as a result I&#8217;m a bit more hopeful about the future of electronic documents. Working in digital preservation, it&#8217;s sometimes pretty easy to give in to despair. I&#8217;m not sure what the application of this perspective is towards the normalization of language in the Architecture of the World Wide Web and RFC 2616. But it seems certain that part of the answer lies in not taking our information technologies too seriously, and trying to stay focused on the roles that they play in our individual and collective lives:</p>
<blockquote about="#quote8" typeof="bibo:Quote">
<p id="quote8" property="bibo:content">We make a mistake, I believe, when we fixate on particular forms and technologies, taking them, in and of themselves, to be the carriers of what we want either to embrace or resist. Not only do we fail to see the forms and technologies in their full complexity, but we use them, in their symbolic simplicity, as blunt instruments with which to beat one another over the head.</p>
<p>        <cite><a rel="dct:source" href="http://openlibrary.org/b/OL3947422M">Scrolling Forward</a> p. <span property="bibo:pages">198</span></cite></p>
</blockquote>
</div>
<p>PS. The bibliography is a great source of new material to read too.<br />
PPS. This blog post was also a not-so-secret experiment in using <a href="http://www.w3.org/TR/xhtml-rdfa-primer/">RDFa</a> and the <a href="http://bibliontology.com/">Bibliographic Ontology</a> to mark up quotations. Check out the <a href="http://www.w3.org/2007/08/pyRdfa/extract?format=turtle&#038;uri=http://inkdroid.org/journal/2009/09/10/documents/">RDF assertions</a> you can extract from it using the <a href="http://www.w3.org/2007/08/pyRdfa/">RDFa Distiller</a>.</p>
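<p>For the curious, the pattern used for each quotation above looks roughly like the sketch below. The #quote9 identifier, the placeholder text and the NN page number are made up for illustration, and the dct and bibo prefixes are assumed to be declared on an enclosing element, as they are on the wrapper &lt;div&gt; of this post:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">&lt;!-- illustrative only: #quote9, the quoted text and NN are placeholders --&gt;
&lt;blockquote about=&quot;#quote9&quot; typeof=&quot;bibo:Quote&quot;&gt;
  &lt;p id=&quot;quote9&quot; property=&quot;bibo:content&quot;&gt;The quoted text goes here.&lt;/p&gt;
  &lt;p&gt;&lt;cite&gt;&lt;a rel=&quot;dct:source&quot; href=&quot;http://openlibrary.org/b/OL3947422M&quot;&gt;Scrolling Forward&lt;/a&gt;
    p. &lt;span property=&quot;bibo:pages&quot;&gt;NN&lt;/span&gt;&lt;/cite&gt;&lt;/p&gt;
&lt;/blockquote&gt;</pre></div></div>

<p>The about attribute names the quotation, typeof says what kind of thing it is, and the property and rel attributes hang the text, the page number and the source off of it.</p>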
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2009/09/10/documents/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>open to view</title>
		<link>http://inkdroid.org/journal/2009/08/13/open-to-view/</link>
		<comments>http://inkdroid.org/journal/2009/08/13/open-to-view/#comments</comments>
		<pubDate>Thu, 13 Aug 2009 15:54:41 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[libraries]]></category>
		<category><![CDATA[philosophy]]></category>
		<category><![CDATA[semweb]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[atom]]></category>
		<category><![CDATA[books]]></category>
		<category><![CDATA[hathitrust]]></category>
		<category><![CDATA[linkeddata]]></category>
		<category><![CDATA[opensearch]]></category>
		<category><![CDATA[rest]]></category>
		<category><![CDATA[semanticweb]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=1103</guid>
		<description><![CDATA[I spent an hour checking out the HathiTrust API docs this morning; mainly to see what the similarities and differences are with the as-of-yet undocumented API for Chronicling America. There are quite a few similarities in the general RESTful approach, and the use of Atom, METS and PREMIS in the metadata that is made available. [...]]]></description>
			<content:encoded><![CDATA[<p>I spent an hour checking out the <a href="http://www.hathitrust.org/data_api">HathiTrust API docs</a> this morning; mainly to see what the similarities and differences are with the as-of-yet undocumented API for <a href="http://chroniclingamerica.loc.gov">Chronicling America</a>. There are quite a few similarities in the general RESTful approach, and the use of Atom, METS and PREMIS in the metadata that is made available. </p>
<p>Everyone&#8217;s a critic, right? Nevertheless, I&#8217;m just going to jot down a few thoughts about the API, mainly for my friend <a href="http://billdueber.com/">Bill Dueber</a> over in <a href="irc://chat.freenode.net/code4lib">#code4lib</a>, who works on the project. Let me just say at the outset that I think it&#8217;s awesome that HathiTrust are providing this API, especially given some of the licensing constraints around some of the content. The API is a good example of putting library data on the web using both general and special purpose standards. But there are a few minor things that could be tweaked, I think, to make the API fit into the web and the repository space a bit better.</p>
<p>It would be nice if the <a href="http://opensearch.org">OpenSearch</a> description document referenced in the <a href="http://catalog.hathitrust.org">HTML</a> at </p>
<blockquote><p>
<a href="http://catalog.hathitrust.org/Search/OpenSearch?method=describe">http://catalog.hathitrust.org/Search/OpenSearch?method=describe</a>
</p></blockquote>
<p>worked. It should be pretty easy and non-invasive to add a basic description file for the HTML response since the search is already GET-driven. Ideally it would be nice to see the responses also available as Atom and/or JSON with <a href="http://tools.ietf.org/html/rfc5005">Atom Feed Paging</a>.</p>
<p>Another thing that would be nice to see is the API being merged more into the human-usable webapp. The best way to explain this is with an example. Consider the HTML page for this 1914 edition of Walt Whitman&#8217;s <a href="http://catalog.hathitrust.org/Record/000206297">Leaves of Grass</a>, available with this clean URI:</p>
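<p>For what it&#8217;s worth, here is a minimal sketch of what such a description document might look like. The &lt;Url&gt; template below is an assumption on my part (I&#8217;m guessing at the catalog&#8217;s search path and query parameter), so treat it as a placeholder rather than the real HathiTrust URL pattern:</p>

<div class="wp_syntax"><div class="code"><pre class="xml" style="font-family:monospace;">&lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;
&lt;OpenSearchDescription xmlns=&quot;http://a9.com/-/spec/opensearch/1.1/&quot;&gt;
  &lt;ShortName&gt;HathiTrust&lt;/ShortName&gt;
  &lt;Description&gt;Search the HathiTrust catalog&lt;/Description&gt;
  &lt;!-- hypothetical template: the path and the lookfor parameter are assumptions --&gt;
  &lt;Url type=&quot;text/html&quot;
       template=&quot;http://catalog.hathitrust.org/Search/Home?lookfor={searchTerms}&quot; /&gt;
&lt;/OpenSearchDescription&gt;</pre></div></div>

<p>The {searchTerms} placeholder is where an OpenSearch client substitutes the user&#8217;s query, which is why a GET-driven search makes this such a small change.</p>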
<blockquote><p>
<a href="http://catalog.hathitrust.org/Record/000206297">http://catalog.hathitrust.org/Record/000206297</a>
</p></blockquote>
<p>Now, you can get a <a href="http://services.hathitrust.org/api/htd/meta/mdp.39015056032132">few</a> <a href="http://services.hathitrust.org/api/htd/structure/mdp.39015056032132">flavors</a> of metadata for this book, and an aggregated <a href="https://services.hathitrust.org/api/htd/aggregate/mdp.39015056032132">zip file</a> of all the page images and OCR if you are a HathiTrust member. Why not make these alternate representations discoverable right from the item display? It could be as simple as adding some &lt;link&gt; elements to the HTML, that use the link relations they&#8217;ve already established for their Atom:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">&lt;head&gt;
&lt;link rel=&quot;http://schemas.hathitrust.org/htd/2009#meta&quot; 
    type=&quot;application/atom+xml&quot; 
    href=&quot;http://services.hathitrust.org/api/htd/meta/mdp.39015056032132&quot; /&gt;
&lt;link rel=&quot;http://schemas.hathitrust.org/htd/2009#structure&quot; 
    type=&quot;application/atom+xml&quot; 
    href=&quot;http://services.hathitrust.org/api/htd/structure/mdp.39015056032132&quot; /&gt;
&lt;link rel=&quot;http://schemas.hathitrust.org/htd/2009#aggregate&quot; 
    type=&quot;application/zip&quot; 
    href=&quot;https://services.hathitrust.org/api/htd/aggregate/mdp.39015056032132&quot; /&gt;
&lt;/head&gt;</pre></div></div>

<p>If you wanted to get fancy you could also put human-readable links into the &lt;body&gt; and annotate them with <a href="http://www.w3.org/TR/xhtml-rdfa-primer/">RDFa</a>. But this would just be icing on the cake. There are a few reasons for doing at least the bare minimum. The big one is to enable in-browser applications (like <a href="http://zotero.org">Zotero</a>) to learn more about a given resource in a relatively straightforward and commonplace way. The other big one is to let automated agents like <a href="http://www.google.com/bot.html">GoogleBot</a>, <a href="http://help.yahoo.com/help/us/ysearch/slurp">YahooSlurp</a> and Internet Archive&#8217;s <a href="http://crawler.archive.org/">Heritrix</a> discover the <a href="http://en.wikipedia.org/wiki/Deep_Web">deep web</a> data that&#8217;s held behind your API. Another nice side effect is that it helps people who might ordinarily scrape your site discover the API in a straightforward way.</p>
<p>Lastly, I was curious to know whether HathiTrust has considered adjusting their Atom response to use the <a href="http://www.openarchives.org/ore/1.0/atom.html">Atom pattern</a> recommended by the OAI-ORE folks. They are pretty close already, and in fact seem to have modeled their own aggregation vocabulary on OAI-ORE. It would be interesting to hear why they diverged, if it was intentional, and whether it might be possible to use a bit of OAI-ORE in there so we can bootstrap an OAI-ORE harvesting ecosystem.</p>
<p>I&#8217;m <a href="http://iandavis.com/blog/2009/07/the-linked-data-brand">not sure</a> that I can still call this approach to integrating web2.0 APIs into web1.x applications <em>Linked Data</em>, since it doesn&#8217;t really involve RDF directly. It does involve thinking in a RESTful way about the resources you are publishing on the web, and how they can be linked together to form a graph. My colleague <a href="http://onebiglibrary.net">Dan</a> has been writing in Computers in Libraries recently about how perhaps thinking in terms of &#8220;building a better web&#8221; may be a more accurate way of describing this activity.</p>
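<p>For the fancier option of links in the &lt;body&gt;, a rough sketch might look like the following. It assumes an htd prefix bound to the same namespace as the link relations in the &lt;head&gt; example; the prefix name and the link text are made up, and the markup is only meant to show the shape of the idea:</p>

<div class="wp_syntax"><div class="code"><pre class="html" style="font-family:monospace;">&lt;!-- the htd prefix name and the link text are illustrative assumptions --&gt;
&lt;ul xmlns:htd=&quot;http://schemas.hathitrust.org/htd/2009#&quot;&gt;
  &lt;li&gt;&lt;a rel=&quot;htd:meta&quot; type=&quot;application/atom+xml&quot;
         href=&quot;http://services.hathitrust.org/api/htd/meta/mdp.39015056032132&quot;&gt;Metadata (Atom)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a rel=&quot;htd:structure&quot; type=&quot;application/atom+xml&quot;
         href=&quot;http://services.hathitrust.org/api/htd/structure/mdp.39015056032132&quot;&gt;Structure (Atom)&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a rel=&quot;htd:aggregate&quot; type=&quot;application/zip&quot;
         href=&quot;https://services.hathitrust.org/api/htd/aggregate/mdp.39015056032132&quot;&gt;Page images and OCR (zip)&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;</pre></div></div>

<p>An RDFa parser should read each of these as the page being related to the link target by the given relation, which is the same thing the &lt;link&gt; elements say, only now a person can click on them too.</p>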
<p>For reasons I don&#8217;t fully understand I&#8217;ve been reading a lot of Wittgenstein (well mainly books about Wittgenstein honestly) lately during the non-bike commute. The trajectory of his thought over his life is really interesting to me. He had this zen-like, controversial idea that </p>
<blockquote><p>
Philosophy simply puts everything before us, and neither explains nor deduces anything. &#8212; Since everything lies open to view there is nothing to explain. For what is hidden, for example, is of no interest to us. <a href="http://books.google.com/books?id=ici7FXQZsFIC&#038;lpg=PP1&#038;dq=philosophical%20investigations&#038;pg=PA43-IA1#v=onepage&#038;q=126&#038;f=false">(PI 126)</a>
</p></blockquote>
<p>I really like this idea that our data APIs on the web could be &#8220;open to view&#8221; by checking out the HTML, following your nose, and writing scrapers, bots and browser plugins to use what you find. I think it&#8217;s unfortunate that the recent changes to the <a href="http://www.w3.org/DesignIssues/LinkedData.html">Linked Data Design Issues</a>, and the ensuing <a href="http://cloudofdata.com/2009/07/does-linked-data-need-rdf/">discussion</a> seemed to create this dividing line about the use of RDF and SPARQL. I had always hoped (and continue to hope) that the Linked Data effort is bigger than a particular brand, or reformulation of the semantic web effort &#8230; for me it&#8217;s a pattern for building a better web. I think RDF is very well suited to expressing the core nature of the web, the <a href="http://dig.csail.mit.edu/breadcrumbs/node/215">Giant Global Graph</a>. I&#8217;ve served up RDF representations in applications I&#8217;ve worked on just for this reason. But I think the Linked Data pattern will thrive most if it is thought of as an inclusive continuum of efforts, similar to what <a href="http://webofdata.wordpress.com/2009/07/20/what-else/#comment-132">Dan Brickley</a> has suggested. Us technology people strive for explicitness, it&#8217;s an occupational hazard &#8212; but there&#8217;s sometimes quite a bit of strength in ambiguity.</p>
<p>Anyhow, my little review of the HathiTrust API turned into a bit of a soapbox for me to stand on and shout like a lunatic. I guess I&#8217;ve been wanting to write about what I think Linked Data is for a few weeks now, and it just kinda bubbled up when I least expected it. Sorry Bill!</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2009/08/13/open-to-view/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
	</channel>
</rss>
