<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule">

<channel>
	<title>inkdroid</title>
	<atom:link href="http://inkdroid.org/journal/feed/" rel="self" type="application/rss+xml" />
	<link>http://inkdroid.org/journal</link>
	<description>paper or plastic?</description>
	<lastBuildDate>Sat, 18 May 2013 16:49:23 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
<creativeCommons:license>http://creativecommons.org/licenses/by/3.0/</creativeCommons:license>		<item>
		<title>maps on the web with a bit of midlife crisis</title>
		<link>http://inkdroid.org/journal/2013/05/10/maps-on-the-web-with-a-bit-of-midlife-crisis/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=maps-on-the-web-with-a-bit-of-midlife-crisis</link>
		<comments>http://inkdroid.org/journal/2013/05/10/maps-on-the-web-with-a-bit-of-midlife-crisis/#comments</comments>
		<pubDate>Fri, 10 May 2013 19:08:25 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[programming]]></category>
		<category><![CDATA[wikipedia]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[maps]]></category>
		<category><![CDATA[nodejs]]></category>
		<category><![CDATA[openstreetmap]]></category>
		<category><![CDATA[rest]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5679</guid>
		<description><![CDATA[TL;DR &#8212; I created a JavaScript library for getting GeoJSON out of Wikipedia&#8217;s API in your browser (and Node.js). I also created a little app that uses it to display Wikipedia articles for things near you that need a photograph/image or editorial help. I probably don&#8217;t need to tell you how much the state of [...]]]></description>
				<content:encoded><![CDATA[<p><em>TL;DR &#8212; I created <a href="http://edsu.github.io/wikigeo/">a JavaScript library</a> for getting GeoJSON out of Wikipedia&#8217;s API in your browser (and Node.js). I also created <a href="http://inkdroid.org/ici/">a little app</a> that uses it to display Wikipedia articles for things near you that need a photograph/image or editorial help.</em></p>
<hr />
<p>I probably don&#8217;t need to tell you how much the state of mapping on the Web has changed in the past few years. <a href="http://www.youtube.com/watch?v=6xG4oFny2Pk">I was there</a>. I can remember trying to get <a href="http://mapserver.org/">MapServer</a> set up in the late 1990s, with limited success. I was there squinting at how <a href="http://en.wikipedia.org/wiki/Adrian_Holovaty">Adrian Holovaty</a> reverse engineered a mapping API out of Google Maps at <a href="http://web.archive.org/web/20060408105215/http://www.chicagocrime.org/map/">chicagocrime.org</a>. I was there when Google released their official API, which I used some, and then they changed their terms of service. I was there in the late 2000s using OpenLayers and TileCache, which were so much more approachable than MapServer was a decade earlier. I&#8217;m most definitely not a mapping expert, or even an amateur&#8211;but you can&#8217;t be a Web developer without occasionally needing to dabble, and pretend you are.</p>
<p>I didn&#8217;t realize until very recently how easy the cool kids have made it to put maps on the Web. Who knew that in 2013 there would be an open source JavaScript library that lets you add a map to your page in a few lines, and that it&#8217;s in use by Flickr, FourSquare, CraigsList, Wikimedia, the Wall Street Journal, and others? Even more astounding: who knew there would be an openly licensed source of map tiles and data, that was created collaboratively by a project with over a million registered users, and that it would be good enough to be used by Apple? I certainly didn&#8217;t even dream about it.</p>
<p>Ok, hold that thought&#8230;</p>
<p>So, Wikipedia <a href="https://blog.wikimedia.org/2013/03/28/add-an-image-to-this-article-uploads-now-live-on-mobile-wikipedia/">recently announced</a> that they were making it easy to use your mobile device to add a photograph to a Wikipedia article that lacked an image. </p>
<div style="width: 60%; margin-left: auto; margin-right: auto; margin-top: 10px; margin-bottom: 10px; border: thin solid #eeeeee;">
<a href="http://www.flickr.com/photos/inkdroid/8726826906/"><img src="http://farm8.staticflickr.com/7318/8726826906_2e88f9ab6b_b.jpg" width="400"/></a>
</div>
<p>When I read about this I thought it would be interesting to see which Wikipedia articles are about places near my current location, and which of them lack images, so I could go and take pictures. Before I knew it I had <a href="http://inkdroid.org/ici/">a Web app</a> called ici (French for &#8220;here&#8221;) that does just that:</p>
<p><a href="http://inkdroid.org/ici/#lat=38.89591781652618&#038;lon=-77.0342230796814&#038;zoom=16"><img src="http://inkdroid.org/images/ici.png"/></a></p>
<p>Articles that need images are marked with little red cameras. It was pretty easy to add orange markers for Wikipedia articles that had been flagged as needing edits, or citations. Calling it an app is an overstatement: it is just static HTML, JavaScript and CSS that I serve up. HTML&#8217;s <a href="http://diveintohtml5.info/geolocation.html">geolocation</a> features and <a href="http://en.wikipedia.org/api.php">Wikipedia&#8217;s API</a> (which has <a href="http://www.mediawiki.org/wiki/Extension:GeoData">GeoData</a> enabled) take care of the rest. </p>
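<p>As a rough sketch of how those two pieces could fit together: the browser reports a position, and that position becomes a GeoData <code>geosearch</code> query against the MediaWiki API. The helper name <code>geosearchUrl</code>, its defaults and the <code>origin=*</code> parameter are assumptions made for illustration, not code taken from ici.</p>

```javascript
// Build a MediaWiki GeoData "geosearch" request URL for a point.
// The helper name and its defaults are illustrative, not part of ici.
function geosearchUrl(lat, lon, options) {
  var opts = options || {};
  var params = new URLSearchParams({
    action: "query",
    list: "geosearch",
    gscoord: lat + "|" + lon,              // GeoData expects "latitude|longitude"
    gsradius: String(opts.radius || 1000), // search radius in metres
    gslimit: String(opts.limit || 10),     // maximum number of articles
    format: "json",
    origin: "*"                            // anonymous cross-origin requests
  });
  return "https://en.wikipedia.org/w/api.php?" + params.toString();
}

// In a browser, feed the device position into the helper.
var geo = typeof navigator !== "undefined" ? navigator.geolocation : undefined;
if (geo) {
  geo.getCurrentPosition(function (position) {
    console.log(geosearchUrl(position.coords.latitude, position.coords.longitude));
  });
}
```

<p>Outside a browser the geolocation branch is skipped, so the helper can also be called directly with a known latitude and longitude.</p>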
<p>After I created the app I got a tweet from a <em>real</em> geo-hacker, Sean Gillies, who asked:</p>
<blockquote class="twitter-tweet" width="550"><p>@<a href="https://twitter.com/edsu">edsu</a> I&#8217;d love to help Wikipedia get some GeoJSON in their API results. Then you could use <a href="http://t.co/u7e0ayCMIy" title="http://leafletjs.com/examples/geojson.html">leafletjs.com/examples/geojs…</a>.</p>
<p>&mdash; Sean Gillies (@sgillies) <a href="https://twitter.com/sgillies/status/332185543234441216">May 8, 2013</a></p></blockquote>
<p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>Sean is right: it would be really useful to have GeoJSON output from Wikipedia&#8217;s API. But I was on a bit of a tear. Rather than figuring out how to get GeoJSON into MediaWiki and deployed to all the Wikipedia servers, I wondered if I could extract ici&#8217;s use of the Wikipedia API into a slightly more generalized JavaScript library that would make it easy to get GeoJSON out of Wikipedia, at least from JavaScript. That quickly resulted in <a href="http://edsu.github.io/wikigeo/">wikigeo.js</a>, which ici now uses. Getting GeoJSON from Wikipedia with wikigeo.js takes just one line, and <a href="http://leafletjs.com/examples/geojson.html">adding the GeoJSON to a map in Leaflet</a> takes one more:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="javascript" style="font-family:monospace;">geojson<span style="color: #009900;">&#40;</span><span style="color: #009900;">&#91;</span><span style="color: #339933;">-</span><span style="color: #CC0000;">73.94</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">40.67</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> <span style="color: #000066; font-weight: bold;">function</span><span style="color: #009900;">&#40;</span>data<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #006600; font-style: italic;">// add the geojson to a Leaflet map</span>
    L.<span style="color: #660066;">geoJson</span><span style="color: #009900;">&#40;</span>data<span style="color: #009900;">&#41;</span>.<span style="color: #660066;">addTo</span><span style="color: #009900;">&#40;</span>map<span style="color: #009900;">&#41;</span>
<span style="color: #009900;">&#125;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>This call passes the callback GeoJSON data that looks something like this:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
  <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;FeatureCollection&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;features&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/New_York_City&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;New York City&quot;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">73.94</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">40.67</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/Kingston_Avenue_(IRT_Eastern_Parkway_Line)&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Kingston Avenue (IRT Eastern Parkway Line)&quot;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">73.9422</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">40.6694</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/Crown_Heights_–_Utica_Avenue_(IRT_Eastern_Parkway_Line)&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Crown Heights – Utica Avenue (IRT Eastern Parkway Line)&quot;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">73.9312</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">40.6688</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/Brooklyn_Children's_Museum&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Brooklyn Children's Museum&quot;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">73.9439</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">40.6745</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/770_Eastern_Parkway&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;770 Eastern Parkway&quot;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">73.9429</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">40.669</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/Eastern_Parkway_(Brooklyn)&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Eastern Parkway (Brooklyn)&quot;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">73.9371</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">40.6691</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/Paul_Robeson_High_School_for_Business_and_Technology&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Paul Robeson High School for Business and Technology&quot;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">73.939</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">40.6755</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/Pathways_in_Technology_Early_College_High_School&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Pathways in Technology Early College High School&quot;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">73.939</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">40.6759</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span>
  <span style="color: #009900;">&#93;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>
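<p>To make the shape of that output concrete, here is a minimal sketch of how a raw <code>geosearch</code> response could be turned into a FeatureCollection like the one above. It is an illustration under stated assumptions, not wikigeo.js&#8217;s actual implementation: the function name <code>toFeatureCollection</code> is made up, and the input is assumed to follow the GeoData response layout with <code>title</code>, <code>lat</code> and <code>lon</code> on each hit.</p>

```javascript
// Convert a GeoData geosearch response into a GeoJSON FeatureCollection.
// Hypothetical helper for illustration; not the wikigeo.js source.
function toFeatureCollection(response) {
  var hits = (response.query || {}).geosearch || [];
  return {
    type: "FeatureCollection",
    features: hits.map(function (hit) {
      return {
        // Article URL: the title with spaces replaced by underscores.
        id: "http://en.wikipedia.org/wiki/" + hit.title.replace(/ /g, "_"),
        type: "Feature",
        properties: { name: hit.title },
        // GeoJSON positions are [longitude, latitude], in that order.
        geometry: { type: "Point", coordinates: [hit.lon, hit.lat] }
      };
    })
  };
}

// Sample input shaped like an API response, using a place from the output above.
var sample = {
  query: {
    geosearch: [{ title: "770 Eastern Parkway", lat: 40.669, lon: -73.9429 }]
  }
};
console.log(JSON.stringify(toFeatureCollection(sample), null, 2));
```

<p>The coordinate order is the detail most worth noticing: GeoJSON puts longitude first, while Leaflet&#8217;s own <code>L.latLng</code> takes latitude first.</p>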

<p>There are options for broadening the radius, increasing the number of results, and fetching additional properties of each Wikipedia article, such as summaries, images, categories, and the templates used. Here&#8217;s an example using all the knobs:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="javascript" style="font-family:monospace;">geojson<span style="color: #009900;">&#40;</span>
  <span style="color: #009900;">&#91;</span><span style="color: #339933;">-</span><span style="color: #CC0000;">73.94</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">40.67</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
  <span style="color: #009900;">&#123;</span>
    limit<span style="color: #339933;">:</span> <span style="color: #CC0000;">5</span><span style="color: #339933;">,</span>
    radius<span style="color: #339933;">:</span> <span style="color: #CC0000;">1000</span><span style="color: #339933;">,</span>
    images<span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">true</span><span style="color: #339933;">,</span>
    categories<span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">true</span><span style="color: #339933;">,</span>
    summaries<span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">true</span><span style="color: #339933;">,</span>
    templates<span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">true</span>
  <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
  <span style="color: #000066; font-weight: bold;">function</span><span style="color: #009900;">&#40;</span>data<span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    L.<span style="color: #660066;">geoJson</span><span style="color: #009900;">&#40;</span>data<span style="color: #009900;">&#41;</span>.<span style="color: #660066;">addTo</span><span style="color: #009900;">&#40;</span>map<span style="color: #009900;">&#41;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>This results in GeoJSON like the following (abbreviated):</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
  <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;FeatureCollection&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;features&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
    <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://en.wikipedia.org/wiki/Silver_Spring,_Maryland&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Feature&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;properties&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Silver Spring, Maryland&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;image&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Downtown_silver_spring_wayne.jpg&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;templates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #3366CC;">&quot;-&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Abbr&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Ambox&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Ambox/category&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Ambox/small&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Basepage subpage&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Both&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Category handler&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Category handler/blacklist&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Category handler/numbered&quot;</span>
        <span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;summary&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Silver Spring is an unincorporated area and census-designated place (CDP) in Montgomery County, Maryland, United States. It had a population of 71,452 at the 2010 census, making it the fourth most populous place in Maryland, after Baltimore, Columbia, and Germantown.<span style="color: #000099; font-weight: bold;">\n</span>The urbanized, oldest, and southernmost part of Silver Spring is a major business hub that lies at the north apex of Washington, D.C. As of 2004, the Central Business District (CBD) held 7,254,729 square feet (673,986 m2) of office space, 5216 dwelling units and 17.6 acres (71,000 m2) of parkland. The population density of this CBD area of Silver Spring was 15,600 per square mile all within 360 acres (1.5 km2) and approximately 2.5 square miles (6 km2) in the CBD/downtown area. The community has recently undergone a significant renaissance, with the addition of major retail, residential, and office developments.<span style="color: #000099; font-weight: bold;">\n</span>Silver Spring takes its name from a mica-flecked spring discovered there in 1840 by Francis Preston Blair, who subsequently bought much of the surrounding land. Acorn Park, tucked away in an area of south Silver Spring away from the main downtown area, is believed to be the site of the original spring.<span style="color: #000099; font-weight: bold;">\n</span><span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;categories&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #3366CC;">&quot;All articles to be expanded&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;All articles with dead external links&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;All articles with unsourced statements&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Articles to be expanded from June 2008&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Articles with dead external links from July 2009&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Articles with dead external links from October 2010&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Articles with dead external links from September 2010&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Articles with unsourced statements from February 2007&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Articles with unsourced statements from May 2009&quot;</span><span style="color: #339933;">,</span>
          <span style="color: #3366CC;">&quot;Commons category template with no category set&quot;</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;geometry&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;type&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Point&quot;</span><span style="color: #339933;">,</span>
        <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #339933;">-</span><span style="color: #CC0000;">77.019</span><span style="color: #339933;">,</span>
          <span style="color: #CC0000;">39.0042</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
    ...
  <span style="color: #009900;">&#93;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>I guess this is a long way of saying, if you want to put Wikipedia articles on a map, or otherwise need GeoJSON for Wikipedia articles for a particular location, take a look at <a href="http://edsu.github.io/wikigeo/">wikigeo.js</a>. If you do, and have ideas for making it better, please let me know. Oh, by the way you can <code>npm install <a href="https://npmjs.org/package/wikigeo">wikigeo</a></code> and use it from <a href="http://nodejs.org">Node.js</a>.</p>
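<p>Once you have the GeoJSON, picking out articles that need help is mostly a matter of looking at the category names. Here is a minimal Python sketch of that idea. It is not part of wikigeo: the sample data is made up, and the <code>name</code> property and the nesting of <code>categories</code> under <code>properties</code> are assumptions based on the output shown above.</p>

```python
import json

# A tiny FeatureCollection shaped like the wikigeo output shown above.
# The "name" key and the nesting of "categories" under "properties"
# are assumptions made for this illustration.
SAMPLE = json.loads("""
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {
        "name": "Example article",
        "categories": ["All articles to be expanded"]
      },
      "geometry": {"type": "Point", "coordinates": [-77.019, 39.0042]}
    },
    {
      "type": "Feature",
      "properties": {"name": "Another article", "categories": []},
      "geometry": {"type": "Point", "coordinates": [-77.02, 39.01]}
    }
  ]
}
""")


def needing_help(collection, marker="to be expanded"):
    """Return (name, lon, lat) for features with a category containing marker."""
    found = []
    for feature in collection.get("features", []):
        props = feature.get("properties", {})
        categories = props.get("categories", [])
        if any(marker in category for category in categories):
            # GeoJSON points store longitude first, then latitude.
            lon, lat = feature["geometry"]["coordinates"]
            found.append((props.get("name"), lon, lat))
    return found


print(needing_help(SAMPLE))
```

<p>Changing <code>marker</code> to something like "dead external links" would select a different kind of cleanup work.</p>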
<p>I guess JavaScript, HTML5, NodeJS, CoffeeScript are like my midlife crisis&#8230;my red sports car. But maybe being the old guy, and losing my edge isn&#8217;t really so bad? </p>
<blockquote><p>
I&#8217;m losing my edge<br />
to better-looking people<br />
with better ideas<br />
and more talent<br />
and they&#8217;re actually<br />
really, really nice.<br />
&#8212; <a href="http://www.youtube.com/watch?v=6xG4oFny2Pk">Jim Murphy</a>
</p></blockquote>
<p>It definitely helps when the kids coming up from behind have talent and are really, really nice. You know?</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/05/10/maps-on-the-web-with-a-bit-of-midlife-crisis/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Everything is Data</title>
		<link>http://inkdroid.org/journal/2013/05/02/everything-is-data/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=everything-is-data</link>
		<comments>http://inkdroid.org/journal/2013/05/02/everything-is-data/#comments</comments>
		<pubDate>Thu, 02 May 2013 14:19:18 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[book review]]></category>
		<category><![CDATA[philosophy]]></category>
		<category><![CDATA[latour]]></category>
		<category><![CDATA[science]]></category>
		<category><![CDATA[social science]]></category>
		<category><![CDATA[sociology]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5590</guid>
		<description><![CDATA[Reassembling the Social: An Introduction to Actor-Network-Theory by Bruno Latour My rating: 4 of 5 stars I picked this up because folks over on the Philosophy in a Time of Software kicked things off by discussing this book by Latour. So, I&#8217;m really not terribly knowledgeable about sociology, but I did a fair bit of [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.goodreads.com/book/show/134567.Reassembling_the_Social" style="float: left; padding-right: 20px"><img alt="Reassembling the Social: An Introduction to Actor-Network-Theory" border="0" src="http://d.gr-assets.com/books/1172047445m/134567.jpg" /></a><a href="http://www.goodreads.com/book/show/134567.Reassembling_the_Social">Reassembling the Social: An Introduction to Actor-Network-Theory</a> by <a href="http://www.goodreads.com/author/show/77743.Bruno_Latour">Bruno Latour</a><br />
My rating: <a href="http://www.goodreads.com/review/show/516628434">4 of 5 stars</a></p>
<p>I picked this up because folks over on the <a href="https://groups.google.com/forum/#!searchin/philosophy-in-a-time-of-software/reassembling/philosophy-in-a-time-of-software/T5PrHVZ0upI/tgiEC4dEgLoJ" rel="nofollow">Philosophy in a Time of Software</a> kicked things off by discussing this book by Latour. So, I&#8217;m really not terribly knowledgeable about sociology, but I did a fair bit of reading in the social sciences while <strike>getting my library union card</strike> studying library/information science. So I wasn&#8217;t completely underwater, but I definitely felt like I was swimming in the deep end. I didn&#8217;t get the connection to computer programming until quite late in the book, but it was definitely a bit of a lightbulb moment when I did. Latour&#8217;s style (at least that of the unmentioned translator) is refreshingly direct, personal, and unabashedly opinionated. He spends much of the book describing just how complicated social science is, and how far it has gone off the tracks&#8230;which is quite entertaining at times.</p>
<p>A few things I will take with me from this book and its portrayal of Actor Network Theory:</p>
<p>I will never be able to say or write the word &#8220;social&#8221; without feeling like I&#8217;m glossing over a whole lot of stuff, and that this stuff is what I should actually be researching, talking and writing about. Latour stresses that it&#8217;s important not to dumb things down by appealing to established social forces (class, gender, imperialism, etc) but instead to trace the actors, their controversies, and their relations. This work requires discipline because it&#8217;s tempting to reduce the complexity by using these familiar abstractions instead of expending energy/effort in documenting the scenarios as faithfully as possible. That means letting the actors have a voice, and say what they think they are doing, rather than having the researcher tell the actors what they are actually doing. I work in libraries/archives, so I particularly liked Latour&#8217;s insistence on the importance of notebooks, writing, and documentation:</p>
<blockquote><p>The best way to proceed at this point &#8230; is simply to keep track of all our moves, even those that deal with the very production of the account. This is neither for the sake of epistemic reflexivity nor for some narcissist indulgence into one’s own work, but because from now on <strong>everything is data</strong>: everything from the first telephone call to a prospective interviewee, the first appointment with the advisor, the first corrections made by a client on a grant proposal, the first launching of a search engine, the first list of boxes to tick in a questionnaire. In keeping with the logic of our interest in textual reports and accounting, it might be useful to list the different notebooks one should keep—manual or digital, it no longer matters much. p. 286.</p></blockquote>
<p>&#8230; and that this is the work of &#8220;slowciology&#8221;  &#8212; it requires you to slow down, and really describe/dig into things.</p>
<p>The other really interesting thing about this book for me was the insistence that social actors do not need to be human. It is fairly typical for social science research to focus on face-to-face interaction between people as the primary focus. Latour doesn&#8217;t dispute the importance of studying human actors, but emphasizes that it&#8217;s useful to increase the number of actors under study by studying objects (mediators) as actors. Typically we think of actors as having agency, free will, etc &#8230; but objects are typically complex things, with particular affordances, and extensive relations with other things in the field. You get only a very limited view of what is going on if you don&#8217;t trace these relations. </p>
<blockquote><p>Things, quasi-objects, and attachments are the real center of the social world, not the agent, person, member, or participant—nor is it society or its avatars. (p. 237)</p></blockquote>
<p>As a software developer, I really identified with Latour&#8217;s insistence on the role that objects play in our understanding of activities around us; how this view necessarily complicates things a great deal, and requires us to slow down to really understand/describe what is going on. It is hard work. And it&#8217;s only when we understand the various actors and their relations, the actual ones, not the abstract ones in the architecture diagram, or in the theory about the software, that we will be in a position to effectively change things or build anew. </p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/05/02/everything-is-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>#75</title>
		<link>http://inkdroid.org/journal/2013/04/18/75/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=75</link>
		<comments>http://inkdroid.org/journal/2013/04/18/75/#comments</comments>
		<pubDate>Thu, 18 Apr 2013 17:24:19 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[politics]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5579</guid>
		<description><![CDATA[When taxes are too high, people go hungry. When the government is too intrusive, people lose their spirit. Act for the people&#8217;s benefit. Trust them; leave them alone. Tao Te Ching #75]]></description>
				<content:encoded><![CDATA[<blockquote><p>
When taxes are too high,<br />
people go hungry.<br />
When the government is too intrusive,<br />
people lose their spirit.</p>
<p>Act for the people&#8217;s benefit.<br />
Trust them; leave them alone.</p>
<p><cite><a href="http://academic.brooklyn.cuny.edu/core9/phalsall/texts/taote-v3.html#75">Tao Te Ching #75</a><br />
</cite></p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/04/18/75/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>python heal thyself</title>
		<link>http://inkdroid.org/journal/2013/03/22/python-heal-thyself/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=python-heal-thyself</link>
		<comments>http://inkdroid.org/journal/2013/03/22/python-heal-thyself/#comments</comments>
		<pubDate>Fri, 22 Mar 2013 13:26:40 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[web]]></category>
		<category><![CDATA[d3]]></category>
		<category><![CDATA[gender]]></category>
		<category><![CDATA[pycon]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[tagcloud]]></category>
		<category><![CDATA[twitter]]></category>
		<category><![CDATA[wordclouds]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5552</guid>
		<description><![CDATA[.@adriarichards is currently getting doxed &#38; threatened w/ violence. Search Twitter for her name &#38; report abuse: bit.ly/Y82Ntx &#8212; Gina Trapani (@ginatrapani) March 21, 2013 After seeing Gina&#8217;s tweet, I was curious to see if there was any difference by gender in the tweets directed at @adriarichards over the recent controversy at PyCon. I wasn&#8217;t [...]]]></description>
				<content:encoded><![CDATA[<blockquote class="twitter-tweet" width="550"><p>.@<a href="https://twitter.com/adriarichards">adriarichards</a> is currently getting doxed &amp; threatened w/ violence. Search Twitter for her name &amp; report abuse: <a href="http://t.co/GO6Gc1jXoC" title="http://bit.ly/Y82Ntx">bit.ly/Y82Ntx</a></p>
<p>&mdash; Gina Trapani (@ginatrapani) <a href="https://twitter.com/ginatrapani/status/314552254592069632">March 21, 2013</a></p></blockquote>
<p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>After seeing Gina&#8217;s tweet, I was curious to see if there was any difference by gender in the tweets directed at <a href="http://twitter.com/adriarichards">@adriarichards</a> over the recent <a href="https://news.ycombinator.com/item?id=5391667">controversy</a> at PyCon. I wasn&#8217;t confident I would find anything. It was more a feeble attempt to try to make Python make sense of something senseless that happened at PyCon; or to paraphrase Physician, heal thyself&#8230;for Python to heal itself.</p>
<p>I used <a href="http://github.com/edsu/twarc">twarc</a> to collect 13,472 tweets that mentioned @adriarichards from the search API. I then added a <a href="https://github.com/edsu/twarc/blob/master/utils/gender.py">utility filter</a> that uses <a href="https://github.com/bmuller/genderator">genderator</a> to filter the line-oriented JSON based on a guess at the gender (Twitter doesn&#8217;t track it). genderator identified 2,433 (18%) tweets from women, 5,268 (39%) from men, and 5,771 (43%) that were of unknown gender. I then added another <a href="https://github.com/edsu/twarc/blob/master/utils/wordcloud.py">utility</a> that reads a stream of Tweets and generates a tag cloud as a standalone HTML file using <a href="https://github.com/jasondavies/d3-cloud">d3-cloud</a>.</p>
<p>I put them all together on the command line like this:</p>
<pre>
% twarc.py @adriarichards
% cat @adriarichards-20130321200320.json | utils/gender.py --gender male | utils/wordcloud.py > male.html
% cat @adriarichards-20130321200320.json | utils/gender.py --gender female | utils/wordcloud.py > female.html
</pre>
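<p>The filtering step in that pipeline boils down to reading one JSON document per line and keeping the ones that match. The sketch below shows the general shape in Python. It is not the actual gender.py: the <code>GUESSES</code> table and <code>guess_gender</code> function are hypothetical stand-ins for a real name-based guesser such as genderator.</p>

```python
import json

# Hypothetical lookup table standing in for a real name-to-gender guesser.
GUESSES = {"gina": "female", "ed": "male"}


def guess_gender(full_name):
    """Guess from the first word of a display name; 'unknown' if unsure."""
    words = full_name.lower().split()
    if not words:
        return "unknown"
    return GUESSES.get(words[0], "unknown")


def filter_by_gender(lines, wanted):
    """Yield the JSON lines whose author is guessed to be the wanted gender."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        tweet = json.loads(line)
        if guess_gender(tweet["user"]["name"]) == wanted:
            yield line


# Made-up tweets, one JSON document per line, as in the collected file.
sample = [
    json.dumps({"text": "hello", "user": {"name": "Gina Example"}}),
    json.dumps({"text": "hi", "user": {"name": "Ed Example"}}),
    json.dumps({"text": "hey", "user": {"name": "xX_anon_Xx"}}),
]
for line in filter_by_gender(sample, "female"):
    print(line)
```

<p>Passing <code>sys.stdin</code> as <code>lines</code> would let a filter like this sit in the middle of a shell pipeline.</p>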
<p>I realize word clouds <a href="http://www.niemanlab.org/2011/10/word-clouds-considered-harmful/">probably aren&#8217;t the greatest</a> way to visualize the differences in these messages. If you have better ideas let me know. I made the <a href="http://inkdroid.org/data/adriarichards.json.gz" rel="nofollow">tweet JSON</a> available if you want to try your own visualization.</p>
<p><a href="http://inkdroid.org/data/adriarichards-male.html"><img src="http://inkdroid.org/images/adriarichards-male.png"/></a><br />
<a href="http://inkdroid.org/data/adriarichards-female.html"><img src="http://inkdroid.org/images/adriarichards-female.png"/></a></p>
<p>Looking at these didn&#8217;t yield much insight. So instead of visualizing all the words that each gender used, I wondered what the clouds would look like if I limited them to words that were uniquely spoken by each gender. In other words, what words did males use in their tweets which were not used by females, and vice-versa. There were 1,333 (11%) uniquely female words, and 4,767 (39%) uniquely male words, with a shared vocabulary of 5,988 (50%) words. </p>
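<p>Finding the words unique to each group is a set difference. Here is a small Python sketch of that calculation. The tokenizer and the example texts are invented for illustration and are not what produced the numbers above.</p>

```python
import re


def vocabulary(texts):
    """Collect the set of lowercased words appearing in a list of tweet texts."""
    words = set()
    for text in texts:
        words.update(re.findall(r"[a-z']+", text.lower()))
    return words


# Invented example texts, one list per guessed gender.
male = vocabulary(["that joke was fine", "no big deal"])
female = vocabulary(["that joke was not fine", "big deal"])

only_male = male - female            # words used only in the first group
only_female = female - male          # words used only in the second group
shared = male.intersection(female)   # words used by both
total = len(male.union(female))

for label, group in [("male", only_male), ("female", only_female), ("shared", shared)]:
    print(label, len(group), "{:.0%}".format(len(group) / total))
```

<p>The percentages are taken against the combined vocabulary, which is how the three figures in the paragraph above add up to 100%.</p>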
<p><a href="http://inkdroid.org/data/adriarichards-male-unique.html"><img src="http://inkdroid.org/images/adriarichards-male-unique.png"/></a><br />
<a href="http://inkdroid.org/data/adriarichards-female.html"><img src="http://inkdroid.org/images/adriarichards-female-unique.png"/></a></p>
<p>I&#8217;m not sure there is much more insight here either. I guess there is some weak comfort in the knowledge that 1/2 of the words used in these tweets were shared by both sexes.</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/03/22/python-heal-thyself/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>emoji dick and mo tweets</title>
		<link>http://inkdroid.org/journal/2013/02/25/emoji-dick-and-mo-tweets/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=emoji-dick-and-mo-tweets</link>
		<comments>http://inkdroid.org/journal/2013/02/25/emoji-dick-and-mo-tweets/#comments</comments>
		<pubDate>Mon, 25 Feb 2013 19:39:44 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[web]]></category>
		<category><![CDATA[emoji]]></category>
		<category><![CDATA[herman melville]]></category>
		<category><![CDATA[humor]]></category>
		<category><![CDATA[library of congress]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5536</guid>
		<description><![CDATA[The news about Emoji Dick (the version of Moby Dick translated into Emoji) being acquired by the Library of Congress prompted me to capriciously go to Twitter Search to see who was talking about it. As I drilled backwards I was surprised to see the search results went back to Fred Benenson&#8217;s original Tweet about [...]]]></description>
				<content:encoded><![CDATA[<p>The <a href="http://blogs.loc.gov/loc/2013/02/a-whale-of-an-acquisition/">news</a> about <a href="http://lccn.loc.gov/2012454709">Emoji Dick</a> (the version of Moby Dick translated into Emoji) being acquired by the Library of Congress prompted me to capriciously go to Twitter Search to see <a href="https://twitter.com/search?q=emoji%20dick">who was talking about it</a>. As I drilled backwards I was surprised to see the search results went back to Fred Benenson&#8217;s original Tweet about the project.</p>
<blockquote class="twitter-tweet" width="550"><p>I am paying 50 cents a sentence to convertfrom Herman Melville&#8217;s Moby Dick into Emoji on Amazon&#8217;s Mechanical Turk: <a href="http://ping.fm/1cVXy">http://ping.fm/1cVXy</a></p>
<p>&mdash; Fred Benenson (@fredbenenson) <a href="https://twitter.com/fredbenenson/status/1195751643">February 10, 2009</a></p></blockquote>
<p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p><em>That Tweet is from 4 years ago!</em></p>
<p>Up until <a href="http://blog.twitter.com/2013/02/now-showing-older-tweets-in-search.html">recently</a> you could only search back a couple of weeks, tops. The only sad thing is that the <a href="https://dev.twitter.com/docs/api/1.1/get/search/tweets">Twitter Search API</a> still seems to have the two week window. I used my little <a href="http://github.com/edsu/twarc">twarc</a> utility to drill back in the search results via the API and the earliest it was able to find for the same query was from 2013-02-18.</p>
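<p>Drilling back through search results means repeatedly asking for tweets older than the oldest one seen so far, which the v1.1 search API supports with its <code>max_id</code> parameter. The Python sketch below shows only that loop. It is not twarc's code: <code>fetch_page</code> is a hypothetical callable, and <code>fake_fetch</code> is an in-memory stand-in so the example runs without network access or credentials.</p>

```python
import bisect


def page_back(fetch_page, query):
    """Yield tweets newest-first by repeatedly asking for older pages.

    fetch_page(query, max_id) is a hypothetical callable returning a list of
    tweet dicts with ids no larger than max_id (or the newest ones if None).
    """
    max_id = None
    while True:
        tweets = fetch_page(query, max_id)
        if not tweets:
            return
        for tweet in tweets:
            yield tweet
        # Ask next time for tweets strictly older than the oldest one seen.
        max_id = min(tweet["id"] for tweet in tweets) - 1


# Fake tweet ids standing in for what a real search would return.
FAKE_IDS = list(range(1, 11))


def fake_fetch(query, max_id, per_page=3):
    """Return up to per_page fake tweets with ids at or below max_id."""
    ids = sorted(FAKE_IDS)
    end = len(ids) if max_id is None else bisect.bisect_right(ids, max_id)
    start = max(0, end - per_page)
    return [{"id": i} for i in reversed(ids[start:end])]


print([tweet["id"] for tweet in page_back(fake_fetch, "emoji dick")])
```

<p>A real client would also need authentication and would have to respect rate limits. The loop simply stops when a page comes back empty, which is where the search window ends.</p>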
<p>Hopefully the search window for the API will be opened up at some point, since it is at least theoretically possible now. If you happen to know any of the details about how the search functionality works I would be most grateful to hear from you.</p>
<p>Oh, and of course, I had to request Emoji Dick from the stacks:</p>
<pre>
PLEASE DO NOT REPLY TO THIS MESSAGE.
 
STATUS: Your request has been received.
REQUEST ID: 243106235
SEND TO: Adams Charge Station (LA 5244) - Staff
REQUEST RECEIVED: Mon Feb 25 12:56:19 EST 2013
TITLE: Emoji Dick ; or The Whale / by Herman Melville ; Edited and Compiled by Fred Benenson ; Translation by Amazon Mechanical Turk. 
AUTHOR: Melville, Herman, 1819-1891. 
CALL#: PS2384 .M6 2012
</pre>
<p>The one-time-cataloger in me thinks that there was a missed opportunity to add a <a href="http://en.wikipedia.org/wiki/Uniform_title">uniform title</a> to the <a href="http://lccn.loc.gov/2012454709">LC catalog record</a>&#8230;. But the title statement of responsibility mentioning that it is a translation made by Amazon Turk more than makes up for that!</p>
<p><em>Thanks <a href="http://twitter.com/lbjay">Jay</a> for letting me know what is going on at my own place of work.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/02/25/emoji-dick-and-mo-tweets/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>brief note on Ernst</title>
		<link>http://inkdroid.org/journal/2013/02/01/brief-note-on-ernst/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=brief-note-on-ernst</link>
		<comments>http://inkdroid.org/journal/2013/02/01/brief-note-on-ernst/#comments</comments>
		<pubDate>Fri, 01 Feb 2013 13:33:38 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[archives]]></category>
		<category><![CDATA[books]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[Wolfgang Ernst]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5521</guid>
		<description><![CDATA[Although the traditional archive used to be a rather static memory, the notion of the archive in Internet communication tends to move the archive toward an economy of circulation: permanent transformations and updating. The so-called cyberspace is not primarily about memory as cultural record but rather about a performative form of memory as communication. Within [...]]]></description>
				<content:encoded><![CDATA[<blockquote><p>
Although the traditional archive used to be a rather static memory, the notion of the archive in Internet communication tends to move the archive toward an economy of circulation: permanent transformations and updating. The so-called cyberspace is not primarily about memory as cultural record but rather about a performative form of memory as communication. Within this economy of permanent recycling of information, there is less need for emphatic but short-term, updatable memory, which comes close to the operative storage management in the von Neumann architecture of computing. <strong>Repositories are no longer final destinations but turn into frequently accessed sites.</strong> Archives become cybernetic systems. The aesthetics of fixed order is being replaced by permanent reconfigurability.</p>
<p><a href="http://de.wikipedia.org/wiki/Wolfgang_Ernst_(Medienwissenschaftler)">Wolfgang Ernst</a>. &#8220;Archives in Transition.&#8221; <a href="http://www.upress.umn.edu/book-division/books/digital-memory-and-the-archive">Digital Memory and the Archive</a>.
</p></blockquote>
<p>I was reading this and remembering Kevin Kelly&#8217;s idea of <a href="http://www.kk.org/thetechnium/archives/2008/12/movage.php">movage</a>, and the idea of <a href="http://www.ijdc.net/index.php/ijdc/article/view/102">relay supporting archives</a> from Janée et al. I really like the way Ernst works this idea into the way the Internet works, and the ways that the Web transforms the archival function. I&#8217;m only halfway through the book, and will likely have more to say when I finish, so I&#8217;m just taking some notes for myself. Carry on&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/02/01/brief-note-on-ernst/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>genealogy of a braeburn</title>
		<link>http://inkdroid.org/journal/2013/01/30/genealogy-of-a-braeburn/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=genealogy-of-a-braeburn</link>
		<comments>http://inkdroid.org/journal/2013/01/30/genealogy-of-a-braeburn/#comments</comments>
		<pubDate>Wed, 30 Jan 2013 15:59:26 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[data]]></category>
		<category><![CDATA[wikipedia]]></category>
		<category><![CDATA[apples]]></category>
		<category><![CDATA[freebase]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[knowledge graph]]></category>
		<category><![CDATA[my little pony]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5507</guid>
		<description><![CDATA[It has been observed that when systems break down we get to actually see how they operate. I wonder what this breakage below says about the use of Freebase and Wikipedia data in Google&#8217;s Knowledge Graph. Yes, that&#8217;s an image of Braeburn from My Little Pony to the right, and text about the apple to [...]]]></description>
				<content:encoded><![CDATA[<p>It has been observed that when systems break down we get to actually see how they operate. I wonder what this breakage below says about the <a href="http://www.guardian.co.uk/technology/2013/jan/19/google-search-knowledge-graph-singhal-interview">use</a> of Freebase and Wikipedia data in Google&#8217;s <a href="http://www.google.com/insidesearch/features/search/knowledge.html">Knowledge Graph</a>.</p>
<p><a href="https://www.google.com/#q=braeburn&#038;fp=82bae2c3ce10781c"><img src="http://inkdroid.org/images/braeburn-kg.png"/></a></p>
<p>Yes, that&#8217;s an image of <a href="http://mlp.wikia.com/wiki/Braeburn">Braeburn</a> from My Little Pony to the right, and text about the apple to the left. Interestingly it&#8217;s fine at Wikipedia:</p>
<p><a href="http://en.wikipedia.org/wiki/Braeburn"><img src="http://inkdroid.org/images/braeburn-wp.png"/></a></p>
<p>And it&#8217;s not even there in Freebase (according to a search).</p>
<p><a href="http://www.freebase.com/search?limit=30&#038;start=0&#038;query=braeburn"><img src="http://inkdroid.org/images/braeburn-fb.png"/></a></p>
<p>I don&#8217;t know if this reveals what&#8217;s going on in the flow of entities between Wikipedia, Freebase and Google. But I thought it was interesting. I wonder where to report such an anomaly. Is there a place?</p>
<p>Thanks to <a href="https://plus.google.com/107581973435023382062/posts">Jeff Godin</a> in <a href="irc://irc.freenode.net/code4lib">#code4lib</a> for noticing the breakage in Knowledge Graph.</p>
<p>See also Hilary Mason&#8217;s <a href="http://www.hilarymason.com/blog/im-a-dead-celebrity/">post</a> about how her identity got mixed up on Bing. (Thanks <a href="http://improbable.org/chris">Chris</a>).</p>
<p>Update: 2013-02-04</p>
<p>I thought to check a week later, and the Knowledge Graph results got even funnier. Now it&#8217;s a collage of apples and My Little Pony:</p>
<p><img src="http://inkdroid.org/images/braeburn-kg2.png"/></p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/01/30/genealogy-of-a-braeburn/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>aaronsw</title>
		<link>http://inkdroid.org/journal/2013/01/19/aaronsw/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=aaronsw</link>
		<comments>http://inkdroid.org/journal/2013/01/19/aaronsw/#comments</comments>
		<pubDate>Sat, 19 Jan 2013 22:07:30 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[libraries]]></category>
		<category><![CDATA[aaronsw]]></category>
		<category><![CDATA[internet archive]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5460</guid>
		<description><![CDATA[Aaron Swartz left us all a week ago. It&#8217;s strange, I only met Aaron once at the Internet Archive, and had a handful of conversations with him via email/irc &#8230; but not a day has passed since last Saturday that I haven&#8217;t thought about him, and his principled life. I&#8217;ve been asked a few times [...]]]></description>
				<content:encoded><![CDATA[<p>Aaron Swartz left us all a week ago. It&#8217;s strange, I only met Aaron once at the Internet Archive, and had a handful of conversations with him via email/irc &#8230; but not a day has passed since last Saturday that I haven&#8217;t thought about him, and his principled life. </p>
<p>I&#8217;ve been asked a few times why Aaron has been on my mind so much, and I&#8217;ve struggled to put it into words. Meanwhile, so many thoughtful things have been written about him. The arc of his life, his ideals, and abilities, charisma, and chutzpah, seem larger than life. And yet, he was just a person, a son, a friend, with people who loved him. It&#8217;s just heartbreaking.</p>
<hr />
<p>I work as a software developer in libraryland, trying to bridge the world of information we&#8217;ve had with the world we are building on the Web. So for me, Aaron was a role model, a teacher whose lessons weren&#8217;t in textbooks or scholarly journals, but in his blog, in his code, in his talks, in his experiments with real world results. He was only 26 when he died, but he was, and remains, as Tim Berners-Lee paradoxically <a href="http://lists.w3.org/Archives/Public/www-tag/2013Jan/0017.html">called him</a>, a &#8220;wise elder&#8221;.</p>
<p>I wanted to write something here, but more than that I wanted to do something.</p>
<hr />
<p>I noticed that <a href="http://archive.org/">Internet Archive</a> created a <a href="http://archive.org/details/aaronsw">collection</a> devoted to online material related to Aaron, and thought I would try to collect together all the Twitter conversations that mention him. Twitter&#8217;s search is limited to the last week, so I quickly wrote a <a href="http://github.com/edsu/twarc">command line utility</a> that pages through search results using their <a href="https://dev.twitter.com/docs/api/1.1/get/search/tweets">API</a>, and writes out the complete data as line-oriented JSON. I also pulled in the tweets that mention #pdftribute since they were largely inspired by Aaron&#8217;s efforts in the open access space. I packaged up the data using <a href="http://en.wikipedia.org/wiki/BagIt">BagIt</a> and <a href="http://archive.org/details/AaronswRelatedTweets">put it up</a> at Internet Archive. Here&#8217;s the description from the bag-info.txt:</p>
<blockquote><p>
  On January 11, 2013 the Internet activist Aaron Swartz took his own life, and a great deal of grief, anger, and constructive thinking erupted on the Web and in Twitter. In particular the #pdftribute Twitter tag was born, in an attempt to raise awareness about Open Access issues that Aaron did so much to further during his life.</p>
<p>This package contains Twitter JSON data for two Twitter search queries that were collected in the week following Aaron&#8217;s death:</p>
<ul>
<li>&#8220;Aaron Swartz&#8221; OR aaronsw</li>
<li>#pdftribute</li>
</ul>
<p>aaronsw.json.gz contains 630,397 tweets, for the period starting with 2013-01-11 16:50:22 and ending 2013-01-18 13:50:02.</p>
<p>pdftribute.json.gz contains 42,277 tweets, for the period starting with Jan 13 02:42:26 and ending Jan 17 03:33:46.</p>
<p>In addition the URLs mentioned in the tweets found in aaronsw.json.gz were extracted, unshortened, and then aggregated to provide a report of what people linked to. These URLs are available in aaronsw-urls.txt.gz.</p>
<p>  It is hoped that this data will help document the Web community&#8217;s response to Aaron&#8217;s death, and life.
</p></blockquote>
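<p>The link report is essentially a tally of the URLs found in each tweet&#8217;s <code>entities</code>. A rough Python sketch of the counting step follows. It is not the script that produced the report: the example tweets are made up, counting a URL at most once per tweet is an assumption, and the unshortening step (which needs HTTP requests) is left out.</p>

```python
from collections import Counter


def count_urls(tweets):
    """Count expanded URLs across tweets, at most once per tweet per URL."""
    counts = Counter()
    for tweet in tweets:
        urls = tweet.get("entities", {}).get("urls", [])
        # Prefer expanded_url; fall back to the t.co url when it is absent.
        seen = {u.get("expanded_url") or u.get("url") for u in urls}
        seen.discard(None)
        counts.update(seen)
    return counts


# Made-up tweets carrying only the fields this sketch reads.
tweets = [
    {"entities": {"urls": [{"url": "http://t.co/a", "expanded_url": "http://example.org/1"}]}},
    {"entities": {"urls": [{"url": "http://t.co/b", "expanded_url": "http://example.org/1"}]}},
    {"entities": {"urls": [{"url": "http://t.co/c", "expanded_url": "http://example.org/2"}]}},
    {"entities": {"urls": []}},
]
for url, shares in count_urls(tweets).most_common(50):
    print(shares, url)
```

<p>Using <code>most_common(50)</code> gives a ranking by share count, which is the shape of the table that follows.</p>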
<p>Below is a list of the top 50 links shared in tweets about Aaron. There were 36,506 in all. </p>
<style>
.aligned-table td + td {text-align: right;}
</style>
<table class="aligned-table">
<tr>
<th>Page</th>
<th>Shares</th>
</tr>
<tr>
<td><a href="http://boingboing.net/2013/01/12/rip-aaron-swartz.html">RIP, Aaron Swartz &#8211; Boing Boing</a></td>
<td>11763</td>
</tr>
<tr>
<td><a href="http://unhandled.com/2013/01/12/the-truth-about-aaron-swartzs-crime/">The Truth about Aaron Swartz’s “Crime” « Unhandled Exception</a></td>
<td>6641</td>
</tr>
<tr>
<td><a href="http://tech.mit.edu/V132/N61/swartz.html"> Aaron Swartz commits suicide &#8211; The Tech</a></td>
<td>5539</td>
</tr>
<tr>
<td><a href="https://petitions.whitehouse.gov/petition/remove-united-states-district-attorney-carmen-ortiz-office-overreach-case-aaron-swartz/RQNrG1Ck">Remove United States District Attorney Carmen Ortiz from office for overreach in the case of Aaron Swartz.</a></td>
<td>6478</td>
</tr>
<tr>
<td><a href="http://lessig.tumblr.com/post/40347463044/prosecutor-as-bully">Prosecutor as bully &#8211; Lessig Blog</a></td>
<td>3738</td>
</tr>
<tr>
<td><a href="http://www.guardian.co.uk/commentisfree/2013/jan/12/aaron-swartz-heroism-suicide1"> The inspiring heroism of Aaron Swartz | Glenn Greenwald | Comment is free | guardian.co.uk </a></td>
<td>2522</td>
</tr>
<tr>
<td><a href="http://thinkprogress.org/justice/2013/01/14/1441211/killers-slavers-and-bank-robbers-all-face-less-severe-prison-terms-than-aaron-swartz-did/?mobile=nc">Aaron Swartz Faced A More Severe Prison Term Than Killers, Slave Dealers And Bank Robbers | ThinkProgress</a></td>
<td>2367</td>
</tr>
<tr>
<td><a href="https://www.eff.org/deeplinks/2013/01/farewell-aaron-swartz">Farewell to Aaron Swartz, an Extraordinary Hacker and Activist &#8211; EFF</a></td>
<td>2042</td>
</tr>
<tr>
<td><a href="http://www.nytimes.com/2013/01/13/technology/aaron-swartz-internet-activist-dies-at-26.html">Internet Activist, a Creator of RSS, Is Dead at 26, Apparently a Suicide &#8211; New York Times</a></td>
<td>1927</td>
</tr>
<tr>
<td><a href="http://alt1040.com/2013/01/aaron-swartz"> Aaron Swartz muere por suicidio a sus 26 años</a></td>
<td>1572</td>
</tr>
<tr>
<td><a href="http://mashable.com/2013/01/13/aaron-swartz/">Technology&#8217;s Greatest Minds Say Goodbye to Aaron Swartz</a></td>
<td>1558</td>
</tr>
<tr>
<td><a href="http://alt1040.com/2013/01/aaron-swartz-contribuciones-a-la-red"> Aaron Swartz a través de 5 grandes contribuciones a la red</a></td>
<td>1495</td>
</tr>
<tr>
<td><a href="http://www.washingtonpost.com/blogs/wonkblog/wp/2013/01/12/aaron-swartz-american-hero/">Aaron Swartz, American hero</a></td>
<td>1397</td>
</tr>
<tr>
<td><a href="http://mashable.com/2013/01/12/aaron-swartz-suicide/">Internet Activist Aaron Swartz Commits Suicide</a></td>
<td>1330</td>
</tr>
<tr>
<td><a href="http://news.cnet.com/8301-1023_3-57563752-93/anonymous-hacks-mit-after-aaron-swartzs-suicide/">Anonymous hacks MIT after Aaron Swartz&#8217;s suicide | Internet &#038; Media &#8211; CNET News</a></td>
<td>1327</td>
</tr>
<tr>
<td><a href="http://www.zephoria.org/thoughts/archives/2013/01/13/aaron-swartz.html">danah boyd | apophenia  » processing the loss of Aaron Swartz</a></td>
<td>1280</td>
</tr>
<tr>
<td><a href="http://rememberaaronsw.tumblr.com/post/40372208044/official-statement-from-the-family-and-partner-of-aaron">Official Statement from the family and partner of Aaron Swartz &#8211; Remember Aaron Swartz</a></td>
<td>1199</td>
</tr>
<tr>
<td><a href="http://wilwheaton.net/2012/09/depression-lies/">depression lies | WIL WHEATON dot NET: 2.0</a></td>
<td>1164</td>
</tr>
<tr>
<td><a href="http://www.bbc.co.uk/news/world-us-canada-21001452">BBC News &#8211; Aaron Swartz, internet freedom activist, dies aged 26</a></td>
<td>1143</td>
</tr>
<tr>
<td><a href="https://www.eff.org/deeplinks/2013/01/aaron-swartz-fix-draconian-computer-crime-law">In the Wake of Aaron Swartz&#8217;s Death, Let&#8217;s Fix Draconian Computer Crime Law &#8211; EFF</a></td>
<td>1088</td>
</tr>
<tr>
<td><a href="http://www.huffingtonpost.com/2013/01/15/westboro-baptist-church-aaron-swartz-anonymous_n_2479019.html">Westboro Baptist Church Drops Aaron Swartz Funeral Protest After Anonymous Vows Action (VIDEO)</a></td>
<td>1079</td>
</tr>
<tr>
<td><a href="http://soupsoup.tumblr.com/post/40373383323/official-statement-from-the-family-and-partner-of">Soup • Official Statement from the Family and Partner of&#8230;</a></td>
<td>1067</td>
</tr>
<tr>
<td><a href="http://rt.com/usa/news/aaron-swartz-funeral-chicago-059/">&#8216;Aaron was killed by the government&#8217; &#8211; Robert Swartz on his son&#8217;s death  — RT</a></td>
<td>1066</td>
</tr>
<tr>
<td><a href="http://pdftribute.net/">#PDFTribute list of documents</a></td>
<td>1044</td>
</tr>
<tr>
<td><a href="http://www.cnn.com/2013/01/12/us/new-york-reddit-founder-suicide/index.html">Internet prodigy, activist Aaron Swartz commits suicide &#8211; CNN.com</a></td>
<td>1009</td>
</tr>
<tr>
<td><a href="http://www.thenation.com/blog/172187/aaron-swartz">Remembering Aaron Swartz | The Nation</a></td>
<td>1003</td>
</tr>
<tr>
<td><a href="http://www.aaronsw.com/2002/continuity">If I get hit by a truck&#8230;</a></td>
<td>991</td>
</tr>
<tr>
<td><a href="http://www.lemonde.fr/technologies/article/2013/01/12/suicide-d-aaron-swartz-activiste-a-l-origine-du-format-rss-et-de-reddit_1816246_651865.html">Suicide d&#8217;Aaron Swartz, activiste à l&#8217;origine du format RSS et de Creative Commons</a></td>
<td>938</td>
</tr>
<tr>
<td><a href="http://www.zdnet.com/hacker-activist-aaron-swartz-commits-suicide-7000009725/">Hacker, Activist Aaron Swartz Commits Suicide | ZDNet</a></td>
<td>896</td>
</tr>
<tr>
<td><a href="http://www.forbiddenknowledgetv.com/videos/activism/how-we-stopped-sopa-by-aaron-swartz1986-2013.html">Activism &#8220;How We Stopped SOPA&#8221; by Aaron Swartz (1986-2013)</a></td>
<td>896</td>
</tr>
<tr>
<td><a href="http://tecnologia.elpais.com/tecnologia/2013/01/13/actualidad/1358037094_942870.html">Muere a los 26 años el ciberactivista Aaron Swartz | Tecnología | EL PAÍS</a></td>
<td>887</td>
</tr>
<tr>
<td><a href="http://www.alternet.org/10-awful-crimes-get-you-less-prison-time-what-aaron-swartz-faced">10 Awful Crimes That Get You Less Prison Time Than What Aaron Swartz Faced | Alternet</a></td>
<td>868</td>
</tr>
<tr>
<td><a href="http://www.wired.com/threatlevel/2013/01/aaron-swartz/">Aaron Swartz, Coder and Activist, Dead at 26 | Threat Level | Wired.com</a></td>
<td>856</td>
</tr>
<tr>
<td><a href="http://www.newyorker.com/online/blogs/newsdesk/2013/01/everyone-interesting-is-a-felon.html?mbid=social_retweet">How the Legal System Failed Aaron Swartz&#8211;and Us : The New Yorker</a></td>
<td>849</td>
</tr>
<tr>
<td><a href="https://aaronsw.jottit.com/howtoget">https://aaronsw.jottit.com/howtoget</a></td>
<td>811</td>
</tr>
<tr>
<td><a href="http://www.theatlanticwire.com/national/2013/01/anonymous-westboro-baptist-church-aaron-swartz-funeral/61036/">How Anonymous Got Westboro to Back Off Aaron Swartz&#8217;s Funeral &#8211; National &#8211; The Atlantic Wire</a></td>
<td>804</td>
</tr>
<tr>
<td><a href="http://alt1040.com/2013/01/aaron-swartz-open-data-investigacion"> Muerte de Aaron Swartz: la necesidad del Open Data en el I+D</a></td>
<td>779</td>
</tr>
<tr>
<td><a href="http://rt.com/usa/news/swartz-suicide-court-drops-charges-997/">US court drops charges on Aaron Swartz days after his suicide — RT</a></td>
<td>772</td>
</tr>
<tr>
<td><a href="http://neuroconscience.com/2013/01/13/researchers-begin-posting-article-pdfs-to-twitter-in-pdftribute-to-aaron-swartz/">Researchers begin posting article PDFs to twitter in #pdftribute to Aaron Swartz « Neuroconscience</a></td>
<td>745</td>
</tr>
<tr>
<td><a href="http://www.quinnnorton.com/said/?p=644">  My Aaron Swartz, whom I loved.   | Quinn Said</a></td>
<td>742</td>
</tr>
<tr>
<td><a href="http://www.guardian.co.uk/commentisfree/2013/jan/12/aaron-swartz-heroism-suicide1?CMP=twt_gu"> The inspiring heroism of Aaron Swartz | Glenn Greenwald | Comment is free | guardian.co.uk </a></td>
<td>713</td>
</tr>
<tr>
<td><a href="http://arstechnica.com/tech-policy/2013/01/government-formally-drops-charges-against-aaron-swartz/">Government formally drops charges against Aaron Swartz | Ars Technica</a></td>
<td>708</td>
</tr>
<tr>
<td><a href="http://www.nakedcapitalism.com/2013/01/aaron-swartzs-politics.html">Aaron Swartz’s Politics «  naked capitalism</a></td>
<td>704</td>
</tr>
<tr>
<td><a href="http://www.cnn.com/">CNN.com &#8211; Breaking News, U.S., World, Weather, Entertainment &#038; Video News</a></td>
<td>690</td>
</tr>
<tr>
<td><a href="http://mashable.com/2013/01/15/aaron-swartz-tech-world-depression/">After Aaron Swartz: The Tech World Must Talk About Depression</a></td>
<td>670</td>
</tr>
<tr>
<td><a href="http://aaronsw.archiveteam.org/">JSTOR liberator</a></td>
<td>663</td>
</tr>
<tr>
<td><a href="http://mashable.com/2013/01/12/aaron-swartz-suicide/">Internet Activist Aaron Swartz Commits Suicide</a></td>
<td>661</td>
</tr>
<tr>
<td><a href="http://alt1040.com/2013/01/anonymous-mit-doj-tributo-aaron-swartz"> Anonymous tumba las webs del MIT y DOJ como tributo a Aaron Swartz</a></td>
<td>652</td>
</tr>
<tr>
<td><a href="http://mashable.com/2013/01/14/anonymous-hacks-mit/">Anonymous Hacks MIT, Leaves Farewell Message for Aaron Swartz</a></td>
<td>647</td>
</tr>
</table>
<p>There were 209,839 Twitter users who mentioned Aaron in the last week. I was one of them. I wish I could&#8217;ve done more to help.</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/01/19/aaronsw/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Fielding notes</title>
		<link>http://inkdroid.org/journal/2013/01/05/fielding-notes/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=fielding-notes</link>
		<comments>http://inkdroid.org/journal/2013/01/05/fielding-notes/#comments</comments>
		<pubDate>Sun, 06 Jan 2013 04:16:51 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[web]]></category>
		<category><![CDATA[architecture]]></category>
		<category><![CDATA[http]]></category>
		<category><![CDATA[interviews]]></category>
		<category><![CDATA[jon udell]]></category>
		<category><![CDATA[rest]]></category>
		<category><![CDATA[roy fielding]]></category>
		<category><![CDATA[uri]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5441</guid>
		<description><![CDATA[a tongue-in-cheek change request from @timberners_lee / Paul Downey. I&#8217;ve been doing a bit of research into the design of the Web for a paper I&#8217;m trying to write. In my travels I ran across Jon Udell&#8217;s 2006 interview with Roy Fielding. The interview is particularly interesting because of Roy&#8217;s telling of how (as a graduate student) [...]]]></description>
				<content:encoded><![CDATA[<div style="float: left; font-size: 8pt; text-align: center; margin-right: 10px;"><a href="http://www.flickr.com/photos/psd/8271699529/"><img src="http://inkdroid.org/images/timbl_roy.jpg"/><br />a tongue-in-cheek change request from @timberners_lee<br />Paul Downey</a></div>
<p>I&#8217;ve been doing a bit of research into the design of the Web for a paper I&#8217;m trying to write. In my travels I ran across Jon Udell&#8217;s <a href="http://jonudell.net/udell/2006-08-25-a-conversation-with-roy-fielding-about-http-rest-webdav-jsr-170-and-waka.html">2006 interview</a> with Roy Fielding. The interview is particularly interesting because of Roy&#8217;s telling of how (as a graduate student) he found himself working on <a href="http://en.wikipedia.org/wiki/Library_for_WWW_in_Perl">libwww-perl</a> which helped him discover the architecture of the Web that was largely documented by Tim Berners-Lee&#8217;s <a href="http://en.wikipedia.org/wiki/Libwww">libwww</a> HTTP library for Objective-C. </p>
<p>For the purposes of note taking, and giving some web spiders some text to index, here are a few moments that stood out:</p>
<blockquote><p>
<strong>Udell</strong>: A little later on [in Roy's dissertation] you talk about how systems based on what you call control messages are in a very different category from systems where the decisions that get made are being made by human beings, and that that&#8217;s, in a sense, the ultimate rationale for designing data driven systems that are web-like, because people need to interact with them in lots of ways that you can&#8217;t declaratively define. </p>
<p><strong>Fielding</strong>: Yeah, it&#8217;s a little bit easier to say that people need to reuse them, in various unanticipated ways. A lot of people think that when they are building an application that they are building something that&#8217;s going to last forever, and almost always that&#8217;s false. Usually when they are building an application the only thing that lasts forever is the data, at least if you&#8217;re lucky. If you&#8217;re lucky the data retains some semblance of archivability, or reusability over time. </p>
<p>&#8230;</p>
<p><strong>Udell</strong>: There is a meme out there to the effect that what we now call REST architectural style was in a sense discovered post facto, as opposed to having been anticipated from the beginning. Do you agree with that or not?</p>
<p><strong>Fielding</strong>: No, it&#8217;s a little bit of everything, in the sense that there are core principles involved that Berners-Lee was aware of when he was working on it. I first talked to Tim about what I was calling the HTTP Object Model at the time, which is a terrible name for it, but we talked when I was at the W3C in the summer of 95, about the software engineering principles. Being a graduate student of software engineering, that was my focus, and my interest originally. Of course all the stuff I was doing for the Web that was just for fun. At the time that was not considered research. </p>
<p><strong>Udell</strong>: But did you at the time think of what you then called the HTTP object model as being in contrast to more API like and procedural approaches?</p>
<p><strong>Fielding</strong>: Oh definitely. The reason for that was that the first thing I did for the Web was statistical analysis software, which turned out to be very effective at helping people understand the value of communicating over the Web. The second thing was a program called MOMSpider. It was one of the first Web spiders, a mechanism for testing all the links that were on the Web.</p>
<p><strong>Udell</strong>: And that was when you also worked on libwww-perl?</p>
<p><strong>Fielding</strong>: Right, and &#8230; at the time it was only the second protocol library available for the Web. It was a combination of pieces from various sources, as well as a lot of my own work, in terms of filling out the details, and providing an overall view of what a Web client should do with an HTTP library. And as a result of that design process I realized some of the things Tim Berners-Lee had designed into the system. And I also found a whole bunch of cases where the design didn&#8217;t make any sense, or the way it had been particularly implemented over at NCSA, or one of the other clients, or various history of the Web had turned out to be not-fitting with the rest of the design. So that led to a lot of discussions with the other early protocol developers particularly people like Rob McCool, Tony Sanders and Ari Luotonen&#8211;people who were building their own systems and understood both what they were doing with the Web, and also what complaints they were getting from their users. And from that I distilled a model of basically what was the core of HTTP. Because if you look back in the 93/94 time frame, the HTTP specification did not look all that similar to what it does now. It had a whole range of methods that were never used, and a lot of talk about various aspects of object orientation which never really applied to HTTP. And all of that came out of Tim&#8217;s original implementation of libwww, which was an Objective-C implementation that was trying to be as portable as possible. It had a lot of the good principles of interface separation and genericity inside the library, and really the same principles that I ended up using in the Perl library, although they were completely independently developed. It was just one of those things where that kind of interaction has a way of leading to a more extensible design.</p>
<p><strong>Udell</strong>: So was focusing down on a smaller set of verbs partly driven by the experience of having people starting to use the Web, and starting to experience what URLs could be in a human context as well as in a programmatic context?</p>
<p><strong>Fielding</strong>: Well, that was really a combination of things. One that&#8217;s a fairly common paradigm: if you are trying to inter-operate with people you&#8217;ve never met, try to keep it as simple as possible. There&#8217;s also just inherent in the notion of using URIs to identify everything, which is of course really the basis of what the Web is, provides you with that frame of mind where you have a common resource, and you want to have a common resource interface.
</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/01/05/fielding-notes/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>spotify vs rdio 2012</title>
		<link>http://inkdroid.org/journal/2013/01/02/spotify-vs-rdio-2012/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=spotify-vs-rdio-2012</link>
		<comments>http://inkdroid.org/journal/2013/01/02/spotify-vs-rdio-2012/#comments</comments>
		<pubDate>Wed, 02 Jan 2013 15:33:31 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[music]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5427</guid>
		<description><![CDATA[Back in August of 2011 I wrote a little utility that pulled down Alf Eaton&#8217;s Album of the Year data. AOTY is nice for two reasons: a) I like Alf&#8217;s taste in music, so the lists are relevant to me; and b) AOTY is a nice example of layering structured metadata into HTML, for easy [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://inkdroid.org/journal/2011/08/06/spotify-rdio-and-albums-of-the-year/">Back</a> in August of 2011 I wrote a little <a href="https://github.com/edsu/aotycmp/">utility</a> that pulled down Alf Eaton&#8217;s <a href="http://apps.hubmed.org/aoty/">Album of the Year</a> data. AOTY is nice for two reasons: a) I like Alf&#8217;s taste in music, so the lists are relevant to me; and b) AOTY is a nice example of layering structured metadata into HTML, for easy processing (aka scraping). With the data in hand it was easy to check whether the albums were available on the streaming services <a href="http://www.spotify.com/">Spotify</a> and <a href="http://rdio.com">Rdio</a> using their respective APIs. I was trying to decide which one to use at the time, and wanted to know if there was any significant difference in their catalogs. </p>
<p>Back then, it looked like 32% of the albums were available on Spotify, and 46% on Rdio. Alf has updated his list for <a href="http://apps.hubmed.org/aoty/2012">2012</a>, so I decided to rerun aotycmp, and it appears that coverage of both has improved, with Spotify (41%) closing the gap a bit on Rdio (49%), which still has a comfortable lead. If you want the availability data I&#8217;ve updated it on <a href="https://raw.github.com/edsu/aotycmp/master/aoty_cmp.csv">GitHub</a>.</p>
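<p>As a rough illustration of the arithmetic behind those percentages, here is a minimal Python sketch. The column names (<code>spotify</code>, <code>rdio</code>), the yes/no values and the sample rows are all made up for the example. They are assumptions, not the actual layout of aoty_cmp.csv, and this is not how aotycmp itself is written:</p>

```python
import csv
import io

# Hypothetical sample data: the real aoty_cmp.csv may use different columns.
SAMPLE = """album,spotify,rdio
Album A,yes,yes
Album B,no,yes
Album C,no,no
Album D,yes,yes
"""


def coverage(rows, service):
    """Return the percentage of albums marked available on a service."""
    rows = list(rows)
    if not rows:
        return 0.0
    available = sum(1 for row in rows if row[service] == "yes")
    return 100.0 * available / len(rows)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for service in ("spotify", "rdio"):
    print(f"{service}: {coverage(rows, service):.0f}%")
```

<p>With the four made-up albums this reports coverage for each service as a share of the whole list, which is the same kind of figure as the 41% and 49% quoted above.</p>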
<p>I&#8217;ve been very happy with Rdio, although pieces like <a href="http://pitchfork.com/features/articles/8993-the-cloud/">Damon Krukowski&#8217;s</a> (thanks <a href="http://twitter.com/dchud">@dchud</a>) make me wish there was a better way to a) stream music while b) actually putting money in the artists&#8217; pockets. I&#8217;d love to have the ability to pay a little bit more if I knew it was going to help support the artist in creating more of their art.</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/01/02/spotify-vs-rdio-2012/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Darth Nader</title>
		<link>http://inkdroid.org/journal/2013/01/02/darth-nader/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=darth-nader</link>
		<comments>http://inkdroid.org/journal/2013/01/02/darth-nader/#comments</comments>
		<pubDate>Wed, 02 Jan 2013 11:06:41 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[dreams]]></category>
		<category><![CDATA[coffee]]></category>
		<category><![CDATA[darth vader]]></category>
		<category><![CDATA[librarybox]]></category>
		<category><![CDATA[occupy]]></category>
		<category><![CDATA[ralph nader]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5402</guid>
		<description><![CDATA[This may be a bad/shortlived idea, but as part of a New Year&#8217;s resolution to write more varied material I&#8217;m going to try to use my blog (partly) as a dream journal. This will probably drive the few readers I have away, but I&#8217;m hoping it might provide some amusement. I barely remember my dreams [...]]]></description>
				<content:encoded><![CDATA[<p><em>This may be a bad/shortlived idea, but as part of a New Year&#8217;s resolution to write more varied material I&#8217;m going to try to use my blog (partly) as a dream journal. This will probably drive the few readers I have away, but I&#8217;m hoping it might provide some amusement. I barely remember my dreams these days, and would like to remember more of them, so here goes. Feel free to file under TMI.</em></p>
<p>Walking into a cafe/restaurant in the morning, in what feels like New York, but I&#8217;m not sure&#8230;it could be any city. It&#8217;s a cosy, narrow setup, with all the seats taken by people quietly chatting. I manage to get a cup of coffee to go, and stand waiting for a table to open up. I discover a staircase and vaguely remember that there is seating upstairs. I go up the stairs carefully balancing my wide bowl-like cup of coffee.</p>
<p>The upstairs area is quite large and sprawling, dimly lit, with comfortable chairs, wider tables, and in the middle is a life-sized sculpture of a woman in motion, looking behind, while walking&#8211;who apparently is the owner of the establishment. A hostess shows me to a table nearby, and says she can&#8217;t remember the name of the server, but that someone will be with me shortly. I sit down with my coffee. </p>
<p>After just a few minutes I notice that it feels like evening. There are lots of conversations going on nearby, which I&#8217;m able to hear fairly easily. One man in his early 30s is standing at his table, and in a kind of spotlight. He is talking quietly, as if on stage, not obviously on a cell phone, about a meeting that he has just had, and how they will need to travel to Austin, Texas to help protect some geographic area. I can&#8217;t remember the exact details of what he was saying but it is clear he is working for an organization that is trying to save some ecosystem features in Austin. </p>
<p>There is a bookshelf nearby with a disembodied head on it, which looks like Ralph Nader, and also a bit like Darth Vader when Luke takes his helmet off at the end of Return of the Jedi. The head is animated, and seems to be simulating the other half of the conversation. He is saying that this is important work, and is similar to a recent project in Seattle. The conversation ends, and the man walks out of the coffee shop. </p>
<p>I notice three other people, with big, thick, Ginsbergian beards, also leave their tables at the same time, deep in conversation about something different. There is a counter-culture, occupy-like feeling in the air, of people steadily working to make their corner of the world a better place. It&#8217;s a good feeling.</p>
<h2>Afterword</h2>
<p>Half awake I found myself thinking about the talking head, and how it reminded me of <a href="http://librarybox.us/">LibraryBox</a>. It was as if the head made it possible to easily tune into public conversations that were going on in the local context of the coffee shop&#8230;and it served as an archive or store of these conversations for others to discover later. I don&#8217;t know if LibraryBox actually lets any of that happen, but it&#8217;s something I&#8217;ve been meaning to learn more about in the new year.</p>
<p><em>By the way, dream interpretations as comments are most welcome&#8230;</em></p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2013/01/02/darth-nader/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>archiving tweets</title>
		<link>http://inkdroid.org/journal/2012/12/31/archiving-tweets/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=archiving-tweets</link>
		<comments>http://inkdroid.org/journal/2012/12/31/archiving-tweets/#comments</comments>
		<pubDate>Mon, 31 Dec 2012 19:42:48 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[archives]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[css]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[javascript]]></category>
		<category><![CDATA[twitter]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5364</guid>
		<description><![CDATA[If you are an active Twitter user you may have heard that you can now download your complete archive of tweets. The functionality is still being rolled out across the millions of accounts, so don&#8217;t be surprised if you don&#8217;t see the function yet in your settings. The WSJ piece kind of joked about the [...]]]></description>
				<content:encoded><![CDATA[<p>If you are an active Twitter user you may have <a href="http://blogs.wsj.com/digits/2012/12/19/how-to-download-your-entire-twitter-history-and-laugh-at-how-bad-it-is/?mod=e2tw">heard</a> that you can now download your complete archive of tweets. The functionality is still being rolled out across the millions of accounts, so don&#8217;t be surprised if you don&#8217;t see the function yet in your settings.</p>
<p>The WSJ piece kind of joked about the importance of this move on Twitter&#8217;s part, which is a bit unfortunate, since it&#8217;s a pretty important issue. Yes, you can use third-party apps for downloading your Twitter data, but it says a lot when a company takes &#8220;archiving&#8221; seriously enough to offer it as a service to its users.</p>
<p>If you work in the digital preservation space it&#8217;s kind of fun to take a look at the way that Twitter makes these personal archives available. Luckily (if you don&#8217;t have the archive download button yet, like me) <a href="http://davewiner.com/">Dave Winer</a> has started <a href="http://threads2.scripting.com/2012/december/uploadYourTwitterArchive">collecting</a> some archives, and making them publicly available for browsing and download off of S3. For example, we can look at <a href="http://twitter.com/sarahebourne">Sarah Bourne&#8217;s</a> (who tipped me off to Dave&#8217;s work&#8211;thanks Sarah!). After you&#8217;ve downloaded the ZIP file you get a directory that looks like:</p>
<pre>
sarahebourne/
|-- css
|   `-- application.min.css
|-- data
|   |-- csv
|   |   |-- 2008_08.csv
|   |   |-- 2008_09.csv
|   |   |-- 2008_10.csv
|   |   |-- 2008_11.csv
|   |   |-- 2008_12.csv
|   |   |-- 2009_01.csv
|   |   |-- 2009_02.csv
|   |   |-- 2009_03.csv
|   |   |-- 2009_04.csv
|   |   |-- 2009_05.csv
|   |   |-- 2009_06.csv
|   |   |-- 2009_07.csv
|   |   |-- 2009_08.csv
|   |   |-- 2009_09.csv
|   |   |-- 2009_10.csv
|   |   |-- 2009_11.csv
|   |   |-- 2009_12.csv
|   |   |-- 2010_01.csv
|   |   |-- 2010_02.csv
|   |   |-- 2010_03.csv
|   |   |-- 2010_04.csv
|   |   |-- 2010_05.csv
|   |   |-- 2010_06.csv
|   |   |-- 2010_07.csv
|   |   |-- 2010_08.csv
|   |   |-- 2010_09.csv
|   |   |-- 2010_10.csv
|   |   |-- 2010_11.csv
|   |   |-- 2010_12.csv
|   |   |-- 2011_01.csv
|   |   |-- 2011_02.csv
|   |   |-- 2011_03.csv
|   |   |-- 2011_04.csv
|   |   |-- 2011_05.csv
|   |   |-- 2011_06.csv
|   |   |-- 2011_07.csv
|   |   |-- 2011_08.csv
|   |   |-- 2011_09.csv
|   |   |-- 2011_10.csv
|   |   |-- 2011_11.csv
|   |   |-- 2011_12.csv
|   |   |-- 2012_01.csv
|   |   |-- 2012_02.csv
|   |   |-- 2012_03.csv
|   |   |-- 2012_04.csv
|   |   |-- 2012_05.csv
|   |   |-- 2012_06.csv
|   |   |-- 2012_07.csv
|   |   |-- 2012_08.csv
|   |   |-- 2012_09.csv
|   |   |-- 2012_10.csv
|   |   |-- 2012_11.csv
|   |   `-- 2012_12.csv
|   `-- js
|       |-- payload_details.js
|       |-- tweet_index.js
|       |-- tweets
|       |   |-- 2008_08.js
|       |   |-- 2008_09.js
|       |   |-- 2008_10.js
|       |   |-- 2008_11.js
|       |   |-- 2008_12.js
|       |   |-- 2009_01.js
|       |   |-- 2009_02.js
|       |   |-- 2009_03.js
|       |   |-- 2009_04.js
|       |   |-- 2009_05.js
|       |   |-- 2009_06.js
|       |   |-- 2009_07.js
|       |   |-- 2009_08.js
|       |   |-- 2009_09.js
|       |   |-- 2009_10.js
|       |   |-- 2009_11.js
|       |   |-- 2009_12.js
|       |   |-- 2010_01.js
|       |   |-- 2010_02.js
|       |   |-- 2010_03.js
|       |   |-- 2010_04.js
|       |   |-- 2010_05.js
|       |   |-- 2010_06.js
|       |   |-- 2010_07.js
|       |   |-- 2010_08.js
|       |   |-- 2010_09.js
|       |   |-- 2010_10.js
|       |   |-- 2010_11.js
|       |   |-- 2010_12.js
|       |   |-- 2011_01.js
|       |   |-- 2011_02.js
|       |   |-- 2011_03.js
|       |   |-- 2011_04.js
|       |   |-- 2011_05.js
|       |   |-- 2011_06.js
|       |   |-- 2011_07.js
|       |   |-- 2011_08.js
|       |   |-- 2011_09.js
|       |   |-- 2011_10.js
|       |   |-- 2011_11.js
|       |   |-- 2011_12.js
|       |   |-- 2012_01.js
|       |   |-- 2012_02.js
|       |   |-- 2012_03.js
|       |   |-- 2012_04.js
|       |   |-- 2012_05.js
|       |   |-- 2012_06.js
|       |   |-- 2012_07.js
|       |   |-- 2012_08.js
|       |   |-- 2012_09.js
|       |   |-- 2012_10.js
|       |   |-- 2012_11.js
|       |   `-- 2012_12.js
|       `-- user_details.js
|-- img
|   |-- bg.png
|   `-- sprite.png
|-- index.html
|-- js
|   `-- application.min.js
|-- lib
|   |-- bootstrap
|   |   |-- bootstrap-dropdown.js
|   |   |-- bootstrap.min.css
|   |   |-- bootstrap-modal.js
|   |   |-- bootstrap-tooltip.js
|   |   |-- bootstrap-transition.js
|   |   |-- glyphicons-halflings.png
|   |   `-- glyphicons-halflings-white.png
|   |-- hogan
|   |   `-- hogan-2.0.0.min.js
|   |-- jquery
|   |   `-- jquery-1.8.3.min.js
|   |-- twt
|   |   |-- sprite.png
|   |   |-- sprite.rtl.png
|   |   |-- twt.all.min.js
|   |   `-- twt.min.css
|   `-- underscore
|       `-- underscore-min.js
`-- README.txt
</pre>
<p>So why is this interesting?</p>
<h2>The Data</h2>
<p>The archive includes data both as CSV and as JavaScript. The CSV is perfect for throwing into a spreadsheet, and doing stuff with it there. The JavaScript is actually a very light shim over some JSON data that is quite a bit richer than the CSV. The JavaScript shim is needed so that it can be used by the app that comes in the archive (more on that later). For example here&#8217;s a randomly picked tweet from Sarah:</p>
<blockquote class="twitter-tweet" width="550"><p>@<a href="https://twitter.com/monkchips">monkchips</a> Ouch. Some regrets are harsher than others.</p>
<p>&mdash; Sarah Bourne (@sarahebourne) <a href="https://twitter.com/sarahebourne/status/281405942321532929">December 19, 2012</a></p></blockquote>
<p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
<p>Here is how the Tweet shows up in the CSV:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="csv" style="font-family:monospace;">&nbsp;
&quot;tweet_id&quot;,&quot;in_reply_to_status_id&quot;,&quot;in_reply_to_user_id&quot;,&quot;retweeted_status_id&quot;,&quot;retweeted_status_user_id&quot;,&quot;timestamp&quot;,&quot;source&quot;,&quot;text&quot;,&quot;expanded_urls&quot;
&quot;281405942321532929&quot;,&quot;281400879465238529&quot;,&quot;61233&quot;,&quot;&quot;,&quot;&quot;,&quot;2012-12-19 14:29:39 +0000&quot;,&quot;&lt;a href=&quot;&quot;http://janetter.net/&quot;&quot; rel=&quot;&quot;nofollow&quot;&quot;&gt;Janetter&lt;/a&gt;&quot;,&quot;@monkchips Ouch. Some regrets are harsher than others.&quot;</pre></td></tr></table></div>
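<p>If you would rather poke at the CSV from a script than a spreadsheet, something like the following Python sketch should work. It uses only the standard library <code>csv</code> module. The <code>read_tweets</code> helper is a name invented for this example, and the data is the header and row shown above with the HTML escaping undone:</p>

```python
import csv
import io

# The header and row as shown above, with the HTML escaping undone.
ARCHIVE_CSV = (
    '"tweet_id","in_reply_to_status_id","in_reply_to_user_id",'
    '"retweeted_status_id","retweeted_status_user_id","timestamp",'
    '"source","text","expanded_urls"\n'
    '"281405942321532929","281400879465238529","61233","","",'
    '"2012-12-19 14:29:39 +0000",'
    '"<a href=""http://janetter.net/"" rel=""nofollow"">Janetter</a>",'
    '"@monkchips Ouch. Some regrets are harsher than others."\n'
)


def read_tweets(text):
    """Parse archive CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


tweets = read_tweets(ARCHIVE_CSV)
tweet = tweets[0]
print(tweet["tweet_id"], tweet["timestamp"])
print(tweet["text"])
# Doubled quotes ("") inside a quoted field come back as single quotes.
print(tweet["source"])
```

<p>Every value comes back as a string, which is probably what you want for tweet IDs anyway. Note that the row as shown has eight fields against nine column names, so <code>csv.DictReader</code> leaves <code>expanded_urls</code> as <code>None</code> for it.</p>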

<p>And here&#8217;s the archived JSON for the Tweet:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
  <span style="color: #3366CC;">&quot;source&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;&lt;a href=<span style="color: #000099; font-weight: bold;">\&quot;</span>http://janetter.net/<span style="color: #000099; font-weight: bold;">\&quot;</span> rel=<span style="color: #000099; font-weight: bold;">\&quot;</span>nofollow<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;Janetter&lt;/a&gt;&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;entities&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;user_mentions&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;name&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;James Governor&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;screen_name&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;monkchips&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;indices&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">,</span> <span style="color: #CC0000;">10</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;id_str&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;61233&quot;</span><span style="color: #339933;">,</span>
      <span style="color: #3366CC;">&quot;id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">61233</span>
    <span style="color: #009900;">&#125;</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;media&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;hashtags&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span> <span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;urls&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span> <span style="color: #009900;">&#93;</span>
  <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;in_reply_to_status_id_str&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;281400879465238529&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;geo&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;id_str&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;281405942321532929&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;in_reply_to_user_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">61233</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;text&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;@monkchips Ouch. Some regrets are harsher than others.&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">281405942321532929</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;in_reply_to_status_id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">281400879465238529</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;created_at&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Wed Dec 19 14:29:39 +0000 2012&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;in_reply_to_screen_name&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;monkchips&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;in_reply_to_user_id_str&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;61233&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;user&quot;</span> <span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;name&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Sarah Bourne&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;screen_name&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;sarahebourne&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;protected&quot;</span> <span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;id_str&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;16010789&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;profile_image_url_https&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;https://si0.twimg.com/profile_images/638441870/Snapshot-of-sb_normal.jpg&quot;</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #CC0000;">16010789</span><span style="color: #339933;">,</span>
    <span style="color: #3366CC;">&quot;verified&quot;</span> <span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>So there&#8217;s quite a bit more structured data in the archived JSON, including geo coordinates, hashtags, URLs mentioned, etc. Also, the avatar images are still referenced out on the Web, where they can change, disappear, etc. It&#8217;s also interesting to compare the archived JSON against what you get back from the Twitter API for the same Tweet:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
  <span style="color: #3366CC;">&quot;user&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;follow_request_sent&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_use_background_image&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">true</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;default_profile_image&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">16010789</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;verified&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_text_color&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;080C0C&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_image_url_https&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;https://si0.twimg.com/profile_images/638441870/Snapshot-of-sb_normal.jpg&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_sidebar_fill_color&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;FCFAEF&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;entities&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
      <span style="color: #3366CC;">&quot;url&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;urls&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #009900;">&#123;</span>
            <span style="color: #3366CC;">&quot;url&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://www.linkedin.com/in/sarahbourne&quot;</span><span style="color: #339933;">,</span> 
            <span style="color: #3366CC;">&quot;indices&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
              <span style="color: #CC0000;">0</span><span style="color: #339933;">,</span> 
              <span style="color: #CC0000;">38</span>
            <span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> 
            <span style="color: #3366CC;">&quot;expanded_url&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">null</span>
          <span style="color: #009900;">&#125;</span>
        <span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span> 
      <span style="color: #3366CC;">&quot;description&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;urls&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;followers_count&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">2367</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_sidebar_border_color&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;FFFFFF&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;id_str&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;16010789&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_background_color&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;DAE0D9&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;listed_count&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">331</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_background_image_url_https&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;https://si0.twimg.com/profile_background_images/671143407/8544adf04bc3823d306c7f05efef2351.jpeg&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;utc_offset&quot;</span><span style="color: #339933;">:</span> <span style="color: #339933;">-</span><span style="color: #CC0000;">18000</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;statuses_count&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">20090</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;description&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Internet technology strategist, Accessibility and assistive technologies. Views expressed/implied are my own. See my Twitter lists for more interests.&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;friends_count&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">784</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;location&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Boston, MA, USA&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_link_color&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;800326&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_image_url&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://a0.twimg.com/profile_images/638441870/Snapshot-of-sb_normal.jpg&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;following&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">true</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;geo_enabled&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_banner_url&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;https://si0.twimg.com/profile_banners/16010789/1348096060&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_background_image_url&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://a0.twimg.com/profile_background_images/671143407/8544adf04bc3823d306c7f05efef2351.jpeg&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;screen_name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;sarahebourne&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;lang&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;en&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;profile_background_tile&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">true</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;favourites_count&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">3147</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Sarah Bourne&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;notifications&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">null</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;url&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;http://www.linkedin.com/in/sarahbourne&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;created_at&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Wed Aug 27 12:24:25 +0000 2008&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;contributors_enabled&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;time_zone&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Eastern Time (US &amp; Canada)&quot;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;protected&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;default_profile&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;is_translator&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span>
  <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;favorited&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;entities&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #3366CC;">&quot;user_mentions&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
      <span style="color: #009900;">&#123;</span>
        <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">61233</span><span style="color: #339933;">,</span> 
        <span style="color: #3366CC;">&quot;indices&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span>
          <span style="color: #CC0000;">0</span><span style="color: #339933;">,</span> 
          <span style="color: #CC0000;">10</span>
        <span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> 
        <span style="color: #3366CC;">&quot;id_str&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;61233&quot;</span><span style="color: #339933;">,</span> 
        <span style="color: #3366CC;">&quot;screen_name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;monkchips&quot;</span><span style="color: #339933;">,</span> 
        <span style="color: #3366CC;">&quot;name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;James Governor&quot;</span>
      <span style="color: #009900;">&#125;</span>
    <span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;hashtags&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span><span style="color: #339933;">,</span> 
    <span style="color: #3366CC;">&quot;urls&quot;</span><span style="color: #339933;">:</span> <span style="color: #009900;">&#91;</span><span style="color: #009900;">&#93;</span>
  <span style="color: #009900;">&#125;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;contributors&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">null</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;truncated&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;text&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;@monkchips Ouch. Some regrets are harsher than others.&quot;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;created_at&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Wed Dec 19 14:29:39 +0000 2012&quot;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;retweeted&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">false</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;in_reply_to_status_id_str&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;281400879465238529&quot;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;coordinates&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">null</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;in_reply_to_user_id_str&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;61233&quot;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;source&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;&lt;a href=<span style="color: #000099; font-weight: bold;">\&quot;</span>http://janetter.net/<span style="color: #000099; font-weight: bold;">\&quot;</span> rel=<span style="color: #000099; font-weight: bold;">\&quot;</span>nofollow<span style="color: #000099; font-weight: bold;">\&quot;</span>&gt;Janetter&lt;/a&gt;&quot;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;in_reply_to_status_id&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">281400879465238529</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;in_reply_to_screen_name&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;monkchips&quot;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;id_str&quot;</span><span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;281405942321532929&quot;</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;place&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">null</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;retweet_count&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;geo&quot;</span><span style="color: #339933;">:</span> <span style="color: #003366; font-weight: bold;">null</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;id&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">281405942321532929</span><span style="color: #339933;">,</span> 
  <span style="color: #3366CC;">&quot;in_reply_to_user_id&quot;</span><span style="color: #339933;">:</span> <span style="color: #CC0000;">61233</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>Using <a href="https://npmjs.org/package/json-diff">json-diff</a> it&#8217;s not too difficult to see what the differences are between the archived version and the API version:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="diff" style="font-family:monospace;"> <span style="">&#123;</span>
<span style="color: #00b000;">+  favorited: false</span>
<span style="color: #00b000;">+  contributors: null</span>
<span style="color: #00b000;">+  truncated: false</span>
<span style="color: #00b000;">+  retweeted: false</span>
<span style="color: #00b000;">+  coordinates: null</span>
<span style="color: #00b000;">+  place: null</span>
<span style="color: #00b000;">+  retweet_count: 0</span>
   entities: <span style="">&#123;</span>
<span style="color: #991111;">-    media: <span style="">&#91;</span></span>
<span style="color: #991111;">-    <span style="">&#93;</span></span>
   <span style="">&#125;</span>
<span style="color: #991111;">-  geo: <span style="">&#123;</span></span>
<span style="color: #991111;">-  <span style="">&#125;</span></span>
<span style="color: #00b000;">+  geo: null</span>
   user: <span style="">&#123;</span>
<span style="color: #00b000;">+    follow_request_sent: false</span>
<span style="color: #00b000;">+    profile_use_background_image: true</span>
<span style="color: #00b000;">+    default_profile_image: false</span>
<span style="color: #00b000;">+    profile_text_color: &quot;080C0C&quot;</span>
<span style="color: #00b000;">+    profile_sidebar_fill_color: &quot;FCFAEF&quot;</span>
<span style="color: #00b000;">+    entities: <span style="">&#123;</span></span>
<span style="color: #00b000;">+      url: <span style="">&#123;</span></span>
<span style="color: #00b000;">+        urls: <span style="">&#91;</span></span>
<span style="color: #00b000;">+          <span style="">&#123;</span></span>
<span style="color: #00b000;">+            url: &quot;http://www.linkedin.com/in/sarahbourne&quot;</span>
<span style="color: #00b000;">+            indices: <span style="">&#91;</span></span>
<span style="color: #00b000;">+              0</span>
<span style="color: #00b000;">+              38</span>
<span style="color: #00b000;">+            <span style="">&#93;</span></span>
<span style="color: #00b000;">+            expanded_url: null</span>
<span style="color: #00b000;">+          <span style="">&#125;</span></span>
<span style="color: #00b000;">+        <span style="">&#93;</span></span>
<span style="color: #00b000;">+      <span style="">&#125;</span></span>
<span style="color: #00b000;">+      description: <span style="">&#123;</span></span>
<span style="color: #00b000;">+        urls: <span style="">&#91;</span></span>
<span style="color: #00b000;">+        <span style="">&#93;</span></span>
<span style="color: #00b000;">+      <span style="">&#125;</span></span>
<span style="color: #00b000;">+    <span style="">&#125;</span></span>
<span style="color: #00b000;">+    followers_count: 2367</span>
<span style="color: #00b000;">+    profile_sidebar_border_color: &quot;FFFFFF&quot;</span>
<span style="color: #00b000;">+    profile_background_color: &quot;DAE0D9&quot;</span>
<span style="color: #00b000;">+    listed_count: 331</span>
<span style="color: #00b000;">+    profile_background_image_url_https: &quot;https://si0.twimg.com/profile_background_images/<span style="">671143407</span>/8544adf04bc<span style="color: #440088;">3823d306</span>c7f05efef2351.jpeg&quot;</span>
<span style="color: #00b000;">+    utc_offset: -18000</span>
<span style="color: #00b000;">+    statuses_count: 20090</span>
<span style="color: #00b000;">+    description: &quot;Internet technology strategist, Accessibility and assistive technologies. Views expressed/implied are my own. See my Twitter lists for more interests.&quot;</span>
<span style="color: #00b000;">+    friends_count: 784</span>
<span style="color: #00b000;">+    location: &quot;Boston, MA, USA&quot;</span>
<span style="color: #00b000;">+    profile_link_color: &quot;800326&quot;</span>
<span style="color: #00b000;">+    profile_image_url: &quot;http://a0.twimg.com/profile_images/638441870/Snapshot-of-sb_normal.jpg&quot;</span>
<span style="color: #00b000;">+    following: true</span>
<span style="color: #00b000;">+    geo_enabled: false</span>
<span style="color: #00b000;">+    profile_banner_url: &quot;https://si0.twimg.com/profile_banners/16010789/1348096060&quot;</span>
<span style="color: #00b000;">+    profile_background_image_url: &quot;http://a0.twimg.com/profile_background_images/<span style="">671143407</span>/8544adf04bc<span style="color: #440088;">3823d306</span>c7f05efef2351.jpeg&quot;</span>
<span style="color: #00b000;">+    lang: &quot;en&quot;</span>
<span style="color: #00b000;">+    profile_background_tile: true</span>
<span style="color: #00b000;">+    favourites_count: 3147</span>
<span style="color: #00b000;">+    notifications: null</span>
<span style="color: #00b000;">+    url: &quot;http://www.linkedin.com/in/sarahbourne&quot;</span>
<span style="color: #00b000;">+    created_at: &quot;Wed Aug 27 12:24:25 +0000 2008&quot;</span>
<span style="color: #00b000;">+    contributors_enabled: false</span>
<span style="color: #00b000;">+    time_zone: &quot;Eastern Time <span style="">&#40;</span>US &amp; Canada<span style="">&#41;</span>&quot;</span>
<span style="color: #00b000;">+    default_profile: false</span>
<span style="color: #00b000;">+    is_translator: false</span>
   <span style="">&#125;</span>
 <span style="">&#125;</span></pre></td></tr></table></div>
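<p>For readers without json-diff at hand, the key-level part of that comparison can be approximated in a few lines. This is only a rough, dependency-free sketch and not how json-diff itself works: the names isPlainObject and keyDiff are made up for illustration, the two sample objects are heavily trimmed versions of the records above, and changed values (such as geo going from an empty object to null) are deliberately not reported.</p>

```javascript
// Hypothetical helper: true for {} style objects, false for null and arrays.
function isPlainObject(value) {
  if (value === null || Array.isArray(value)) return false;
  return typeof value === "object";
}

// Hypothetical helper: list key paths present on only one side.
// Only keys are compared; differing values are ignored.
function keyDiff(archived, api, prefix) {
  var path = prefix || "";
  var result = { added: [], removed: [] };
  Object.keys(api).forEach(function (key) {
    var full = path + key;
    if (!(key in archived)) {
      result.added.push(full);
      return;
    }
    // Recurse only when both sides hold nested objects.
    if (isPlainObject(archived[key])) {
      if (isPlainObject(api[key])) {
        var nested = keyDiff(archived[key], api[key], full + ".");
        result.added = result.added.concat(nested.added);
        result.removed = result.removed.concat(nested.removed);
      }
    }
  });
  Object.keys(archived).forEach(function (key) {
    if (!(key in api)) result.removed.push(path + key);
  });
  return result;
}

// Trimmed stand-ins for the archived and API versions of the Tweet.
var archived = {
  entities: { media: [], hashtags: [], urls: [] },
  geo: {},
  user: { screen_name: "sarahebourne", verified: false }
};
var api = {
  favorited: false,
  entities: { hashtags: [], urls: [] },
  geo: null,
  user: { screen_name: "sarahebourne", verified: false, followers_count: 2367 }
};

var diff = keyDiff(archived, api);
console.log("added: " + diff.added.join(", "));     // added: favorited, user.followers_count
console.log("removed: " + diff.removed.join(", ")); // removed: entities.media
```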

<p>To be fair, some of the user profile information has been normalized in the archive out to a user_details.js file (perhaps to save space for the viewing application), which looks like:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #009900;">&#123;</span>
  <span style="color: #3366CC;">&quot;screen_name&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;sarahebourne&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;location&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Boston, MA, USA&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;full_name&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Sarah Bourne&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;bio&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Internet technology strategist, Accessibility and assistive technologies. Views expressed/implied are my own. See my Twitter lists for more interests.&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;id&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;16010789&quot;</span><span style="color: #339933;">,</span>
  <span style="color: #3366CC;">&quot;created_at&quot;</span> <span style="color: #339933;">:</span> <span style="color: #3366CC;">&quot;Wed Aug 27 12:24:25 +0000 2008&quot;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>Notably missing from this are the user&#8217;s homepage, their number of favourites, friends and followers, whether geo is enabled, etc.</p>
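<p>One way to see the gap is to fold the user details back onto the user object stored with each archived Tweet and then ask which API fields are still absent. The following is a minimal sketch under some assumptions: mergeProfile and missingFields are hypothetical names, the mapping of bio to description is inferred from the samples above rather than from any Twitter documentation, and the records are trimmed.</p>

```javascript
// Trimmed copies of the records shown above (bio shortened here).
var tweetUser = {
  name: "Sarah Bourne",
  screen_name: "sarahebourne",
  id: 16010789,
  verified: false
};
var userDetails = {
  screen_name: "sarahebourne",
  location: "Boston, MA, USA",
  full_name: "Sarah Bourne",
  bio: "Internet technology strategist",
  id: "16010789",
  created_at: "Wed Aug 27 12:24:25 +0000 2008"
};

// Hypothetical helper: copy the per-tweet user object and add the
// normalized details under the names the API uses (assumed mapping).
function mergeProfile(user, details) {
  var profile = {};
  Object.keys(user).forEach(function (key) {
    profile[key] = user[key];
  });
  profile.description = details.bio;
  profile.location = details.location;
  profile.created_at = details.created_at;
  return profile;
}

// Hypothetical helper: which of the given field names are not on the profile.
function missingFields(profile, fields) {
  return fields.filter(function (field) {
    return !(field in profile);
  });
}

// A handful of the user fields the API returned for this Tweet.
var apiFields = ["location", "created_at", "url", "followers_count",
                 "friends_count", "favourites_count", "geo_enabled"];

var profile = mergeProfile(tweetUser, userDetails);
console.log(missingFields(profile, apiFields).join(", "));
// url, followers_count, friends_count, favourites_count, geo_enabled
```

<p>Note that the id in user_details.js is a string while the id on the Tweet&#8217;s user object is a number, so any join between the two needs to compare them as strings.</p>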
<p>All these details aside, Twitter deserves a lot of credit for making the data available as CSV for ease of use, and also as JavaScript for programmatic use.</p>
<h2>The Code</h2>
<p>So the really, really neat thing about the archive is that it comes with a pure HTML, CSS and JavaScript application that you can open locally in your browser to view your archive. It looks pretty; for example, <a href="http://static.scripting.com/twitterArchives/bourne/index.html">here</a> is Sarah&#8217;s archive that Dave Winer mounted up on S3. It even has a keyword search across all your tweets, which takes a bit of time (it interactively loads all your tweet JavaScript files mentioned above), but it works. You can zip the data up, give it to someone else, and it all just works.</p>
<p>The archive uses some third-party libraries such as jQuery, Underscore, Twitter Bootstrap and Hogan, which all come minified and bundled statically in the archive. The application itself is called Grailbird and comes minified as well. Grailbird loads the static JavaScript (as needed) and displays it. The only network traffic I saw while it was running was fetching avatar images.</p>
<p>Assuming JavaScript backwards compatibility, and browser support for JavaScript, the Twitter archive&#8217;s contextual display for the underlying data could last a long, long time. At least that&#8217;s a possible interpretation based on David Rosenthal&#8217;s <a href="http://blog.dshr.org/2012/10/formats-through-time.html">hypothesis</a> about the Web&#8217;s effect on format obsolescence. I think it&#8217;s safe to say that this app written for the local Web platform is likely to last longer than a GUI application written in another language environment. The separation of code and data, and independence from a particular browser implementation are big wins. These are qualities that we all had to fight and work hard for on the Web, and I think it makes sense to re-purpose them here in an archival context.</p>
<p>I doubt anyone from Twitter has read this far, but if someone has, it would be great to see Grailbird show up with the other great stuff you have released to GitHub. I found myself wanting to quickly search across tweets looking for things, like geo-enabled tweets (to make sure that they are there). I could look at the minified Grailbird source in Chrome using developer tools, but that wasn&#8217;t enough for me to figure out how to dynamically load data. I resorted to using Node.js and eval-ing the JavaScript files&#8230;and was able to confirm that there is geo data in the archives if you have it enabled. Here&#8217;s the simplistic script I came up with:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="javascript" style="font-family:monospace;"><span style="color: #000066; font-weight: bold;">var</span> fs <span style="color: #339933;">=</span> require<span style="color: #009900;">&#40;</span><span style="color: #3366CC;">'fs'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000066; font-weight: bold;">var</span> Grailbird <span style="color: #339933;">=</span> <span style="color: #009900;">&#123;</span>data<span style="color: #339933;">:</span> <span style="color: #009900;">&#123;</span><span style="color: #009900;">&#125;</span><span style="color: #009900;">&#125;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #006600; font-style: italic;">// load all the tweet data</span>
eval<span style="color: #009900;">&#40;</span>fs.<span style="color: #660066;">readFileSync</span><span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;data/js/tweet_index.js&quot;</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;utf8&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #000066; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">var</span> i <span style="color: #339933;">=</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">;</span> i <span style="color: #339933;">&lt;</span> tweet_index.<span style="color: #660066;">length</span><span style="color: #339933;">;</span> i<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  eval<span style="color: #009900;">&#40;</span>fs.<span style="color: #660066;">readFileSync</span><span style="color: #009900;">&#40;</span>tweet_index<span style="color: #009900;">&#91;</span>i<span style="color: #009900;">&#93;</span>.<span style="color: #660066;">file_name</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;utf8&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #006600; font-style: italic;">// look at each tweet and print out the date and geolocation if it's there</span>
<span style="color: #000066; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">var</span> slice <span style="color: #000066; font-weight: bold;">in</span> Grailbird.<span style="color: #660066;">data</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
  <span style="color: #000066; font-weight: bold;">for</span> <span style="color: #009900;">&#40;</span><span style="color: #000066; font-weight: bold;">var</span> j <span style="color: #339933;">=</span> <span style="color: #CC0000;">0</span><span style="color: #339933;">;</span> j <span style="color: #339933;">&lt;</span> Grailbird.<span style="color: #660066;">data</span><span style="color: #009900;">&#91;</span>slice<span style="color: #009900;">&#93;</span>.<span style="color: #660066;">length</span><span style="color: #339933;">;</span> j<span style="color: #339933;">++</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#123;</span>
    <span style="color: #000066; font-weight: bold;">var</span> tweet <span style="color: #339933;">=</span> Grailbird.<span style="color: #660066;">data</span><span style="color: #009900;">&#91;</span>slice<span style="color: #009900;">&#93;</span><span style="color: #009900;">&#91;</span>j<span style="color: #009900;">&#93;</span><span style="color: #339933;">;</span>
    <span style="color: #000066; font-weight: bold;">if</span> <span style="color: #009900;">&#40;</span>tweet.<span style="color: #660066;">geo</span>.<span style="color: #660066;">coordinates</span><span style="color: #009900;">&#41;</span> console.<span style="color: #660066;">log</span><span style="color: #009900;">&#40;</span>tweet.<span style="color: #660066;">created_at</span><span style="color: #339933;">,</span> <span style="color: #3366CC;">&quot;,&quot;</span><span style="color: #339933;">,</span> tweet.<span style="color: #660066;">geo</span>.<span style="color: #660066;">coordinates</span>.<span style="color: #660066;">join</span><span style="color: #009900;">&#40;</span><span style="color: #3366CC;">&quot;,&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  <span style="color: #009900;">&#125;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>And here is the output for <a href="https://twitter.com/adactio">Jeremy Keith&#8217;s</a> archive:</p>
<pre>
% node geo.js
Fri Nov 30 13:08:33 +0000 2012,50.8262027605,-0.138112306595
Sat Nov 17 12:09:18 +0000 2012,54.6000387923,-5.9254288673
Fri Nov 16 22:32:03 +0000 2012,54.5925614526,-5.930852294
Thu Nov 15 13:35:35 +0000 2012,54.595909,-5.922033
Sat Nov 10 12:59:37 +0000 2012,50.825832,-0.142381
Fri Nov 09 13:54:51 +0000 2012,50.8262027605,-0.1381123066
Wed Nov 07 18:07:24 +0000 2012,50.825977,-0.138339
Tue Nov 06 16:58:49 +0000 2012,50.8378257671,-1.1800042739
Tue Oct 30 11:19:53 +0000 2012,50.8262027605,-0.1381123066
Thu Oct 18 17:51:22 +0000 2012,43.0733634985,-89.38608062
Tue Oct 16 17:29:20 +0000 2012,43.0872606735,-89.3659955263
Tue Oct 09 18:11:20 +0000 2012,40.7406891129,-74.0076184273
Sun Oct 07 14:27:50 +0000 2012,50.82906975,-0.126056
Sat Oct 06 16:29:30 +0000 2012,50.825832,-0.142381
Thu Oct 04 16:46:56 +0000 2012,50.8262027605,-0.1381123066
Tue Oct 02 17:46:42 +0000 2012,50.826646,-0.136921
Mon Oct 01 10:46:04 +0000 2012,50.8262027605,-0.1381123066
Mon Oct 01 10:43:46 +0000 2012,50.8262027605,-0.1381123066
Mon Oct 01 09:38:01 +0000 2012,50.8236703111,-0.1387184062
Mon Oct 01 08:53:15 +0000 2012,50.8236703111,-0.1387184062
Thu Sep 27 13:05:16 +0000 2012,59.915652,10.749959
Sun Sep 23 12:54:16 +0000 2012,50.8281663943,-0.128531456
Sat Sep 22 13:44:09 +0000 2012,50.87447886,0.017625
Thu Sep 20 13:16:11 +0000 2012,50.8262027605,-0.1381123066
Thu Sep 20 09:27:55 +0000 2012,50.8262027605,-0.1381123066
Mon Sep 17 07:51:20 +0000 2012,47.9952739036,7.8525775405
Sun Sep 16 09:01:28 +0000 2012,51.1599172667,-0.1787844393
Thu Sep 13 12:40:26 +0000 2012,50.822951,-0.136905
Tue Sep 11 18:41:47 +0000 2012,50.822746,-0.142274
Tue Sep 11 17:19:38 +0000 2012,50.822219,-0.140802
Tue Sep 11 13:05:59 +0000 2012,50.8262027605,-0.1381123066
Tue Sep 11 13:03:35 +0000 2012,50.8262027605,-0.1381123066
Tue Sep 11 12:48:51 +0000 2012,50.8262027605,-0.1381123066
Tue Sep 11 12:06:36 +0000 2012,50.8262027605,-0.1381123066
Tue Sep 11 08:23:00 +0000 2012,50.8262027605,-0.1381123066
Sun Sep 09 19:10:21 +0000 2012,50.826646,-0.136921
Tue Sep 04 17:33:44 +0000 2012,50.826646,-0.136921
Tue Sep 04 12:57:16 +0000 2012,50.822951,-0.136905
Mon Sep 03 16:03:37 +0000 2012,50.8262027605,-0.1381123066
Mon Sep 03 15:26:41 +0000 2012,50.8262027605,-0.1381123066
Sun Sep 02 19:40:38 +0000 2012,50.8229428584,-0.1390289018
Sun Sep 02 19:24:45 +0000 2012,50.8229428584,-0.1390289018
Sun Sep 02 19:08:55 +0000 2012,50.825977,-0.138339
Sun Sep 02 18:25:08 +0000 2012,50.825449,-0.137123
Sun Sep 02 17:04:15 +0000 2012,50.825449,-0.137123
Sun Sep 02 15:34:31 +0000 2012,50.8229428584,-0.1390289018
Fri Aug 31 17:33:20 +0000 2012,50.8291396274,-0.133923449
Fri Aug 31 09:20:04 +0000 2012,50.8311581116,-0.1335176435
Tue Aug 28 20:44:32 +0000 2012,41.8844650304,-87.6257600109
Mon Aug 27 13:57:24 +0000 2012,41.8844650304,-87.6257600109
Sat Aug 25 18:45:51 +0000 2012,41.8851594291,-87.6232355833
Wed Aug 22 12:32:45 +0000 2012,50.824415,-0.134691
Tue Aug 21 11:39:46 +0000 2012,50.8262027605,-0.1381123066
Mon Aug 20 11:01:28 +0000 2012,51.535132,-0.069309
Fri Aug 17 12:03:40 +0000 2012,50.8262027605,-0.1381123066
Sat Aug 11 16:08:13 +0000 2012,50.826646,-0.136921
Fri Aug 10 14:25:15 +0000 2012,50.8262027605,-0.1381123066
Wed Aug 08 11:51:45 +0000 2012,50.8262027605,-0.1381123066
Tue Aug 07 15:45:49 +0000 2012,50.8262027605,-0.1381123066
Fri Aug 03 16:38:55 +0000 2012,50.8262027605,-0.1381123066
Fri Aug 03 14:33:04 +0000 2012,50.8262027605,-0.1381123066
Sat Jul 28 14:57:52 +0000 2012,50.825449,-0.137123
Sat Jul 28 12:09:01 +0000 2012,50.828404,-0.137435
Thu Jul 26 17:17:22 +0000 2012,50.8266230357,-0.1367429505
Tue Jul 24 15:07:39 +0000 2012,50.8262027605,-0.1381123066
Mon Jul 23 12:25:35 +0000 2012,50.823104,-0.139515
Sat Jul 21 12:46:25 +0000 2012,50.827943,-0.136033
Fri Jul 20 13:21:41 +0000 2012,50.8262027605,-0.1381123066
Mon Jul 16 19:28:01 +0000 2012,50.825449,-0.137123
Sun Jul 15 10:48:44 +0000 2012,51.4714930776,-0.4883337021
Sat Jul 14 23:08:27 +0000 2012,41.974037,-87.890239
Tue Jul 10 13:44:08 +0000 2012,30.2655234842,-97.7385378752
Mon Jul 09 19:32:48 +0000 2012,30.2655234842,-97.7385378752
Mon Jul 09 14:40:21 +0000 2012,30.2656095537,-97.7385592461
Sat Jul 07 15:08:12 +0000 2012,51.4726745412,-0.4817537462
Fri Jun 29 10:55:03 +0000 2012,50.8262027605,-0.1381123066
Wed Jun 20 10:23:29 +0000 2012,51.488197,-0.120692
Mon Jun 18 12:12:01 +0000 2012,50.8262027605,-0.1381123066
Mon Jun 18 12:02:43 +0000 2012,50.8262027605,-0.1381123066
Sat Jun 16 15:51:15 +0000 2012,50.8244773427,-0.1387893509
Sat Jun 16 15:10:29 +0000 2012,50.827972412,-0.136271402
Fri Jun 15 22:15:44 +0000 2012,50.947306,0.090209
Fri Jun 15 12:58:27 +0000 2012,50.947306,0.090209
Wed Jun 13 12:12:49 +0000 2012,50.822951,-0.136905
Mon Jun 11 14:05:50 +0000 2012,50.825977,-0.138339
Wed Jun 06 16:31:48 +0000 2012,51.50361668,-0.683839
Wed Jun 06 15:38:45 +0000 2012,51.50361668,-0.683839
Sat Jun 02 15:40:48 +0000 2012,50.825449,-0.137123
Fri Jun 01 13:29:40 +0000 2012,50.8262027605,-0.1381123066
Thu May 31 16:37:18 +0000 2012,50.8262027605,-0.1381123066
Wed May 30 14:58:46 +0000 2012,50.8262027605,-0.1381123066
Wed May 30 12:45:33 +0000 2012,50.8262027605,-0.1381123066
Wed May 30 12:32:27 +0000 2012,50.8262027605,-0.1381123066
Tue May 29 12:12:15 +0000 2012,50.8242644595,-0.1329624653
Tue May 29 08:12:24 +0000 2012,50.8307708894,-0.1330473622
Sun May 27 21:06:57 +0000 2012,47.5608179303,-52.70936785
Mon May 21 19:15:05 +0000 2012,50.824975,3.26387
Mon May 21 13:56:02 +0000 2012,51.0541040608,3.7238935404
Mon May 21 12:19:17 +0000 2012,51.055163,3.720835
Sat May 19 15:52:22 +0000 2012,50.821309,-0.1434404
Sat May 19 14:19:38 +0000 2012,50.822215,-0.154896
Sun May 13 14:08:33 +0000 2012,50.8244462443,-0.139321602
Sun May 13 13:29:30 +0000 2012,50.8192217888,-0.1411056519
Sat May 12 19:32:13 +0000 2012,50.820359,-0.14243
Sat May 12 17:51:57 +0000 2012,50.822623,-0.142676
Fri May 11 09:22:05 +0000 2012,52.366239,4.894655
Tue May 08 12:39:36 +0000 2012,50.8287188784,-0.1423922896
Sun May 06 20:38:27 +0000 2012,50.871762,0.011501
Fri May 04 14:35:37 +0000 2012,50.8262027605,-0.1381123066
Thu May 03 16:03:52 +0000 2012,50.8262027605,-0.1381123066
Thu May 03 12:05:08 +0000 2012,50.8242644595,-0.1329624653
Wed May 02 12:43:38 +0000 2012,50.8262027605,-0.1381123066
Tue May 01 14:50:47 +0000 2012,50.8244094849,-0.1399479955
Tue May 01 13:17:36 +0000 2012,50.8262027605,-0.1381123066
Tue May 01 12:01:59 +0000 2012,50.826779,-0.138462
Tue May 01 11:22:41 +0000 2012,50.8262027605,-0.1381123066
Mon Apr 30 15:58:14 +0000 2012,50.8262027605,-0.1381123066
Fri Apr 27 17:26:19 +0000 2012,50.825449,-0.137123
Thu Apr 26 12:44:54 +0000 2012,50.8262027605,-0.1381123066
Tue Apr 24 11:30:25 +0000 2012,50.8262027605,-0.1381123066
Sat Apr 21 14:37:59 +0000 2012,50.8244773427,-0.1387893509
Wed Apr 18 11:05:28 +0000 2012,51.514461,-0.15415
Tue Apr 17 11:38:39 +0000 2012,50.8262027605,-0.1381123066
Mon Apr 16 17:28:09 +0000 2012,50.825449,-0.137123
Fri Apr 13 17:35:30 +0000 2012,50.825449,-0.137123
Fri Apr 13 11:39:01 +0000 2012,50.8262027605,-0.1381123066
Thu Apr 12 20:59:46 +0000 2012,50.8284865994,-0.1406764984
Thu Apr 12 20:43:24 +0000 2012,50.8284865994,-0.1406764984
Thu Apr 12 12:38:06 +0000 2012,50.8262027605,-0.1381123066
Wed Apr 04 17:35:46 +0000 2012,50.829236,-0.130433
Wed Apr 04 11:20:06 +0000 2012,50.8262027605,-0.1381123066
Wed Mar 28 19:51:57 +0000 2012,50.82533,-0.1371919
Wed Mar 28 17:41:06 +0000 2012,50.8266230357,-0.1367429505
Sat Mar 24 15:24:22 +0000 2012,50.82578,-0.139591
Sat Mar 24 14:42:14 +0000 2012,50.8244773427,-0.1387893509
Thu Mar 22 20:33:36 +0000 2012,50.821049,-0.140416
Thu Mar 15 16:00:20 +0000 2012,32.8975517297,-97.0442533493
Wed Mar 14 15:41:13 +0000 2012,30.265426,-97.740498
Tue Mar 13 19:52:43 +0000 2012,30.2647199679,-97.7443528175
Tue Mar 13 16:29:12 +0000 2012,30.2653850259,-97.7383099888
Mon Mar 12 02:03:53 +0000 2012,30.2669212002,-97.745683415
Sun Mar 11 17:45:31 +0000 2012,30.2626071693,-97.739803791
Sun Mar 11 15:18:53 +0000 2012,30.2647199679,-97.7443528175
Fri Mar 09 15:11:51 +0000 2012,30.2671521557,-97.7396624407
Mon Mar 05 10:56:37 +0000 2012,50.8262027605,-0.1381123066
Thu Mar 01 09:55:16 +0000 2012,50.8304057758,-0.1329698575
Wed Feb 22 23:56:59 +0000 2012,-33.8782765912,151.221249511
Wed Feb 22 02:00:43 +0000 2012,-41.328228677,174.809947014
Thu Feb 16 01:13:27 +0000 2012,-41.2890508786,174.777774995
Wed Feb 15 21:39:06 +0000 2012,-41.2893031956,174.777374268
Wed Feb 15 18:50:42 +0000 2012,-41.2893031956,174.777374268
Wed Feb 15 02:10:18 +0000 2012,-41.29336192,174.776485
Mon Feb 13 04:07:07 +0000 2012,-41.2893031956,174.777374268
Mon Feb 13 03:36:49 +0000 2012,-41.2924914456,174.776140451
Mon Feb 13 03:00:13 +0000 2012,-41.293314,174.776395
Mon Feb 13 02:40:18 +0000 2012,-41.2934345895,174.775958061
Mon Feb 13 01:22:04 +0000 2012,-41.2939726591,174.775840044
Sat Feb 11 23:39:04 +0000 2012,-36.405247,174.65600431
Sat Feb 11 07:32:16 +0000 2012,-36.405247,174.65600431
Sat Feb 11 06:49:42 +0000 2012,-36.405247,174.65600431
Wed Feb 08 23:20:25 +0000 2012,-33.878302,151.221256
Sat Feb 04 11:14:52 +0000 2012,50.828205,-0.1378011703
Thu Feb 02 13:41:42 +0000 2012,50.8262027605,-0.1381123066
Wed Feb 01 16:57:16 +0000 2012,50.8262027605,-0.1381123066
Sat Jan 28 16:57:35 +0000 2012,50.827062,-0.135349
Sat Jan 28 15:55:49 +0000 2012,50.828295,-0.138769
Thu Jan 26 12:42:08 +0000 2012,50.8262027605,-0.1381123066
Mon Jan 23 12:34:45 +0000 2012,50.822219,-0.140802
Sun Jan 22 15:18:32 +0000 2012,50.825832,-0.142381
Sat Jan 21 14:27:51 +0000 2012,50.8213,-0.1409
Fri Jan 20 12:45:34 +0000 2012,51.9479484763,-0.5020558834
Thu Jan 19 20:49:09 +0000 2012,52.9556027724,-1.1504852772
Thu Jan 19 12:38:47 +0000 2012,52.954584773,-1.1563324928
Wed Jan 18 16:42:24 +0000 2012,52.954584773,-1.1563324928
Wed Jan 18 16:39:09 +0000 2012,52.954584773,-1.1563324928
Tue Jan 17 15:00:09 +0000 2012,50.8262027605,-0.1381123066
Mon Jan 16 10:03:12 +0000 2012,50.8303548561,-0.1329055827
Sat Jan 14 16:11:55 +0000 2012,50.824838842,-0.1516896486
Wed Jan 11 21:07:19 +0000 2012,51.522789913,-0.0784921646
Wed Jan 11 19:27:24 +0000 2012,51.5237223711,-0.0770612686
Sat Jan 07 14:49:09 +0000 2012,50.824424,-0.138875
...
Fri Apr 09 01:52:12 +0000 2010,47.4412234282,-122.3010026978
Fri Apr 09 00:00:15 +0000 2010,47.4432422071,-122.3010595342
Thu Apr 08 01:29:11 +0000 2010,47.6873506139,-122.3341637453
Wed Apr 07 00:16:03 +0000 2010,47.6109922102,-122.3480262842
Sun Apr 04 18:47:33 +0000 2010,47.7083958758,-122.3272574643
Sat Apr 03 18:06:54 +0000 2010,47.6687063559,-122.3942997359
Sat Apr 03 18:05:00 +0000 2010,47.6687063559,-122.3942997359
</pre>
<p>I guess it's kind of scary that you can do this, and is perhaps why Twitter doesn't let you export anyone's account, even if it is public. But returning to the issue of Grailbird being on GitHub, I imagine there would be people who would write code that uses Grailbird as an API to the archive data, to provide extensions that would display a map of where you've been over time, for example, or an analysis of your friendship network, or a view on hashtags you've used, events you've been at, etc.</p>
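<p>A rough sketch of where such a map extension might start: turning the tweets into GeoJSON. This is only an illustration under some assumptions. <code>tweetsToGeoJSON</code> and the <code>sample</code> array are hypothetical names, not part of Grailbird, and the code assumes (as the output above suggests) that <code>tweet.geo.coordinates</code> holds latitude first, then longitude.</p>

```javascript
// Hypothetical helper (not part of Grailbird): build a GeoJSON
// FeatureCollection from an array of tweet objects.
// Assumption: tweet.geo.coordinates, when present, is [latitude, longitude].
function tweetsToGeoJSON(tweets) {
  var features = tweets.filter(function (tweet) {
    // keep only tweets that carry a coordinate pair
    return tweet.geo ? Array.isArray(tweet.geo.coordinates) : false;
  }).map(function (tweet) {
    var lat = tweet.geo.coordinates[0];
    var lon = tweet.geo.coordinates[1];
    return {
      type: "Feature",
      // GeoJSON positions are [longitude, latitude], so the pair is swapped
      geometry: {type: "Point", coordinates: [lon, lat]},
      properties: {created_at: tweet.created_at}
    };
  });
  return {type: "FeatureCollection", features: features};
}

// Sample input: the first tweet is taken from the output above, the second
// is made up to show a tweet without geo data being skipped.
var sample = [
  {
    created_at: "Fri Nov 30 13:08:33 +0000 2012",
    geo: {coordinates: [50.8262027605, -0.138112306595]}
  },
  {
    created_at: "Thu Nov 29 10:00:00 +0000 2012",
    geo: {}
  }
];

console.log(JSON.stringify(tweetsToGeoJSON(sample), null, 2));
```

<p>In the Node script above, each slice in <code>Grailbird.data</code> could be passed to a function like this, and the resulting FeatureCollection handed to whatever mapping library you prefer.</p>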
<p>I think from an archival perspective, it would be really useful to be able to receive something like a Tweet archive from a donor, and overlay functionality on top of it. The model of using the Web as a local application platform for this sort of archival content seems like it could be a growth area.</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/12/31/archiving-tweets/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Inside Out Libraries</title>
		<link>http://inkdroid.org/journal/2012/12/18/inside-out-libraries/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=inside-out-libraries</link>
		<comments>http://inkdroid.org/journal/2012/12/18/inside-out-libraries/#comments</comments>
		<pubDate>Wed, 19 Dec 2012 05:30:17 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[libraries]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[archives]]></category>
		<category><![CDATA[ebooks]]></category>
		<category><![CDATA[elag]]></category>
		<category><![CDATA[ifla]]></category>
		<category><![CDATA[publishing]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5330</guid>
		<description><![CDATA[Peter Brantley tells a sad tale about where public library leadership is at, as we plunge headlong into the ebook future, that has been talked about for what seems like forever, and which is now upon us. It&#8217;s not pretty. The general consensus among participants was that public libraries have two, maybe three years to [...]]]></description>
				<content:encoded><![CDATA[<p>Peter Brantley tells a <a href="http://www.publishersweekly.com/pw/by-topic/industry-news/libraries/article/55131-you-have-two-maybe-three-years.html">sad tale</a> about where public library leadership is at, as we plunge headlong into the ebook future, that has been talked about for what seems like forever, and which is now upon us. It&#8217;s not pretty.</p>
<blockquote><p>
The general consensus among participants was that public libraries have two, maybe three years to establish their relevance in the digital realm, or risk fading from the central place they have long occupied in the world’s literary culture.
</p></blockquote>
<p>The fact that a bunch of big-wigs invited by IFLA were seemingly unable to find inspiration and reason to hope that public libraries will continue to exist is not surprising in the least, I guess. I&#8217;m not sure that libraries were ever the center of the world&#8217;s literary culture. But for the sake of argument let&#8217;s assume they were, and that now they&#8217;re increasingly not. Let us also assume that the economic landscape around ebooks is in incredible turmoil, and that there will continue to be sea changes in technologies, and in people&#8217;s use of them, for the foreseeable future.</p>
<p>What can libraries do to stay relevant? I think part of the answer is: stop being libraries&#8230;well, sorta.</p>
<h2>The HyperLocal</h2>
<blockquote><p>
The most serious threat facing libraries does not come from publishers, we argued, but from e-book and digital media retailers like Amazon, Apple, and Google. While some IFLA staff protested that libraries are not in the business of competing with such companies, the library representatives stressed that they are. <strong>If public libraries can’t be better than Google or Amazon at something, then libraries will lose their relevance.</strong>
</p></blockquote>
<p>In my mind the thing that libraries have to offer, which these big corporations cannot, is authentic, local context for information about a community&#8217;s past, present and future. But in the past century or so libraries have focused on collecting mass produced objects, and sharing data about said objects. The mission of collecting hyper-local information has typically been a side task, that has fallen to special collections and archives. If I were invited to that IFLA meeting I would&#8217;ve said that libraries need to shift their orientation to caring more about the practices of archives and manuscript collections, by collecting unique, valued, at risk local materials, and adapting collection development and descriptive practices to the realities of more and more of this information being available as data. </p>
<p>As Mark Matienzo <a href="https://twitter.com/anarchivist/status/281460231769313280">indicated</a> (somewhat indirectly on Twitter) after I published this blog post, a lot of this work involves focusing less on hoarding items like books, and focusing more on the functions, services, and actions that public libraries want to document and engage with in their communities. Traditionally this orientation has been an area of strength for archivists in their practice and theory of <a href="https://en.wikipedia.org/wiki/Archival_appraisal">appraisal</a>, where:</p>
<blockquote cite="https://en.wikipedia.org/wiki/Archival_appraisal"><p>
&#8230; considerations &#8230; include how to meet the record-granting body’s organizational needs, how to uphold requirements of organizational accountability (be they legal, institutional, or determined by archival ethics), and how to meet the expectations of the record-using community. <a href="https://en.wikipedia.org/wiki/Archival_appraisal">Wikipedia</a>
</p></blockquote>
<p>I think this represents a pretty significant cognitive shift for library professionals, and would in fact take some doing. But perhaps that&#8217;s just because my exposure to archival theory in &#8220;library school&#8221; was pretty pathetic. Be that as it may, here are some practical examples of growth areas for public libraries that I wish had come up at the IFLA meeting.</p>
<h2>Web Archiving</h2>
<p>The Internet Archive and national libraries that are part of the International Internet Preservation Consortium don&#8217;t have the time, resources and often the mandate to collect web content that is of interest at the local level. What if the tooling and expertise existed for public libraries to perform some of this work, and to have the results fed into larger aggregations of web archives?</p>
<h2>Municipality Reports and Data</h2>
<p>Increasing amounts of data are being collected as part of the daily working of our local governments. What if your public library had the resources to be a repository for this data? Yeah, I said the R word. But I&#8217;m not suggesting that public libraries get the expertise to set up Fedora instances with Hydra heads, or something. I&#8217;m thinking about approaches to allowing data to easily flow into an organization, where it is backed up, and made available in a clearinghouse manner similar to <a href="https://public.resource.org/">public.resource.org</a> on the Web, for search engines to pick up. Perhaps even services like <a href="http://jasongriffey.net/librarybox/">LibraryBox</a> offer another lens to look at the opportunities that lie in this area.</p>
<h2>Born Digital Manuscript Collections</h2>
<p>Public libraries should be aggressively collecting the &#8220;papers&#8221; of local people who have made significant contributions to their communities. Increasingly, these aren&#8217;t paper at all, but are born digital content. For example: email correspondence, document archives, digital photograph collections. I think that librarians and archivists know, in theory, that this born digital content is out there, but the reality is it&#8217;s not flowing into the public library/archive. How can we change this? Efforts such as <a href="http://www.personalarchiving.com/">Personal Digital Archiving</a> are important for two reasons: they help set up the right conditions for born digital collections to be donated, and they also make professionals think about how they would like to receive materials so that they are easier to process. Think more things like <a href="http://born-digital-archives.blogspot.com/">AIMS</a>, training and tooling for both professionals and citizens.</p>
<h2>Licensing</h2>
<p>It&#8217;s not unusual for archives and special collections to have all sorts of donor gift agreements that place restrictions on how their donated materials can be used. To some extent needing to visit the collection, request it, and not being able to leave the room with it, has mitigated some of this special-snowflakism. But when things are online things change a bit. We need to normalize these agreements so that content can flow online, and be used online in clearer ways. What if we got donors to think about Creative Commons licenses when they donated materials? How can we make sure donated material can become a usable part of the Web?</p>
<h2>Persistence</h2>
<p>We all know that things come and go on the Web. But it doesn&#8217;t need to be that way for everything on the Web. Libraries and archives have an opportunity to show how focusing on being a clearinghouse for data assets can allow for things to live persistently on the Web. Thinking about our URLs as identifiers for things we are taking care of is important. Practical strategies for achieving that are possible, and repeatable. What if public libraries were safe harbors for local content on the World Wide Web? This might sound hard to do, but I think it&#8217;s not as hard as people think.</p>
<h2>Metrics</h2>
<p>As libraries/archives make more local content available publicly on the Web it becomes important to track how this content is accessed and used online. Web analytics tools (such as Google Analytics) are quick wins for seeing what is being accessed and from where. Seeing how content is cited in social media applications like Facebook, Twitter, Pinterest and Wikipedia is important for reporting on the value of online collections. But encouraging professionals to use this information to become part of the conversations is equally important. Good metrics are also essential for collection development purposes, seeing what content is of interest, and what is not.</p>
<h2>Inside Out Libraries</h2>
<p>So, no, I don&#8217;t think public libraries need a new open source Overdrive. The ebook market will likely continue to take care of itself. I also am not really convinced we need some overarching organization like the Digital Public Library of America to serve as a single point of failure when the funding runs dry. We need distributed strategies for documenting our local communities, so that this information can take its rightful place on the Web, and be picked up by Google so that people can find it when they are on the other side of the world. Things will definitely keep changing, but I think libraries and archives need to invest in the Web as an enduring delivery platform for information.</p>
<p>I&#8217;ve never been before but I was so excited to read the <a href="http://elag2013.org/program/call-for-papers/">call</a> for the European Library Automation Group (ELAG) this year.</p>
<blockquote><p>
The theme of this year’s conference is ‘The INSIDE-OUT Library’. This theme was chosen at last year’s conference, because we concluded:</p>
<ul>
<li>Libraries have been focusing on bringing the world to their users. Now information is globally available.</li>
<li>Libraries have been producing metadata for the same publications in parallel. Now they are faced with deduplicating redundancy.</li>
<li>Libraries have been selecting things for their users. Now the users select things themselves.</li>
<li>Libraries have been supporting users by indexing things locally. Now everything is being indexed in global, shared indexes.</li>
</ul>
<p>Instead of being an OUTSIDE-IN library, libraries should try and stay relevant by shifting their paradigm 180 degrees. Instead of only helping users to find what is available globally, they should also focus on making local collections and production available to the world. Instead of doing the same thing everywhere, libraries should focus on making unique information accessible. Instead of focusing on information trapped in publications, libraries should try and give the world new views on knowledge.
</p></blockquote>
<p>This blog post is really just a somewhat shabby rephrasing of that call. Maybe IFLA could use some of the folks on the ELAG program committee at their next meeting about the future of public libraries? Hopefully 2013 will be a year I can make it to ELAG.</p>
<p>I expect public libraries will continue to exist, but there isn&#8217;t going to be some magical technical solution to their problems. Their future will be forged by each local relationship they make, which leads to them better documenting their place on the Web. We may not call these places public libraries at first, but that&#8217;s what they will be.</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/12/18/inside-out-libraries/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>linkrot: use your illusion</title>
		<link>http://inkdroid.org/journal/2012/11/01/linkrot-use-your-illusio/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=linkrot-use-your-illusio</link>
		<comments>http://inkdroid.org/journal/2012/11/01/linkrot-use-your-illusio/#comments</comments>
		<pubDate>Thu, 01 Nov 2012 15:01:04 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[archives]]></category>
		<category><![CDATA[libraries]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[linkrot]]></category>
		<category><![CDATA[terms of service]]></category>
		<category><![CDATA[url]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5318</guid>
		<description><![CDATA[Mike Giarlo wrote a bit last week about the issues of citing datasets on the Web with Digital Object Identifiers (DOI). It&#8217;s a really nice, concise characterization of why libraries and publishers have promoted and used the DOI, and indirect identifiers more generally. Mike defines indirect identifiers as &#8230; identifiers that point at and resolve [...]]]></description>
				<content:encoded><![CDATA[<p>Mike Giarlo <a href="http://www.personal.psu.edu/mjg36/blogs/2012/10/understanding-eg-dois-for-data-sets.html">wrote a bit</a> last week about the issues of citing datasets on the Web with <a href="https://en.wikipedia.org/wiki/Digital_object_identifier">Digital Object Identifiers (DOI)</a>. It&#8217;s a really nice, concise characterization of why libraries and publishers have promoted and used the DOI, and <em>indirect identifiers</em> more generally. Mike defines indirect identifiers as</p>
<blockquote><p>
&#8230; identifiers that point at and resolve to other identifiers.
</p></blockquote>
<p>I might be reading between the lines a bit, but I think Mike is specifically talking about any identifier that has some documented or ad-hoc mechanism for turning it into a Web identifier, or <a href="https://en.wikipedia.org/wiki/Uniform_Resource_Locator">URL</a>. A quick look at the Wikipedia <a href="https://en.wikipedia.org/wiki/Category:Identifiers">identifier</a> category yields lots of these, many of which (but not all) can be expressed as a <a href="https://en.wikipedia.org/wiki/Uniform_Resource_Identifier">URI</a>.</p>
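<p>To make that concrete, here is a minimal sketch of such a mechanism: a lookup table of resolver prefixes that turns an indirect identifier into a URL. The <code>resolvers</code> table and the <code>toURL</code> function are hypothetical names used only for illustration, the prefixes are assumptions rather than an authoritative registry, and real schemes often need more care (character escaping, for example) than simple concatenation.</p>

```javascript
// Hypothetical sketch: resolve an indirect identifier to a URL by
// prepending a resolver prefix. The prefixes below are assumptions made
// for illustration, not an authoritative registry of resolvers.
var resolvers = {
  doi: "https://doi.org/",
  arxiv: "http://arxiv.org/abs/"
};

function toURL(scheme, id) {
  // refuse schemes we have no resolver for instead of guessing
  if (!Object.prototype.hasOwnProperty.call(resolvers, scheme)) {
    throw new Error("no known resolver for scheme: " + scheme);
  }
  // naive concatenation; a real implementation would escape the id
  return resolvers[scheme] + id;
}

console.log(toURL("arxiv", "1105.3459"));  // http://arxiv.org/abs/1105.3459
console.log(toURL("doi", "10.1000/182"));  // https://doi.org/10.1000/182
```

<p>The table is where the indirection lives: every identifier minted under a scheme depends on that one prefix continuing to resolve.</p>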
<p>The reason why I liked Mike&#8217;s post so much is that he was able to neatly summarize the psychology that drives the use of indirect identifier technologies:</p>
<blockquote><p>
&#8230; cultural heritage organizations and publishers have done a pretty poor job of persisting their identifiers so far, partly because they didn’t grok the commitment they were undertaking, or because they weren’t deliberate about crafting sustainable URIs from the outset, or because they selected software with brittle URIs, or because they fell flat on some area of sustainability planning (financial, technical, or otherwise), and so because you can’t trust these organizations or their software with your identifiers, you should use this other infrastructure for minting and managing quote persistent unquote identifiers
</p></blockquote>
<p>Mike goes on to get to the heart of the problem, which is that indirect identifier technologies don&#8217;t solve the problem of broken links on the Web, they just push it elsewhere. The real problem of maintaining the indirect identifier when the actual URL changes becomes <em>someone else&#8217;s problem</em>. Out of sight, out of mind &#8230; except it&#8217;s not really out of sight right? Unless you don&#8217;t really care about the content you are putting online. </p>
<p>We all know that <a href="https://en.wikipedia.org/wiki/Link_rot">linkrot</a> on the Web is <a href="http://arxiv.org/abs/1105.3459">a real thing</a>. I would be putting my head in the sand if I were to say it wasn&#8217;t. But I would also be putting my head in the sand if I said that things don&#8217;t <a href="http://www.washingtonpost.com/wp-dyn/content/article/2007/10/23/AR2007102301784.html">go missing</a> from our brick and mortar libraries. But still, we should be able to do better than 1/2 the URLs in <a href="http://arxiv.org">arXiv</a> going dead right? I make a living as a web developer, I&#8217;m an occasional advocate for linked data, and I&#8217;m a big fan of the <a href="http://www.w3.org/2001/tag/doc/URNsAndRegistries-50">work</a> Henry Thompson and David Orchard did for the W3C analyzing the use of alternate identifier schemes on the Web&#8230;so, admittedly, I&#8217;m a bit of a zealot when it comes to promoting URLs as identifiers, and taking the Web seriously as an information space.</p>
<p>Mike&#8217;s post actually kicked off what I thought was a useful Twitter <a href="https://twitter.com/mjgiarlo/status/262373950447837184">conversation</a> (yes they can happen), which left me contemplating the future of libraries and archives on (or in) the Web. Specifically, it got me thinking that perhaps libraries and archives of the not too distant future will be places that take special care in how they put content on the Web, so that it can be accessed over time, just like a traditional physical library or archive. The places where links and the content they reference are less likely to go dead will be the new libraries and archives. These may not be the same institutions we call libraries today. Just like today&#8217;s libraries, these new libraries may not necessarily be free to access. You may need to be part of some community to access them, or to pay some sort of subscription fee. But some of them, and I hope most, will be public assets.</p>
<p>So how to make this happen? What will it look like? Rather than advocating a particular identifier technology I think these new libraries need to think seriously about providing <a href="https://en.wikipedia.org/wiki/Terms_of_service">Terms of Service</a> documents for their content services. I think these library ToS documents will do a few things.</p>
<ul>
<li>They will require the library to think seriously about the service they are providing. This will involve meetings, more meetings, power lunches, and likely lawyers. The outcome will be an organizational understanding of what the library is putting on the Web, and the commitment they are entering into with their users. It won&#8217;t simply be a matter of a web development team deciding to put up some new website&#8230;or take one down. This will likely be hard, but I think it&#8217;s getting easier all the time, as the importance of the Web as a publishing platform becomes more and more accepted, even in conservative organizations like libraries and archives.</li>
<li>The ToS will address the institution&#8217;s commitment to continued access to the content. This will involve a clear understanding of the URL namespaces that the library manages, and a statement about how they will be maintained over time. The Web has built-in mechanisms for content moving from place to place (<a href="https://en.wikipedia.org/wiki/HTTP_301">HTTP 301</a>), and for when resources are removed (<a href="https://en.wikipedia.org/wiki/HTTP_401#4xx_Client_Error">HTTP 410</a>), so URLs don&#8217;t need to be written in stone. But the library needs to commit to how resources will redirect permanently to new locations, and for how long&#8211;and how they will be removed.</li>
<li>The ToS will explicitly state the licensing associated with the content, preferably with Creative Commons licenses (hey I&#8217;m daydreaming here) so that it can be confidently used.</li>
<li>Libraries and archives will develop a shared palette of ToS documents. Each institution won&#8217;t have its own special snowflake ToS that nobody reads. There will be some normative patterns for different types of libraries. They will be shared across consortia, and among peer institutions. Maybe they will be incorporated into, or reflect shared principles found in documents like ALA&#8217;s <a href="https://en.wikipedia.org/wiki/Library_Bill_of_Rights">Library Bill of Rights</a> or SAA&#8217;s <a href="http://www2.archivists.org/statements/saa-core-values-statement-and-code-of-ethics">Code of Ethics.</a></li>
</ul>
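<p>To make the 301/410 commitment a little more concrete, here is a minimal Python sketch of how a library might write down its URL policy as data and answer requests from it. The paths, the <code>MOVED</code> and <code>REMOVED</code> tables and the <code>resolve</code> function are all made up for illustration; a real site would more likely express this in its web server or application framework.</p>

```python
from http import HTTPStatus

# Hypothetical policy tables; every path here is invented for illustration.
MOVED = {"/collections/old-guide": "/findingaids/guide"}  # old path -> new home
REMOVED = {"/exhibits/2009-temporary"}                    # deliberately withdrawn


def resolve(path, current_paths, moved=MOVED, removed=REMOVED):
    """Return (status, location) for a requested path.

    200 if the resource still lives at this path, 301 plus the new location
    if it has moved, 410 if it was removed on purpose, and 404 for anything
    the policy has never heard of.
    """
    if path in current_paths:
        return HTTPStatus.OK, None
    if path in moved:
        return HTTPStatus.MOVED_PERMANENTLY, moved[path]
    if path in removed:
        return HTTPStatus.GONE, None
    return HTTPStatus.NOT_FOUND, None
```

<p>The point of keeping the moved and removed paths in explicit tables is that they become something the ToS can actually refer to: a record of what was promised, where it went, and what was taken away on purpose rather than by accident.</p>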
<p>I guess some of this might be a bit reminiscent of the work that has gone into what makes a <a href="http://www.crl.edu/Archiving%20%2526%20Preservation/Digital%20Archives/Metrics%20for%20Assessing%20and%20Certifying-0">trusted repository</a>. But I think a Terms of Service between a library/archive and its researcher is something a bit different. It&#8217;s more outward looking, less interested in certification and compliance and more interested in entering into and upholding a contract with the user of a collection.</p>
<p>As I was writing this post, Dan Brickley <a href="https://twitter.com/danbri/status/263834404592427010">tweeted</a> about a <a href="http://ecommons.eu/wp-content/uploads/Tony-Ageh-%E2%80%93%C2%A0The-Economies-of-Sharing.pdf">recent talk</a> Tony Ageh (head of the archive development team at the BBC) gave at the recent <a href="http://ecommons.eu/">Economies of the Commons</a> conference. He spoke about his ideas for a future Digital Public Space, and the role that archives and organizations like the BBC play in helping create it.</p>
<blockquote><p>
Things no longer ‘need’ to disappear after a certain period of time.  Material that once would have flourished only briefly before languishing under lock and key or even being thrown away — can now be made available forever. And our Licence Fee Payers increasingly expect this to be the way of things. We  will soon need to have a very, *very* good reason for why  anything at all disappears from view or is not permanently accessible in some way or other.</p>
<p>That is why the Digital Public Space has placed the  continuing and permanent availability of all publicly-funded  media, and its associated information, as the default and founding principle.
</p></blockquote>
<p>I think Tony and Mike are right. Cultural heritage organizations need to think more seriously, and more long term about the content they are putting on the Web. They need to put this thought into clear, and succinct contracts with their users. The organizations that do will be what we call libraries and archives tomorrow. I guess I need to start by getting my own house in order eh?</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/11/01/linkrot-use-your-illusio/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>level 0 linked archival data</title>
		<link>http://inkdroid.org/journal/2012/10/24/level-0-linked-archival-data/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=level-0-linked-archival-data</link>
		<comments>http://inkdroid.org/journal/2012/10/24/level-0-linked-archival-data/#comments</comments>
		<pubDate>Wed, 24 Oct 2012 10:29:49 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[archives]]></category>
		<category><![CDATA[libraries]]></category>
		<category><![CDATA[web]]></category>
		<category><![CDATA[archivegrid]]></category>
		<category><![CDATA[ead]]></category>
		<category><![CDATA[google]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[nodejs]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[xml]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5183</guid>
		<description><![CDATA[Depósito del Archivo de la FundaciónSierra-Pambley TLDR; lets see if we can share structured archival data better by adding HTML &#60;link&#62; elements that point at our EAD XML files. A few weeks ago I attended a small meeting of DC museums, archives and libraries that were discussing what Linked Data means for Archives. Hillel Arnold [...]]]></description>
				<content:encoded><![CDATA[<div style="float: left; font-size: 8pt; text-align: center; margin-right: 10px;"><a href="https://en.wikipedia.org/wiki/File:Fondos_archivo.jpg"><img src="https://upload.wikimedia.org/wikipedia/commons/thumb/9/93/Fondos_archivo.jpg/220px-Fondos_archivo.jpg"/><br />Depósito del Archivo de la Fundación<br />Sierra-Pambley</a></div>
<p><strong><em>TLDR; let&#8217;s see if we can share structured archival data better by adding HTML &lt;link&gt; elements that point at our EAD XML files.</em></strong></p>
<p>A few weeks ago I attended a small meeting of DC museums, archives and libraries that were discussing what Linked Data means for Archives. Hillel Arnold and I took collaborative notes in <a href="http://piratepad.net/IB2zcWvFDz">Pirate Pad</a>. For a good part of the time we went around the room talking about how we describe archival collections with various workflows using Encoded Archival Description (EAD), and how this was mostly working (or not).</p>
<p>Some good work has already been done imagining how Linked Data can transform archival description by the <a href="http://blogs.ukoln.ac.uk/locah/">LOCAH</a> project (now <a href="http://archiveshub.ac.uk/linkinglives/">Linking Lives</a>) as well as the <a href="http://socialarchive.iath.virginia.edu/">Social Networks and Archival Context</a> project. I think tools like <a href="http://editorsnotes.org/">Editors&#8217; Notes</a>, <a href="http://www.cwrc.ca/projects/infrastructure-projects/technical-projects/cwrc-writer/">CWRC Writer</a>, and Google&#8217;s <a href="http://googledocs.blogspot.com/2012/05/find-facts-and-do-research-inside.html">Research Pane</a> could provide really useful models for how the work of an archivist could benefit from linking to external resources such as Wikipedia, dbpedia, VIAF, etc. But we really didn&#8217;t talk about that in too much detail. The focus instead was on various tools people used in their EAD workflows: Archivists&#8217; Toolkit, Oxygen, ExistDB, Access databases, etc &#8230; and the hope that <a href="http://www.archivesspace.org/">ArchivesSpace</a> could possibly improve matters. We did touch briefly on what it means to make finding aids available on the Web, but not in a very satisfactory way.</p>
<p>I was really struck by how everyone was using EAD, even if their tools were different. I was also left with the lingering suspicion that not much of this EAD data was linked to from the HTML presentation of the finding aid. After some conversations it was also my understanding that even after 20 years of work on EAD, there is not a listing of websites that make EAD finding aids available. It seems particularly sad that institutions have invested a lot of time and effort in putting EAD into practice, and yet we still aren&#8217;t really sharing them very well with each other.</p>
<p>So in a bit of a fit of frustration I did some <a href="http://github.com/edsu/ead-finder">hacking</a> to see if I could use <a href="http://google.com?q=ead+filetype:xml">Google</a> and <a href="http://beta.worldcat.org/archivegrid">ArchiveGrid</a> to identify websites that serve up finding aids either as HTML or as EAD XML. I wanted to:</p>
<ol>
<li>Get a list of websites that made HTML and EAD XML finding aids available. We can rely on Google to index the Web, but maybe we could index the archival web a bit better ourselves if we had a better understanding of where the EAD data was available. The idea is that this initial list could be used to bootstrap a list of websites making EAD finding aids available in the Wikipedia entry for <a href="https://en.wikipedia.org/wiki/Encoded_Archival_Description">EAD</a>.</li>
<li>To see which websites have HTML representations that link to an EAD XML representation. The rationale here is to encourage a very simple best practice for linking to structured archival data when it is available. More on that below.</li>
</ol>
<p>I was able to identify 201 hosts that served up finding aids either as HTML or XML. You should be able to see them here in this <a href="https://docs.google.com/spreadsheet/ccc?key=0Ak6uboYXcJbBdEFMODhhN1dSaWlUNTRQX05pcmEtLWc#gid=0">spreadsheet</a>. I also collected URLs for finding aids (both HTML and XML) that I was able to locate, which can be seen in this <a href="https://github.com/edsu/ead-finder/blob/master/dump.json">JSON file</a>. </p>
<p>With the URLs in hand I wrote a little script to examine which of the 156 hosts serving up HTML representations of finding aids had a link to an XML EAD document. I looked for a very simple kind of link that was <a href="http://www.rssboard.org/rss-autodiscovery">popularized</a> by the RSS and Atom syndication community for autodiscovery of blog feeds: a <code>&lt;link&gt;</code> tag that has a <code>rel</code> attribute of <code>alternate</code> and a <code>type</code> attribute set to <code>application/xml</code>. Out of the 156 websites serving up HTML representations of finding aids I could only find two websites that used this link pattern: Princeton University and Emory University. </p>
<p>For example if you view the HTML source for the <a href="http://findingaids.princeton.edu/collections/C1022">Einstein Collection</a> finding aid at Princeton you&#8217;ll see this link:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="html" style="font-family:monospace;">&lt;link rel=&quot;alternate&quot; type=&quot;application/xml&quot; href=&quot;http://findingaids.princeton.edu/collections/C1022.xml&quot; /&gt;</pre></td></tr></table></div>

<p>Similarly the finding aid for the <a href="http://findingaids.library.emory.edu/documents/rushdie1000/">Salman Rushdie</a> collection at Emory University has this link:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="html" style="font-family:monospace;">&lt;link rel=&quot;alternate&quot; type=&quot;application/xml&quot; href=&quot;/documents/rushdie1000/EAD/&quot; /&gt;</pre></td></tr></table></div>
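<p>If you want to check your own finding aids for this pattern, a rough sketch in Python using only the standard library might look like the following. This is not the script from the ead-finder repository; <code>AlternateLinkFinder</code> and <code>find_ead_links</code> are names invented here, and the sketch assumes you already have the HTML of the page as a string.</p>

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AlternateLinkFinder(HTMLParser):
    """Collect hrefs of <link rel="alternate" type="application/xml"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser also routes self-closing tags (<link ... />) through here.
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        mime = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rels and mime == "application/xml" and href:
            # Resolve relative hrefs against the URL of the finding aid.
            self.links.append(urljoin(self.base_url, href))


def find_ead_links(html, base_url):
    """Return absolute URLs for any XML alternates advertised in the HTML."""
    finder = AlternateLinkFinder(base_url)
    finder.feed(html)
    return finder.links
```

<p>Calling <code>find_ead_links</code> with a page&#8217;s HTML and its URL returns a list of absolute URLs, which is empty when the page doesn&#8217;t advertise an XML alternate. Note that <code>application/xml</code> only says the alternate is XML, not that it is EAD, so a stricter check would need to fetch the document and look at its root element.</p>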

<p>As the title of this blog post suggests, I&#8217;m calling this pattern <em>level 0 linked data</em>. Linked Data purists would probably say this isn&#8217;t Linked Data at all since it doesn&#8217;t involve an RDF serialization. And I guess they would be right. But it does express a graph of HTML and EAD data that is linked, and it serves a real need. If you are interested in Linked Data and archives I encourage you to add these links to your HTML finding aids today. </p>
<p>So why are these links important? </p>
<p>The main reason is they are found in HTML documents, which are the representations that matter most on the Web. HTML documents are read by people. They are hypertext documents that link to and from other places on an archive&#8217;s website and elsewhere on the Web at large. They are well understood technically by the Web development community&#8230;if you hire a developer they might have strong feelings about using PHP or Ruby, but they will know HTML backwards and forwards. They are crawled and indexed by search engine bots so that researchers around the world can discover our collections. They are cited in social environments like Twitter, Facebook, blog posts, etc. We have a responsibility to create stable homes (URLs) for our archival descriptions that fit into the Web.</p>
<p>The other reason these links are important is that they make our investment in EAD visible on the Web for anyone who is looking. Nobody but ArchiveGrid actively crawls EAD XML data. They are the only ones that can find the files, because they have been told where they are. If we did a better job of advertising the availability of our EAD documents I think we would see more tools and services around them. ArchiveGrid is a good example of the sort of tool that could be built on top of a web of EAD data. But what about archival collections in your local area? Perhaps it would be useful to have a service that let you look across the archival holdings of institutions in a consortium you belong to. Or perhaps you might want to create an alerting service that lets researchers know what new archival collections are being made available. Or maybe you need to collaborate with archives in a specific domain, and need tools that provide a custom experience for that distributed collection. I imagine there would be lots of ideas for apps if there were just a teensy bit more thought put into how finding aids (both the HTML and the XML) are put on the Web, and how we shared information about their availability.</p>
<p>Going forward I think HTML5 microdata and RDFa present some excellent opportunities for Linked Data representations of finding aids. Especially when you consider some of the vocabulary development <a href="http://purl.org/archival/vocab/arch">being</a> <a href="http://archivi.ibc.regione.emilia-romagna.it/ontology/reference_document/referencedocument.html">done</a> <a href="http://blogs.ukoln.ac.uk/locah/2011/02/16/two-changes-to-the-model-and-some-definitions/">around</a> them; as well as some of the <a href="http://discontents.com.au/tag/linked-data">work</a> being done by Tim Sherratt on using  linked data to create new user experiences around archival data. But if your institution has already invested in creating EAD documents I think trying this link pattern with data you already have could be a good first step towards introducing linked data into your archive. I hope it is a first baby step that archives can take in merging some of the structured data found in the EAD XML document into the HTML they publish about their collections.</p>
<p>I&#8217;m planning on getting the list of EAD publishers into the Wikipedia article for EAD, and putting out a call for others to add their website if it is missing. I also think that a simple crawling and aggregation service that uses the links in some fashion could encourage more linking. A lot of this blog post has been mental preparation for my involvement in an IMLS funded project run out of Tufts that will be looking at Linked Archival Metadata, which is about to be kicked off this winter. If you&#8217;ve read this far, and have any thoughts or suggestions about this I&#8217;d enjoy hearing them either here, on <a href="http://twitter.com/edsu">Twitter</a> or via <a href="mailto:ehs@pobox.com">email</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/10/24/level-0-linked-archival-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>who creates the LCNAF (part 2)</title>
		<link>http://inkdroid.org/journal/2012/10/15/who-creates-the-lcnaf-part-2/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=who-creates-the-lcnaf-part-2</link>
		<comments>http://inkdroid.org/journal/2012/10/15/who-creates-the-lcnaf-part-2/#comments</comments>
		<pubDate>Mon, 15 Oct 2012 14:07:21 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[libraries]]></category>
		<category><![CDATA[authority control]]></category>
		<category><![CDATA[cataloging]]></category>
		<category><![CDATA[lcnaf]]></category>
		<category><![CDATA[loc]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5170</guid>
		<description><![CDATA[I ended my A Look at Who Creates the LCNAF post with a hunch that the Library of Congress Name Authority File is increasingly supported by participants in the Name Authority Cooperative (NACO) rather than by the Library of Congress itself. It didn&#8217;t occur to me until a few days later that I missed a [...]]]></description>
				<content:encoded><![CDATA[<p>I ended my <a href="http://inkdroid.org/journal/2012/10/10/a-look-at-who-makes-the-lcnaf/">A Look at Who Creates the LCNAF</a> post with a hunch that the Library of Congress Name Authority File is increasingly supported by participants in the Name Authority Cooperative (NACO) rather than by the Library of Congress itself. It didn&#8217;t occur to me until a few days later that I missed a pretty obvious opportunity to graph the number of records created by LC compared with all the other members of the collective. So, here it is:</p>
<p><iframe width="500" height="300" scrolling="no" frameborder="no" src="https://www.google.com/fusiontables/embedviz?viz=GVIZ&amp;t=LINE&amp;gco_vAxes=%5B%7B%22title%22%3A%22Records+Created%22%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%2C+%22useFormatFromData%22%3Atrue%2C+%22viewWindowMode%22%3A%22pretty%22%2C+%22viewWindow%22%3A%7B%22max%22%3Anull%2C+%22min%22%3Anull%7D%7D%2C%7B%22useFormatFromData%22%3Atrue%2C+%22viewWindowMode%22%3A%22pretty%22%2C+%22viewWindow%22%3A%7B%22max%22%3Anull%2C+%22min%22%3Anull%7D%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%7D%5D&amp;gco_curveType=function&amp;gco_booleanRole=certainty&amp;gco_lineWidth=2&amp;gco_hAxis=%7B%22useFormatFromData%22%3Atrue%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%2C+%22viewWindow%22%3Anull%2C+%22viewWindowMode%22%3Anull%2C+%22title%22%3A%22Year%22%7D&amp;gco_legend=right&amp;gco_title=LCNAF+Record+Creation+Overview&amp;containerId=gviz_canvas&amp;isXyPlot=true&amp;q=select+col0%2C+col1%2C+col2+from+1QctNI-hgLhwOO9pE42Ffdw5bQx9i-1iBpI286b4&amp;qrs=+where+col0+%3E%3D+&amp;qre=+and+col0+%3C%3D+&amp;qe=+order+by+col0+asc+limit+32&amp;width=500&amp;height=300"></iframe></p>
<p>It looks like this has been a trend since about 1996 or so. I think it validates the cooperative aspect of the PCC and NACO. Not that it needs any validating. It&#8217;s just nice to see libraries and librarians working together to build something. I guess the name <em>Library of Congress</em> Name Authority File is also increasingly ironic&#8230;</p>
<p><em>Update: thanks to Kevin Ford (who emailed me privately) it seems that LC has been quite aware of this trend, and highlighted the event in 1996 when NACO members began contributing more records than LC with a <a href="http://www.loc.gov/today/pr/1996/96-154.html">press release</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/10/15/who-creates-the-lcnaf-part-2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Always Already New</title>
		<link>http://inkdroid.org/journal/2012/10/15/always-already-new/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=always-already-new</link>
		<comments>http://inkdroid.org/journal/2012/10/15/always-already-new/#comments</comments>
		<pubDate>Mon, 15 Oct 2012 08:46:28 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[book review]]></category>
		<category><![CDATA[books]]></category>
		<category><![CDATA[history]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[media]]></category>
		<category><![CDATA[phonograph]]></category>
		<category><![CDATA[web]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5162</guid>
		<description><![CDATA[Always Already New: Media, History, And The Data Of Culture by Lisa Gitelman My rating: 3 of 5 stars I enjoyed this book, mainly for the author&#8217;s technique of exploring what media means in our culture by using two examples, separated in time: the phonograph and the Internet. She admits that in some ways this [...]]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.goodreads.com/book/show/1158322.Always_Already_New" style="float: left; padding-right: 20px"><img alt="Always Already New: Media, History, And The Data Of Culture" border="0" src="http://photo.goodreads.com/books/1347774737m/1158322.jpg" /></a><a href="http://www.goodreads.com/book/show/1158322.Always_Already_New">Always Already New: Media, History, And The Data Of Culture</a> by <a href="http://www.goodreads.com/author/show/398692.Lisa_Gitelman">Lisa Gitelman</a><br />
My rating: <a href="http://www.goodreads.com/review/show/393386341">3 of 5 stars</a></p>
<p>I enjoyed this book, mainly for the author&#8217;s technique of exploring what media means in our culture by using two examples, separated in time: the phonograph and the Internet. She admits that in some ways this amounts to comparing apples to oranges, and there is definitely a creative tension in the book. Gitelman&#8217;s emphasis is not that media technologies change society and culture, but that a technology is introduced and is in turn shaped by its particular social and historical context, which then reshapes society and culture.</p>
<blockquote><p>I define media as socially realized structures of communication, where structures include both technological forms and their associated protocols, and where communication is a cultural practice, a ritualized collocation of different people on the same mental map, sharing or engaged with popular ontologies of representation. As such, media are unique and complicated historical subjects.</p></blockquote>
<p>It&#8217;s tempting to talk about media technologies as if their ultimate use is somehow inevitable. For example, Gitelman discusses how the initial commercial placement of the phonograph centered largely around the idea that it would transform dictation and the office. Early demonstrations intended to increase sales of the device focused on recording and playback, rather than simply playback. They didn&#8217;t initially see the  market for recorded music, which would so transform the device. To some extent we&#8217;ve cynically come to expect this out of marketing and &#8220;evangelism&#8221; about media technologies all the time. But this mode of thinking is also present in purely technical discussions, which don&#8217;t account for the placement of the technology in a particular social context. </p>
<p>Getting a sense of the social context you are in the middle of, as opposed to one you are historically removed from, presents some challenges. I think this difficulty is more evident in the second part of the book which focuses on the Internet and the World Wide Web against a backdrop of libraries and bibliography. Like many others I imagine, my knowledge of JCR Licklider&#8217;s influence on the development of ARPAnet, and the Internet was largely culled from <a href="http://www.goodreads.com/book/show/281818.Where_Wizards_Stay_Up_Late">Where Wizards Stay Up Late</a>. I had no idea, until reading Always Already New, that Licklider contracted with the Council on Library Resources (now Council on Library and Information Resources) to write a report <em>Libraries of the Future</em> on the topic of how computing would change libraries.</p>
<p>I enjoyed the discussion of the role that the Request for Comment (RFC) played on the Internet: how these documents, which were initially shared via the post, helped bootstrap the technologies that would create the Internet that allowed them to be shared as electronic documents or text. I didn&#8217;t know about the <a href="http://www.rfc-editor.org/rfc-online-2008.html" rel="nofollow">RFC-Online</a> project that Jon Postel started right before his death, to recover the earliest RFCs that had already been lost. Gitelman&#8217;s study of linking, citation and &#8220;publishing&#8221; on the Web was also really enjoyable, mainly because of her orientation to these topics:</p>
<blockquote><p>I will argue that far from making history impossible, the interpretive space of the World Wide Web can prompt history in exciting new ways.</p></blockquote>
<p>All this being said, I finished the book with the sneaking feeling that I needed to reread it. Gitelman&#8217;s thesis was subtle enough that it was only when I got to the end that I felt like I understood it: the strange loop that thinking and media participate in, and how difficult (and yet fruitful) it is to talk about media and their social context. Maybe this was also partly the effect of reading it on a Kindle <img src='http://inkdroid.org/journal/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' /> </p>
<p>
<a href="http://www.goodreads.com/review/list/5899086-ed-summers">View all my reviews</a></p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/10/15/always-already-new/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>learning from people that do</title>
		<link>http://inkdroid.org/journal/2012/10/11/learning-from-people-that-do/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=learning-from-people-that-do</link>
		<comments>http://inkdroid.org/journal/2012/10/11/learning-from-people-that-do/#comments</comments>
		<pubDate>Thu, 11 Oct 2012 17:15:12 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[programming]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5136</guid>
		<description><![CDATA[Anil Dash recently wrote a nice piece about the need for what he calls a Hi-Tech Vo-tech in the technology sector. If you are not familiar with it already, Vo-Tech is shorthand in the US for Vocational-technical school, which provide focused training in specific areas, often on a part time basis. The Vo-Tech experience is [...]]]></description>
				<content:encoded><![CDATA[<p>Anil Dash recently <a href="http://dashes.com/anil/2012/10/the-blue-collar-coder.html">wrote</a> a nice piece about the need for what he calls a Hi-Tech Vo-tech in the technology sector. If you are not familiar with it already, Vo-Tech is shorthand in the US for <a href="http://en.wikipedia.org/wiki/Vocational-technical_school">Vocational-technical school</a>, which provides focused training in specific areas, often on a part time basis. The Vo-Tech experience is markedly different from the typical four-year university experience, which tends to be focused more on theory than practice.</p>
<p>I <em>totally</em> agree. </p>
<p>But if you are looking to work as a software developer, and to help build this amazing information space we call the World Wide Web, you don&#8217;t need to wait for this dream of a better high school curriculum for computer programming, or Hi-Tech Vo-Techs to come to your town. I don&#8217;t want to minimize the effort involved in finding your way into the workplace&#8230;it&#8217;s hard, especially when there is competition from &#8220;qualified&#8221; candidates, and the skill sets seem to be constantly shifting. But here are some relatively simple steps you can take to get started.</p>
<h2>Look at Job Ads</h2>
<p>Go to the <a href="http://craigslist.org">Craigslist</a> for your area, look at what jobs are available under the <em>internet engineers</em> and <em>software / qa / dba</em> sections. I suggest Craigslist because of their local flavor, and the low cost to advertise, which typically means the jobs are at smaller companies who are less interested in finding someone with the right college degree, and more interested in finding someone who can get things done. Look for jobs that focus on what you can do rather than schooling. Don&#8217;t apply for any of the jobs just yet. Note down the tools they want people to know: computer languages, operating systems, web frameworks, etc. Research them on <a href="http://wikipedia.org">Wikipedia</a>. Focus on tools that seem to pop up a lot, are opensource, and can be downloaded and used without cost. You don&#8217;t need to do anything with them just yet though.</p>
<h2>Go To User Group Meetings</h2>
<p>I say opensource because opensource tools often have open communities around them. You should be able to find user groups in your area where people present on how they use these tools at their place of work. You might have to drive a while, or take a long bus/train ride &#8212; but it&#8217;s worth it. To find the meetings do some searches by technology and location on <a href="http://meetup.com">Meetup</a>. Alternatively you can Google for whatever the technology is + &#8220;user group&#8221; + your area (e.g. Philadelphia) and go through a few pages of results. At a user group meeting you will not only learn about the details of the technology, but you will meet actual, real people who are using it. There are often subtle differences in the cultures and communities of practice around software tools. Some user groups will feel more comfortable than others. Pay attention to your gut reactions&#8211;they are indicators of how much you would like a job working with the technology, and the people who like it. If you get a bad vibe, don&#8217;t take it personally, try another meeting. Finding a job is often a matter of who you know, not what you know &#8230; and user groups are a great place to get to know people working in the software development field. There&#8217;s no online substitute for meeting people in real life.</p>
<h2>Use Social Networks</h2>
<p>At user group meetings you meet people who you can learn from. See if they have a blog, are on Twitter or Facebook. Maybe they use a social bookmarking tool you can follow. Or perhaps there are email discussion lists you can subscribe to. It&#8217;s not stalking, these people are your mentors, learn from them. Take a dip into sites like <a href="http://news.ycombinator.com/">Hacker News</a> or <a href="http://www.reddit.com/r/programming/">Programming Reddit</a>. Watch the trends, you aren&#8217;t being a fanboy/girl, you are learning about what people care about in the field. Don&#8217;t feel bad if it&#8217;s overwhelming (it&#8217;s overwhelming to &#8220;experts&#8221; too), focus on what seems interesting. Also, cultivate your own online identity by posting stuff that you are interested in, or have questions about. Stay positive, and try not to bash things: people (and potential employers) are watching you the same way you are watching them.</p>
<h2>Read</h2>
<p>Sometimes the speakers at User Group meetings will also be authors of books. You will see books reviewed on sites like Hacker News. People you follow may mention the books they read, or have accounts on sites like <a href="http://goodreads.com">GoodReads</a>. See if a library or a bookstore has them, and go skim them. Buy or borrow the ones you like. Take notes about them online, so people can see your interests. Get a Google Reader account and follow blogs related to tools you would like to use. Look for tools that have approachable/readable tutorials. Try out the examples, and get a feel for how well the theory of the tutorial translates into practice. If tools don&#8217;t install or seem to work the way they are described, don&#8217;t feel like you did something wrong&#8230;move on to tools that work more smoothly, and fit your brain better. The benefit to focusing on opensource projects is that you will find more content about them online. You can read the code. Reading the source code for Ruby or GoLang is definitely not for the faint of heart, though it&#8217;s nice you can do it. It&#8217;s more important that you look at code that uses these tools. Go to <a href="http://github.com">GitHub</a> and see what projects there are that use the tool. Browse the source online, or clone the repositories to your workstation. See if you can help out with some low-hanging fruit tasks in their issue queue.</p>
<h2>Find a Niche</h2>
<p>You are probably interested in things other than programming. For example I like libraries and archives, and the cultural heritage sector. I&#8217;ve found a virtual community of software developers in this area called <a href="http://code4lib.org">code4lib</a>, which helps me learn more about new projects, tools in the field, and is a way to get to know people. You may be surprised to find a similar community around something you are interested in: be it astrophysics, cartoons, music, maps, real estate, etc. If you don&#8217;t find one, maybe think about starting one up&#8211;you might be surprised by how many people turn up. Sometimes there are collaborative projects that need your help, like Wikipedia or <a href="http://www.openstreetmap.org/">Open Street Map</a>, where the ability to automate mundane tasks is needed. You might not get paid for this work, but it will broaden your circle of contacts, deepen your technical skills, build your self-confidence, and be something to put on your resume. The key thing that finding a niche can do is make your job search a bit easier, since technology skills cut across domains. You will also find that your niche has a particular set of tools that it likes to use. These typically aren&#8217;t hard and fast rules about using X instead of Y, but are norms. Pay attention to them, and learn about things that interest you.</p>
<h2>Be Confident</h2>
<p>I don&#8217;t mean to imply any of this is easy. It can be extremely difficult to get out of your comfort zone and explore things you don&#8217;t know. But you will be rewarded for your efforts, by learning from people who actually do things in the world. I&#8217;ve worked with some really excellent software developers that didn&#8217;t have a compsci degree, and some that I wasn&#8217;t even sure if they graduated high school. Sometimes I wonder if I even graduated from high school. So be confident in your ability to learn and do this thing we call software development. Show that you are humble about what you don&#8217;t know, and that you are hungry to learn it. Above all, don&#8217;t buy into the cult of the &#8220;real programmer&#8221; &#8230; she doesn&#8217;t exist. There are just people to learn from, and if you are doing it right, you never stop learning.</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/10/11/learning-from-people-that-do/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>a look at who makes the LCNAF</title>
		<link>http://inkdroid.org/journal/2012/10/10/a-look-at-who-makes-the-lcnaf/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=a-look-at-who-makes-the-lcnaf</link>
		<comments>http://inkdroid.org/journal/2012/10/10/a-look-at-who-makes-the-lcnaf/#comments</comments>
		<pubDate>Wed, 10 Oct 2012 11:22:37 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[libraries]]></category>
		<category><![CDATA[semweb]]></category>
		<category><![CDATA[4store]]></category>
		<category><![CDATA[authority control]]></category>
		<category><![CDATA[ec2]]></category>
		<category><![CDATA[lcnaf]]></category>
		<category><![CDATA[library of congress]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5070</guid>
		<description><![CDATA[As a follow up to my last post about visualizing Library of Congress Name Authority File (LCNAF) records created by year, I decided to dig a little bit deeper to see how easy it would be to visualize how participating Name Authority Cooperative institutions have contributed to the LCNAF over time. This idea was mostly [...]]]></description>
				<content:encoded><![CDATA[<p>As a follow up to my <a href="http://inkdroid.org/journal/2012/10/04/lcnaf-unix-hack/">last post</a> about visualizing Library of Congress Name Authority File (LCNAF) records created by year, I decided to dig a little bit deeper to see how easy it would be to visualize how participating <a href="http://www.loc.gov/aba/pcc/naco/">Name Authority Cooperative</a> institutions have contributed to the LCNAF over time. This idea was mostly born out of spending the latter part of last week participating in a conversation about the need for a <a href="http://socialarchive.iath.virginia.edu/NAAC_index.html">National Archival Authority Cooperative</a> hosted at NARA. This blog post is one part nerdy technical notes on how I worked with the LCNAF Linked Data, and one part line charts showing who creates and modifies LCNAF records. It might&#8217;ve made more sense to start with the pretty charts, and then show you how I did it&#8230;but if the tech details don&#8217;t interest you can jump to the <a href="#result">second half</a>.</p>
<h2>The Work</h2>
<p>After a very helpful Twitter <a href="https://twitter.com/3windmills/status/254949232052686848">conversation</a> with Kevin Ford I discovered that the Linked Data <a href="http://www.loc.gov/standards/mads/rdf/">MADSRDF</a> representation of the LCNAF includes assertions about the institution responsible for creating or revising a record. Here&#8217;s a snippet of Turtle for RDF that describes who created and modified the LCNAF record for <a href="http://id.loc.gov/authorities/names/n97108433">J. K. Rowling</a> (if your eyes glaze over when you see RDF, don&#8217;t worry, keep reading; it&#8217;s not essential you understand this):</p>
<pre>
@prefix madsrdf: &lt;http://www.loc.gov/mads/rdf/v1#&gt; .
@prefix ri: &lt;http://id.loc.gov/ontologies/RecordInfo#&gt; .

&lt;http://id.loc.gov/authorities/names/n97108433&gt;
    madsrdf:adminMetadata [
        ri:recordChangeDate "1997-10-28T00:00:00"^^&lt;http://www.w3.org/2001/XMLSchema#dateTime&gt; ;
        ri:recordContentSource &lt;http://id.loc.gov/vocabulary/organizations/dlc&gt; ;
        ri:recordStatus "new"^^&lt;http://www.w3.org/2001/XMLSchema#string&gt; ;
        a ri:RecordInfo
    ],
    [
        ri:recordChangeDate "2011-08-25T06:29:06"^^&lt;http://www.w3.org/2001/XMLSchema#dateTime&gt; ;
        ri:recordContentSource &lt;http://id.loc.gov/vocabulary/organizations/dlc&gt; ;
        ri:recordStatus "revised"^^&lt;http://www.w3.org/2001/XMLSchema#string&gt; ;
        a ri:RecordInfo
    ] .
</pre>
<p>So I picked up an <a href="http://aws.amazon.com/ec2/instance-types/">EC2 m1.large</a> spot instance (7.5G of RAM, 2 virtual cores, 850G of storage) for a miserly $0.026/hour, installed 4store (which is a triplestore I&#8217;d heard good things about), and loaded the data.</p>
<pre>
% wget http://id.loc.gov/static/data/authoritiesnames.nt.madsrdf.gz
% gunzip authoritiesnames.nt.madsrdf.gz
% sudo apt-get install 4store
% sudo mkdir /mnt/4store
% sudo chown fourstore:fourstore /mnt/4store
% sudo ln -s /mnt/4store /var/lib/4store
% sudo -u fourstore 4s-backend-setup lcnaf --segments 4
% sudo -u fourstore 4s-backend lcnaf
% sudo -u fourstore 4s-import --verbose lcnaf authoritiesnames.nt.madsrdf
</pre>
<p>I used 4 segments as a best guess to match the 4 EC2 compute units available to an m1.large. The only trouble was that after loading 90M of the 226M assertions the import slowed to a crawl as memory was nearly used up.</p>
<p>I thought briefly about upgrading to a larger instance&#8230;but it occurred to me that I actually didn&#8217;t need all the triples. I just needed the ones related to the record changes, and the organization that made them. So I filtered out just the assertions I needed. By the way, this is a really nice artifact of the ntriples data format, which is very easy to munge with line-oriented Unix utilities and scripting tools:</p>
<pre>
zcat authoritiesnames.nt.madsrdf.gz | egrep '(recordChangeDate)|(recordContentSource)|(recordStatus)'  > updates.nt
</pre>
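<p>The same line-oriented filtering can be sketched in Python, for cases where a shell pipeline isn&#8217;t handy. This is only an illustration of the idea, not what I actually ran: the function names and the sample lines are made up for the example, and it assumes each triple sits on a single line, as it does in ntriples.</p>

```python
import gzip
import re

# Same three predicate names the egrep pattern looks for.
WANTED = re.compile(r"recordChangeDate|recordContentSource|recordStatus")

def filter_triples(lines):
    """Yield only the lines that mention one of the wanted predicates."""
    for line in lines:
        if WANTED.search(line):
            yield line

def filter_file(src_gz, dst):
    """Stream a gzipped ntriples dump into a smaller, filtered file.

    Hypothetical helper: the paths are supplied by the caller.
    """
    with gzip.open(src_gz, "rt", encoding="utf-8") as src, \
            open(dst, "w", encoding="utf-8") as out:
        out.writelines(filter_triples(src))

# Illustrative sample lines (not copied from the real dump).
sample = [
    '_:b1 <http://id.loc.gov/ontologies/RecordInfo#recordStatus> '
    '"new"^^<http://www.w3.org/2001/XMLSchema#string> .\n',
    '_:b1 <http://id.loc.gov/ontologies/RecordInfo#recordContentSource> '
    '<http://id.loc.gov/vocabulary/organizations/dlc> .\n',
    '<http://id.loc.gov/authorities/names/n97108433> '
    '<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> '
    '<http://www.loc.gov/mads/rdf/v1#PersonalName> .\n',
]
kept = list(filter_triples(sample))
print(len(kept), "of", len(sample), "lines kept")
```

<p>Because the input is read one line at a time, memory use stays flat no matter how large the dump is, which is the same property that makes the <code>zcat | egrep</code> pipeline work.</p>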
<p>This left me with 50,313,810 triples which loaded in about 20 minutes! With the database populated I was then able to execute the following query to fetch all the creation dates with their institution codes using 4s-query:</p>
<pre>
PREFIX ri: &lt;http://id.loc.gov/ontologies/RecordInfo#&gt;

SELECT ?date ?source WHERE { 
  ?s ri:recordChangeDate ?date . 
  ?s ri:recordContentSource ?source . 
  ?s ri:recordStatus "new"^^&lt;http://www.w3.org/2001/XMLSchema#string&gt; . 
}
</pre>
<p>This returned a tab delimited file that looked something like:</p>
<pre>
"1991-08-16T00:00:00"^^&lt;http://www.w3.org/2001/XMLSchema#dateTime&gt;      &lt;http://id.loc.gov/vocabulary/organizations/dlc&gt;
"1995-01-07T00:00:00"^^&lt;http://www.w3.org/2001/XMLSchema#dateTime&gt;      &lt;http://id.loc.gov/vocabulary/organizations/djbf&gt;
"2004-03-04T00:00:00"^^&lt;http://www.w3.org/2001/XMLSchema#dateTime&gt;      &lt;http://id.loc.gov/vocabulary/organizations/nic&gt;
</pre>
<p>I then wrote a simplistic <a href="https://gist.github.com/3863082">Python program</a> to read in the TSV file and output a table of data where each row represented a year and the columns were the institution codes.</p>
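<p>For anyone who wants the gist of that pivot without following the link, here is a minimal sketch of the approach. It is not the linked program: the regular expression, the function names, and the second 1991 sample row are assumptions made for illustration, and it assumes every row has a typed date literal followed by an organization URI.</p>

```python
import re
from collections import Counter, defaultdict

# Capture the year from the date literal and the code at the end of the
# organization URI, e.g. ("1991", "dlc").
ROW = re.compile(r'"(\d{4})-[^"]*"\S*\s+<[^>]*/organizations/([^>/]+)>')

def tally(lines):
    """Count records per year and institution code."""
    counts = defaultdict(Counter)
    for line in lines:
        match = ROW.search(line)
        if match:
            year, org = match.groups()
            counts[year][org] += 1
    return counts

def to_table(counts):
    """Return rows of strings: a header row, then one row per year."""
    orgs = sorted({org for per_year in counts.values() for org in per_year})
    rows = [["year"] + orgs]
    for year in sorted(counts):
        rows.append([year] + [str(counts[year][org]) for org in orgs])
    return rows

DT = "^^<http://www.w3.org/2001/XMLSchema#dateTime>"
ORG = "<http://id.loc.gov/vocabulary/organizations/"
sample = [
    '"1991-08-16T00:00:00"' + DT + "\t" + ORG + "dlc>",
    '"1991-11-02T00:00:00"' + DT + "\t" + ORG + "dlc>",  # made-up row
    '"1995-01-07T00:00:00"' + DT + "\t" + ORG + "djbf>",
    '"2004-03-04T00:00:00"' + DT + "\t" + ORG + "nic>",
]
table = to_table(tally(sample))
for row in table:
    print("\t".join(row))
```

<p>Institutions with no records in a given year get a 0 rather than a blank cell, which keeps the columns lined up when the table is loaded into a charting tool.</p>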
<h2 id="result">The Result</h2>
<p>If you&#8217;d like to see the table you can check it out as a <a href="https://www.google.com/fusiontables/DataSource?docid=1IcIRXt-H76hrGyGDc7P1ngQKDzgBkpmRXTXE1E8">Google Fusion Table</a>. If you are interested, you should be able to easily pull the data out into your own table, modify it, and visualize it. Google Fusion tables can be really easily rendered in a variety of ways, including a line graph, which I&#8217;ve embedded here, just displaying the top 25 contributors:</p>
<p><iframe width="600" height="400" scrolling="no" frameborder="no" src="https://www.google.com/fusiontables/embedviz?viz=GVIZ&amp;t=LINE&amp;gco_vAxes=%5B%7B%22title%22%3A%22%22%2C+%22minValue%22%3A0%2C+%22maxValue%22%3Anull%2C+%22useFormatFromData%22%3Atrue%2C+%22viewWindowMode%22%3A%22explicit%22%2C+%22viewWindow%22%3A%7B%22max%22%3Anull%2C+%22min%22%3A0%7D%7D%2C%7B%22useFormatFromData%22%3Atrue%2C+%22viewWindowMode%22%3A%22pretty%22%2C+%22viewWindow%22%3A%7B%22max%22%3Anull%2C+%22min%22%3Anull%7D%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%7D%5D&amp;gco_curveType=function&amp;gco_booleanRole=certainty&amp;gco_lineWidth=2&amp;gco_hAxis=%7B%22useFormatFromData%22%3Atrue%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%2C+%22viewWindow%22%3Anull%2C+%22viewWindowMode%22%3Anull%2C+%22title%22%3A%22Year%22%7D&amp;gco_legend=right&amp;gco_title=LCNAF+Records+Created&amp;gco_legendTextStyle=%7B%22color%22%3A%22%23222%22%2C+%22fontSize%22%3A%2210%22%7D&amp;containerId=gviz_canvas&amp;isXyPlot=true&amp;q=select+col0%2C+col1%2C+col2%2C+col3%2C+col4%2C+col5%2C+col6%2C+col7%2C+col8%2C+col9%2C+col10%2C+col11%2C+col12%2C+col13%2C+col14%2C+col15%2C+col16%2C+col17%2C+col18%2C+col19%2C+col20%2C+col21%2C+col22%2C+col23%2C+col24%2C+col25+from+1IcIRXt-H76hrGyGDc7P1ngQKDzgBkpmRXTXE1E8&amp;qrs=+where+col0+%3E%3D+&amp;qre=+and+col0+%3C%3D+&amp;qe=+order+by+col0+asc+limit+32&amp;width=600&amp;height=400"></iframe></p>
<p>While I didn&#8217;t quite expect to see LC tapering off the way it is, I did expect it to dominate the graph. <a href="https://www.google.com/fusiontables/DataSource?docid=1kPtQOFcF6wY9FbbVAHZcCJIZTqgVSa1_X0IfAPs">Removing LC</a> from the mix makes the graph a little bit more interesting. For example you can see the steady climb of the British Library, and the strong role that Princeton University plays:</p>
<p><iframe width="600" height="400" scrolling="no" frameborder="no" src="https://www.google.com/fusiontables/embedviz?viz=GVIZ&amp;t=LINE&amp;gco_vAxes=%5B%7B%22title%22%3A%22Records%22%2C+%22minValue%22%3A0%2C+%22maxValue%22%3Anull%2C+%22useFormatFromData%22%3Atrue%2C+%22viewWindowMode%22%3A%22explicit%22%2C+%22viewWindow%22%3A%7B%22max%22%3Anull%2C+%22min%22%3A0%7D%7D%2C%7B%22useFormatFromData%22%3Atrue%2C+%22viewWindowMode%22%3A%22pretty%22%2C+%22viewWindow%22%3A%7B%22max%22%3Anull%2C+%22min%22%3Anull%7D%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%7D%5D&amp;gco_curveType=function&amp;gco_booleanRole=certainty&amp;gco_lineWidth=2&amp;gco_hAxis=%7B%22useFormatFromData%22%3Atrue%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%2C+%22viewWindow%22%3Anull%2C+%22viewWindowMode%22%3Anull%2C+%22title%22%3A%22Year%22%7D&amp;gco_legend=right&amp;gco_title=LCNAF+Records+Created+(No+LC)&amp;gco_legendTextStyle=%7B%22color%22%3A%22%23222%22%2C+%22fontSize%22%3A%2210%22%7D&amp;containerId=gviz_canvas&amp;isXyPlot=true&amp;q=select+col0%2C+col2%2C+col3%2C+col5%2C+col6%2C+col7%2C+col8%2C+col9%2C+col10%2C+col12%2C+col13%2C+col14%2C+col15%2C+col16%2C+col17%2C+col18%2C+col19%2C+col20%2C+col22%2C+col23%2C+col24%2C+col25%2C+col26%2C+col27%2C+col28%2C+col29%2C+col30+from+1kPtQOFcF6wY9FbbVAHZcCJIZTqgVSa1_X0IfAPs&amp;qrs=+where+col0+%3E%3D+&amp;qre=+and+col0+%3C%3D+&amp;qe=+order+by+col0+asc+limit+32&amp;width=600&amp;height=400"></iframe></p>
<p>Out of curiosity I then executed a SPARQL query for record updates (or revisions), repeated the step with stats.py, uploaded to <a href="https://www.google.com/fusiontables/DataSource?docid=1rVZNLnEWzCoSEj9jAs40ZX9WjJ25BnA0GYIuTUo">Google Fusion Tables</a>, and removed LC to better see trends in who is updating records:</p>
<pre>
PREFIX ri: &lt;http://id.loc.gov/ontologies/RecordInfo#&gt;

SELECT ?date ?source WHERE { 
  ?s ri:recordChangeDate ?date . 
  ?s ri:recordContentSource ?source . 
  ?s ri:recordStatus "revised"^^&lt;http://www.w3.org/2001/XMLSchema#string&gt; . 
}
</pre>
<p><iframe width="600" height="400" scrolling="no" frameborder="no" src="https://www.google.com/fusiontables/embedviz?viz=GVIZ&amp;t=LINE&amp;gco_vAxes=%5B%7B%22title%22%3A%22Records%22%2C+%22minValue%22%3A0%2C+%22maxValue%22%3Anull%2C+%22useFormatFromData%22%3Atrue%2C+%22viewWindowMode%22%3A%22explicit%22%2C+%22viewWindow%22%3A%7B%22max%22%3Anull%2C+%22min%22%3A0%7D%7D%2C%7B%22useFormatFromData%22%3Atrue%2C+%22viewWindowMode%22%3A%22pretty%22%2C+%22viewWindow%22%3A%7B%22max%22%3Anull%2C+%22min%22%3Anull%7D%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%7D%5D&amp;gco_curveType=function&amp;gco_booleanRole=certainty&amp;gco_lineWidth=2&amp;gco_hAxis=%7B%22useFormatFromData%22%3Atrue%2C+%22minValue%22%3Anull%2C+%22maxValue%22%3Anull%2C+%22viewWindow%22%3Anull%2C+%22viewWindowMode%22%3Anull%2C+%22title%22%3A%22Year%22%2C+%22titleTextStyle%22%3A%7B%22color%22%3A%22%23222%22%2C+%22fontSize%22%3A%2212%22%2C+%22italic%22%3Atrue%7D%7D&amp;gco_legend=right&amp;gco_title=LCNAF+Records+Revised&amp;gco_legendTextStyle=%7B%22color%22%3A%22%23222%22%2C+%22fontSize%22%3A%2210%22%7D&amp;containerId=gviz_canvas&amp;isXyPlot=true&amp;q=select+col0%2C+col3%2C+col4%2C+col5%2C+col6%2C+col7%2C+col8%2C+col10%2C+col11%2C+col12%2C+col13%2C+col14%2C+col15%2C+col16%2C+col17%2C+col18%2C+col20%2C+col21%2C+col22%2C+col23%2C+col24%2C+col25%2C+col26%2C+col27%2C+col28%2C+col29%2C+col30+from+1rVZNLnEWzCoSEj9jAs40ZX9WjJ25BnA0GYIuTUo&amp;qrs=+where+col0+%3E%3D+&amp;qre=+and+col0+%3C%3D+&amp;qe=+order+by+col0+asc+limit+30&amp;width=600&amp;height=400"></iframe></p>
<p>I definitely never understood what <a href="http://en.wikipedia.org/wiki/Twin_Peaks">Twin Peaks</a> was about, and I similarly don&#8217;t really know what the twin peaks in this graph signify (2000 and 2008). I guess these were years where there were a lot of coordinated edits? Perhaps some NACO folks who have been around for a few years may know the answer. You can also see in this graph that Princeton University plays a strong role in updating records as well as creating them.</p>
<p>So I&#8217;m not sure I understand the how/when/why of an NAAC any better, but I did learn:</p>
<ul>
<li>EC2 is a big win for quick data munging projects like this. I spent $0.98 with the instance up and running for 3 days.</li>
<li>Filtering ntriples files to what you actually need prior to loading into a triplestore can save time and money.</li>
<li>Working with ntriples is still pretty esoteric, and the options out there for processing a dump of ntriples (or rdf/xml) of LCNAF&#8217;s size are truly slim. If I&#8217;m wrong about this I would like to be corrected.</li>
<li>Google Fusion Tables are a nice way to share data and charts.</li>
<li>It seems like while <a href="http://inkdroid.org/journal/2012/10/04/lcnaf-unix-hack/">more LCNAF records are being created per year</a>, they are being created by a broader base of institutions instead of just LC (who appear to be in decline). I think this is a good sign for NAAC.</li>
<li>Open Data, and Open Data Curators (thanks Kevin) are essential to open, collaborative enterprises.</li>
</ul>
<p>Now I could&#8217;ve made some hideous mistakes here, so in the unlikely event you have the time and inclination I would be interested to hear if you can reproduce these results. If the results confirm or disagree with other views of LCNAF participation I would be interested to see them.</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/10/10/a-look-at-who-makes-the-lcnaf/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>lcnaf unix hack</title>
		<link>http://inkdroid.org/journal/2012/10/04/lcnaf-unix-hack/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=lcnaf-unix-hack</link>
		<comments>http://inkdroid.org/journal/2012/10/04/lcnaf-unix-hack/#comments</comments>
		<pubDate>Fri, 05 Oct 2012 03:02:10 +0000</pubDate>
		<dc:creator>ed</dc:creator>
				<category><![CDATA[libraries]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[unix. lcnaf]]></category>

		<guid isPermaLink="false">http://inkdroid.org/journal/?p=5046</guid>
		<description><![CDATA[I was in a meeting today listening to a presentation about the Library of Congress Name Authority File and I got it into my head to see if I could quickly graph record creation by year. Part of this might&#8217;ve been prompted by sitting next to Kevin Ford, who was multi-tasking by what looked like [...]]]></description>
				<content:encoded><![CDATA[<p>I was in a <a href="http://socialarchive.iath.virginia.edu/NAAC_meeting2_agenda.html">meeting</a> today listening to a presentation about the Library of Congress Name Authority File and I got it into my head to see if I could quickly graph record creation by year. Part of this might&#8217;ve been prompted by sitting next to Kevin Ford, who was multi-tasking by what looked like loading some MARC data into id.loc.gov. I imagine this isn&#8217;t perfect, but I thought it was kind of fun hack that demonstrates what you can get away with on the command line with some open data:</p>

<div class="wp_syntax"><table><tr><td class="code"><pre class="bash" style="font-family:monospace;">  curl http:<span style="color: #000000; font-weight: bold;">//</span>id.loc.gov<span style="color: #000000; font-weight: bold;">/</span>static<span style="color: #000000; font-weight: bold;">/</span>data<span style="color: #000000; font-weight: bold;">/</span>authoritiesnames.nt.skos.gz \
    <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">zcat</span> - \
    <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">perl</span> <span style="color: #660033;">-ne</span> <span style="color: #ff0000;">'/terms\/created&gt; &quot;(\d{4})-\d{2}-\d{2}/; print &quot;$1\n&quot; if $1;'</span> \
    <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">sort</span> \
    <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">uniq</span> <span style="color: #660033;">-c</span> \
    <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">perl</span> <span style="color: #660033;">-ne</span> <span style="color: #ff0000;">'chomp; @cols = split / +/; print &quot;$cols[2]\t$cols[1]\n&quot;;'</span> \
    <span style="color: #000000; font-weight: bold;">&gt;</span> lcnaf-years.tsv</pre></td></tr></table></div>

<p>Which yields a tab delimited file where column 1 is the year and column 2 is the number of records created in that year. The key part is the perl one-liner on line 3 which looks for assertions like this in the ntriples rdf, and pulls out the year:</p>
<pre>
&lt;http://id.loc.gov/authorities/names/n90608287&gt; &lt;http://purl.org/dc/terms/created&gt; "1990-02-05T00:00:00"^^&lt;http://www.w3.org/2001/XMLSchema#dateTime&gt; .
</pre>
<p>The use of <code>sort</code> and <code>uniq -c</code> together is a handy trick my old boss Fred Lindberg taught me, for quickly generating aggregate counts from a stream of values. It works surprisingly well with quite large sets of values, because of all the work that has gone into making <code>sort</code> efficient.</p>
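<p>If the shell idiom is unfamiliar, the same extract-then-count step can be sketched in Python with <code>collections.Counter</code>. This is an illustrative equivalent rather than what I ran: the function name is made up, and the second and third sample lines are invented to show counting.</p>

```python
import re
from collections import Counter

# Same idea as the perl one-liner: grab the year from dcterms:created.
CREATED = re.compile(r'terms/created> "(\d{4})-\d{2}-\d{2}')

def count_years(lines):
    """Return a Counter mapping year -> number of records created."""
    years = Counter()
    for line in lines:
        match = CREATED.search(line)
        if match:
            years[match.group(1)] += 1
    return years

XSD_DT = "^^<http://www.w3.org/2001/XMLSchema#dateTime> ."
PRED = "<http://purl.org/dc/terms/created>"
sample = [
    '<http://id.loc.gov/authorities/names/n90608287> ' + PRED
    + ' "1990-02-05T00:00:00"' + XSD_DT,
    # The next two lines are invented identifiers for illustration.
    '<http://example.org/names/a> ' + PRED + ' "1990-07-19T00:00:00"' + XSD_DT,
    '<http://example.org/names/b> ' + PRED + ' "1997-10-28T00:00:00"' + XSD_DT,
]
years = count_years(sample)
# Sorted (year, count) pairs, like the output of sort | uniq -c.
for year, count in sorted(years.items()):
    print(year + "\t" + str(count))
```

<p>The Counter holds one entry per distinct year, so unlike <code>sort</code> it never needs to hold or spill the full stream of values; for a handful of distinct keys like this, either approach is fine.</p>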
<p>With the TSV in hand I trimmed the pre-1980 values: I think lots of records are attributed to 1980 because that&#8217;s when OPAC came online, and I wasn&#8217;t sure what the dribs and drabs prior to 1980 represented. Then I dropped the data into ye olde chart maker (in this case GoogleDocs) and voilà:</p>
<p><img src="http://inkdroid.org/images/lcnaf-record-creation.png"/></p>
<p>It would be more interesting to see the results broken out by contributing NACO institution, but I don&#8217;t think that data is in the various RDF representations. I don&#8217;t even know if the records contributed by other NACO institutions are included in the LCNAF. I imagine a similar graph is available somewhere else, but it was neat that the availability of the LCNAF data meant I could get a rough answer to this passing question fairly quickly.</p>
<p>The numbers add up to ~7.8 million, which seems within the realm of possible correctness. But if you notice something profoundly wrong with this display please let me know!</p>
]]></content:encoded>
			<wfw:commentRss>http://inkdroid.org/journal/2012/10/04/lcnaf-unix-hack/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

