Paris Review Interviews and Wikipedia

I was recently reading an amusing piece by David Dobbs about William Faulkner being a tough interview. Dobbs has been working through the Paris Review archive of interviews, which are available on the Web. The list of authors is astonishing, and the interviews are great examples of longform writing on the Web.

The 1965 interview with William S. Burroughs really blew me away. So much so that I got to wondering how many Wikipedia articles reference these interviews.

A few years ago, I experimented with a site called Linkypedia for visualizing how a particular website is referenced on Wikipedia. It’s actually pretty easy to write a script to see which Wikipedia articles point at a website, and I’ve done it enough times that it was convenient to wrap it up in a little Python module.

from wplinks import extlinks

# pass the URL of the website you are interested in
for wikipedia_url, website_url in extlinks(''):
    print(wikipedia_url, website_url)
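
For a sense of what a helper like this might do under the hood, here is a minimal sketch against the MediaWiki API's `list=exturlusage` query. It is not the wplinks source: the function names `exturlusage_params`, `parse_exturlusage` and `fetch_links` and the User-Agent string are made up for illustration, and it only fetches a single page of results.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

API = "https://en.wikipedia.org/w/api.php"


def exturlusage_params(domain, limit=500):
    """Build query parameters for the MediaWiki exturlusage list."""
    return {
        "action": "query",
        "list": "exturlusage",
        "euquery": domain,
        "eulimit": str(limit),
        "format": "json",
    }


def parse_exturlusage(response):
    """Yield (wikipedia_url, website_url) pairs from a decoded API response."""
    for hit in response.get("query", {}).get("exturlusage", []):
        title = hit["title"].replace(" ", "_")
        yield "https://en.wikipedia.org/wiki/" + quote(title), hit["url"]


def fetch_links(domain):
    """Fetch one page of results (network call; continuation is not handled)."""
    url = API + "?" + urlencode(exturlusage_params(domain))
    # hypothetical User-Agent; Wikipedia asks clients to identify themselves
    req = Request(url, headers={"User-Agent": "extlinks-sketch/0.1"})
    with urlopen(req) as resp:
        return list(parse_exturlusage(json.load(resp)))
```

Keeping the parsing separate from the network call makes it easy to try the logic on a saved response before pointing it at the live API.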

But I wanted to get a picture not only of which Wikipedia articles pointed at the Paris Review, but also of which Paris Review interviews were not referenced in Wikipedia. So I wrote a little crawler that collected all the Paris Review interviews, and then figured out which ones were pointed at by English Wikipedia.
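
The comparison step boils down to a set difference. Here is a rough sketch of it, assuming the crawl produced a list of interview URLs and the Wikipedia lookup produced (article, target) pairs; the function name `split_by_linked` and the URL normalization rules are my own guesses at what is needed, not what the crawler actually does.

```python
def split_by_linked(interview_urls, wikipedia_links):
    """Partition crawled interview URLs into (linked, unlinked) lists."""

    def normalize(url):
        # Assumption: scheme, "www." and trailing slash do not matter,
        # and lowercasing the whole URL is acceptable for matching.
        url = url.strip().lower()
        for prefix in ("https://", "http://"):
            if url.startswith(prefix):
                url = url[len(prefix):]
        if url.startswith("www."):
            url = url[4:]
        return url.rstrip("/")

    # every URL that some Wikipedia article points at
    cited = {normalize(target) for _, target in wikipedia_links}
    linked = [u for u in interview_urls if normalize(u) in cited]
    unlinked = [u for u in interview_urls if normalize(u) not in cited]
    return linked, unlinked
```

Without some normalization, an `http://` link on Wikipedia would fail to match the same page crawled over `https://`, and the interview would be wrongly reported as unlinked.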

This was also an excuse to learn about JSON-LD, which became a W3C Recommendation a few weeks ago. I wanted to use JSON-LD to serialize the results of my crawling as an RDF graph so I could visualize the connections between authors, their interviews, and each other (via influence links that can be found on DBpedia) using D3’s Force Layout. Here’s a little portion of the larger graph, which you can find by clicking on it.
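
To give a flavor of the JSON-LD side, here is a small sketch of serializing crawl results as a JSON-LD document with an `@context` and an `@graph`. The shape of the input dictionary, the function name `to_jsonld` and the choice of Dublin Core terms are assumptions for illustration; the real data may well use a different vocabulary.

```python
import json

# Assumed vocabulary: Dublin Core terms. "@type": "@id" tells JSON-LD
# processors that the values of isReferencedBy are IRIs, not plain strings.
CONTEXT = {
    "dcterms": "http://purl.org/dc/terms/",
    "title": "dcterms:title",
    "isReferencedBy": {"@id": "dcterms:isReferencedBy", "@type": "@id"},
}


def to_jsonld(interviews):
    """Turn {interview_url: {"title": ..., "wikipedia_articles": [...]}}
    into a JSON-LD document."""
    graph = []
    for url, info in sorted(interviews.items()):
        graph.append({
            "@id": url,
            "title": info["title"],
            "isReferencedBy": sorted(info["wikipedia_articles"]),
        })
    return {"@context": CONTEXT, "@graph": graph}


def dump_jsonld(interviews):
    """Serialize the document as indented JSON text."""
    return json.dumps(to_jsonld(interviews), indent=2)
```

Since the result is plain JSON, a D3 script can read it directly, while an RDF toolkit that understands JSON-LD can still load it as a graph.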

As you can see, it’s a bit of a hairball. If you want to have a go at visualizing the data, the JSON-LD can be found here. The blue nodes are Wikipedia articles, and the white and red nodes are Paris Review interviews. The red ones are interviews that are not yet linked to from Wikipedia. 322 of the 362 interviews are already linked from Wikipedia. Here is the list of the 40 that still need to be linked, in the unlikely event that you are a Wikipedian looking for something to do:

Over coffee, I ran into my friend Dan, who sketched out a better way to visualize the relationships between the writers, the interviews and the time periods. Might be a good excuse to get a bit more familiar with D3 …

@dchud sketch

recent Wikipedia citations as JSON

Here is a little webcast about some work in progress to stream recent citations out of Wikipedia. It builds on previous work I did on the wikichanges Node library. Beware, I say “um” and “uh” a lot while showing you my terminal window. This idea could very well be brain-damaged, since it pings the Wikipedia API for the diff of each change in selected Wikipedias to see if it contains one or more citations. On the plus side, it emits the citations as JSON, which should suit downstream apps of some kind, though I haven’t thought much about those yet. Get in touch if you have some ideas.
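
The citation-spotting step could look roughly like the following. This is a Python sketch of the idea only, not the Node code from the webcast: it assumes you already have the wikitext a change added, looks for `{{cite ...}}` templates with a regular expression, and ignores nested templates and pipes inside wikilinks. The function names `extract_citations` and `emit` are made up.

```python
import json
import re

# Matches simple, non-nested templates such as {{cite web |url=... |title=...}}
CITE_RE = re.compile(r"\{\{\s*cite\s+(\w+)\s*\|([^{}]*)\}\}", re.IGNORECASE)


def extract_citations(added_wikitext):
    """Return one dict per cite template found in the added wikitext."""
    citations = []
    for match in CITE_RE.finditer(added_wikitext):
        citation = {"template": match.group(1).lower()}
        for part in match.group(2).split("|"):
            if "=" in part:
                # split on the first "=" only, so URLs with query strings survive
                key, value = part.split("=", 1)
                citation[key.strip()] = value.strip()
        citations.append(citation)
    return citations


def emit(citations):
    """Format citations as newline-delimited JSON, one object per line."""
    return "\n".join(json.dumps(c, sort_keys=True) for c in citations)
```

One JSON object per line keeps the output easy to pipe into other tools, which fits the streaming nature of the changes feed.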