Broken World

You know that tingly feeling you get when you read something, look at a picture, or hear a song that subtly and effortlessly changes the way you think?

I don’t know about you, but for me thoughts, ideas and emotions can often feel like puzzles that stubbornly demand a solution, until something or someone helps make the problem evaporate or dissolve. Suddenly I can zoom in, out or around the problem, and it is utterly transformed. As that philosophical trickster Ludwig Wittgenstein wrote:

It is not surprising that the deepest problems are in fact not problems at all.

A few months ago, a tweet from Matt Kirschenbaum had this effect on me.

It wasn’t the tweet itself, so much as what the tweet led to: Steven Jackson‘s Rethinking Repair, which recently appeared in the heady sounding Media Technologies Essays on Communication, Materiality, and Society.

I’ve since read the paper three or four times, taken notes, underlined stuff to follow up on, etc. I’ve been meaning to write about it here, but I couldn’t start…I suppose for reasons that are (or will become) self evident. The paper is like a rhizome that brings together many strands of thought and practice that are directly relevant to my personal and professional life.

I’ve spent the last decade or so working as a software developer in the field of digital preservation and archives. On good days this seems like a surprisingly difficult thing, and on the bad days it seems more like an oxymoron…or an elaborate joke.

At home I’ve spent the last year helping my wife start a business to change our culture one set of hands at a time, while trying to raise children in a society terminally obsessed with waste, violence and greed…and a city addicted to power, or at least the illusion of power.

In short, how do you live and move forward amidst such profound brokenness? Today Quinn Norton‘s impassioned Everything is Broken reminded me of Jackson’s broken world thinking, and what a useful hack (literally) it is…especially if you are working with information technology today.

Writing software, especially for the Web, is still fun, even after almost 20 years. It keeps changing, spreading into unexpected places, and the tools just keep evolving, getting better, and more varied. But this same dizzying rate of change and immediacy poses some real problems if you are concerned about stuff staying around so people can look at it tomorrow.

When I was invited to the National Digital Forum I secretly fretted for months, trying to figure out if I had anything of substance to say to that unique blend of folks interested in the cross section of the Web and the cultural heritage. The thing I eventually landed on was taking a look at the Web as a preservation medium, or rather the Web as a process, which has a preservation component to it. In the wrapup I learned that the topic of “web preservation” had already been covered a few years earlier, so there wasn’t much new there … but there was some evidence that the talk connected with a few folks.

If I could do it all again I would totally (as Aaron would say) look at the Web and preservation through Jackson’s prism of broken world thinking.

The bit where I talked about how Mark Pilgrim and Why’s online presence was brought back from virtual suicide using Git repositories, the Internet Archive and a lot of TLC was totally broken world thinking. Verne Harris’ notion that the archive is always just a sliver of a sliver of a sliver of a window into process, and that as such it is extremely, extremely valuable is broken world thinking. Or the evolution of permalinks, cool URIs in the face of swathes of linkrot is at its heart broken world thinking.

The key idea in Jackson’s article (for me) is that there are very good reasons to remain hopeful and constructive while at the same time being very conscious of the problems we find ourself in today. The ethics of care that he outlines, with roots in the feminist theory, is a deeply transformative idea. I’ve got lots of lines of future reading to follow, in particular in the area of sustainability studies, which seems very relevant to the work of digital preservation.

But most of all Jackson’s insight that innovation doesn’t happen in lightbulb moments (the mythology of the Silicon Valley origin story) or the latest tech trend, but in the recognition of brokenness, and the willingness to work together with others to repair and fix it. He positions repair as an ongoing process that fuels innovation:

… broken world thinking asserts that breakdown, dissolution, and change, rather than innovation, development, or design as conventionally practiced and thought about are the key themes and problems facing new media.

I should probably stop there. I know I will return to this topic again, because I feel like a lot of my previous writing here has centered on the importance of repair, without me knowing it. I just wanted to stop for a moment, and give a shout out to some thinking that I’m suspecting will guide me for the next twenty years.

Where Brooklyn At?

As a follow up to my last post I added a script to my fork of Aaron’s py-flarchive that will load up a Redis instance with comments, notes, tags and sets for Flickr images that were uploaded by Brooklyn Museum. The script assumes you’ve got a snapshot of the archived metadata, which I downloaded as a tarball. It took several hours to unpack the tarball on a medium ec2 instance; so if you want to play around and just want the redis database let me know and I’ll get it to you.

Once I loaded up Redis I was able to generate some high level stats:

  • images: 5,697
  • authors: 4,617
  • tags: 6,132
  • machine tags: 933
  • comments: 7,353
  • notes: 963
  • sets: 141

Given how many images there were there it represents an astonishing number of authors: unique people who added tags, comments or notes. If you are curious I generated a list of the tags and saved them as a Google Doc. The machine tags were particularly interesting to me. The majority (849) of them look like Brooklyn Museum IDs of some kind, for example:


But there were also 51 geotags, and what looks like 23 links to items in Pleiades, for example:


If I had to guess I’d say this particular machine tag indicated that the Brooklyn Museum image depicted Abu Simbel. Now there weren’t tons of these machine tags but it’s important to remember that other people use Flickr as a scratch space for annotating images this way.

If you aren’t familiar with them, Flickr notes are annotations of an image, where the user has attached a textual note to a region in the image. Just eyeballing the list, it appears that there is quite a bit of diversity in them, ranging from the whimsical:

  • cool! they look soo surreal
  • teehee somebody wrote some graffiti in greek
  • Lol are these painted?
  • Steaks are ready!

to the seemingly useful:

  • Hunter’s Island
  • Ramesses III Temple
  • Lapland Village
  • Lake Michigan
  • Montuemhat Crypt
  • Napoleon’s troops are often accused of destroying the nose, but they are not the culprits. The nose was already gone during the 18th century.

Similarly the general comments run the gamut from:

  • very nostalgic…
  • always wanted to visit Egypt


  • Just a few points. This is not ‘East Jordan’ it is in the Hauran region of southern Syria. Second it is not Qarawat (I guess you meant Qanawat) but Suweida. Third there is no mention that the house is enveloped by the colonnade of a Roman peripteral temple.
  • The fire that destroyed the buildings was almost certainly arson. it occurred at the height of the Pullman strike and at the time, rightly or wrongly, the strikers were blamed.
  • You can see in the background, the TROCADERO with two towers .. This “medieval city” was built on the right bank where are now buildings in modern art style erected for the exposition of 1937.

Brooklyn Museum pulled over 48 tags from Flickr before they deleted the account. That’s just 0.7% of the tags that were there. None of the comments or notes were moved over.

In the data that Aaron archived there was one indicator of user engagement: the datetime included with comments. Combined with the upload time for the images it was possible to create a spreadsheet that correlates the number of comments with the number of uploads per month:

Brooklyn Museum Flickr Activity

I’m guessing the drop off in December of 2013 is due to that being the last time Aaron archived Brooklyn Museum’s metadata. You can see that there was a decline in user engagement: the peak in late 2008 / early 2009 was never matched again. I was half expecting to see that user engagement fell off when Brooklyn Museum’s interest in the platform (uploads) fell off. But you can see that they continued to push content to Flickr, without seeing much of a reward, at least in the shape of comments. It’s impossible now to tell if tagging, notes or sets trended differently.

Since Flickr includes the number of times each image was viewed it’s possible to look at all the images and see how many times images were viewed, the answer?


Not a bad run for 5,697 images. I don’t know if Brooklyn Museum downloaded their metadata prior to removing their account. But luckily Aaron did.

Glass Houses

You may have noticed Brooklyn Museum’s recent announcement that they have pulled out of Flickr Commons. Apparently they’ve seen a “steady decline in engagement level” on Flickr, and decided to remove their content from that platform, so they can focus on their own website as well as Wikimedia Commons.

Brooklyn Museum announced three years ago that they would be cross-posting their content to Internet Archive and Wikimedia Commons. Perhaps I’m not seeing their current bot, but they appear to have two, neither of which have done an upload since March of 2011, based on their user activity. It’s kind of ironic that content like this was uploaded to Wikimedia Commons by Flickr Uploader Bot and not by one of their own bots.

The announcement stirred up a fair bit of discussion about how an institution devoted to the preservation and curation of cultural heritage material could delete all the curation that has happened at Flickr. The theory being that all the comments, tagging and annotation that has happened on Flickr has not been migrated to Wikimedia Commons. I’m not even sure if there’s a place where this structured data could live at Wikimedia Commons. Perhaps some sort of template could be created, or it could live in Wikidata?

Fortunately, Aaron Straup-Cope has a backup copy of Flickr Commons metadata, which includes a snapshot of the Brooklyn Museum’s content. He’s been harvesting this metadata out of concern for Flickr’s future, but surprise, surprise — it was an organization devoted to preservation of cultural heritage material that removed it. It would be interesting to see how many comments there were. I’m currently unpacking a tarball of Aaron’s metadata on an ec2 instance just to see if it’s easy to summarize.


I’m pretty sure I’m living in one of those.

I agree with Ben:

It would help if we had a bit more method to the madness of our own Web presence. Too often the Web is treated as a marketing platform instead of our culture’s predominant content delivery mechanism. Brooklyn Museum deserves a lot of credit for talking about this issue openly. Most organizations just sweep it under the carpet and hope nobody notices.

What do you think? Is it acceptable that Brooklyn Museum discarded the user contributions that happened on Flickr, and that all the people who happened to be pointing at said content from elsewhere now have broken links? Could Brooklyn Museum instead decided to leave the content there, with a banner of some kind indicating that it is no longer actively maintained? Don’t lots of copies keep stuff safe?

Or perhaps having too many copies detracts from the perceived value of the currently endorsed places of finding the content? Curators have too many places to look, which aren’t synchronized, which add confusion and duplication. Maybe it’s better to have one place where people can focus their attention?

Perhaps these two positions aren’t at odds, and what’s actually at issue is a framework for thinking about how to migrate Web content between platforms. And different expectations about content that is self hosted, and content that is hosted elsewhere?