Software

I write software for libraries and archives. I try to live up to at least the former in Release Early, Release Often, so most of my code musings can be found on GitHub. The idea is that things incubate there, and if they ever become anything more than a plaything they migrate to a place like CPAN, RubyForge, or the Python Cheeseshop.

Here are some things I’ve worked on in the past that are on GitHub (courtesy of Kenny Katzgrau’s handy GitHub/BitBucket Project Lister for WordPress):

  • wikistream

    displays edit activity on wikipedia (156 watchers)

  • pymarc

    process MARC records from Python (51 watchers)

  • microdata

    python library for extracting html5 microdata (25 watchers)

  • linkypedia

    a web based tool to monitor how your website content is used in wikipedia (24 watchers)

  • rdflib-microdata

    an rdflib plugin to parse html5 microdata (21 watchers)

  • bagit

    create BagIt style packages of digital content (21 watchers)

  • ruby-oai

    a Ruby library for building OAI-PMH clients and servers (15 watchers)

  • dflat

    an implementation of the dflat and redd specifications from CDL for versioning of digital objects (15 watchers)

  • opensearch

    A python opensearch client (13 watchers)

  • wikipulse

    a gauge widget to display wikipedia activity (11 watchers)

  • lod-graph

    A protovis visualization of the linked open data cloud. (10 watchers)

  • ptree

    minimal PairTree implementation (9 watchers)

  • empirical-cloud

    a little demo visualization of owl:sameAs links in billion triple challenge data (8 watchers)

  • nytimestream

    NYTimes Newswire API as a stream using node.js (7 watchers)

  • geonames-localsolr

    A little project to help bootstrap a local-solr instance with geonames data. (7 watchers)

  • dev8d-linked-data

    some experiments with linked data available from the dev8d conference (6 watchers)

  • wikitweets

    see tweets that reference wikipedia articles (6 watchers)

  • www-wikipedia

    Simple Perl client for grabbing content out of Wikipedia (6 watchers)

  • dewey-crawler

    simplistic crawler and serializer for linked data at dewey.info (5 watchers)

  • lcsh-subset

    create a subset view of LCSH (5 watchers)

  • wikitrends

    see most viewed wikipedia articles (5 watchers)

  • paperbot

    twitter bot for Chronicling America (5 watchers)

  • chronam-widget

    view on NDNP content using just HTML/JavaScript and the Chronicling America API (5 watchers)

  • cql-parser

    A Perl module for working with the Common Query Language (4 watchers)

  • skosdict

    turn a SKOS concept scheme into a simple JSON dictionary (4 watchers)

  • libweb

    extract library homepage urls from LIBWEB (4 watchers)

  • bisac

    top level BISAC subject vocabulary (4 watchers)

  • lochief

    A linked-data version of kochief (4 watchers)

  • europeana-crawler

    a simple crawler of the RDFa in Europeana (4 watchers)

  • data-gov-uk-harvester

    tiny little project to harvest rdfa metadata from data.gov.uk (3 watchers)

  • lcsh-index

    a simple example of putting LCSH into a Solr index (3 watchers)

  • ocropy

    minimalist wrapper around ocropus for generating hOCR documents from images (3 watchers)

  • restful-bag-server

    Draft of proposed structure for serving BagIt repositories RESTfully (3 watchers)

  • sru-perl

    A Perl module for interacting with Search and Retrieve by URL servers (3 watchers)

  • subjects-here

    An HTML5 experiment that uses OCLC's mapFast to look up subjects for your current location. (3 watchers)

  • oai2pairtree

    command line utility to dump records in an oai-pmh repository as xml in a pairtree (3 watchers)

  • namaste

    Python port of the Namaste Perl module, "which implements the Namaste (Name as Text) convention for containing a data element completely within the content of a file, using as filename an approximation of the value preceded by a numeric tag." (3 watchers)

  • google-count

    hack to count google hits (3 watchers)

  • pairtree

    Python Pairtree implementation (3 watchers)

  • worksvenn

    generate a Venn diagram for LibraryThing, OCLC and OpenLibrary FRBRization services (3 watchers)

  • lc-findingaids

    (3 watchers)

  • databib-metadata

    example html/metadata examples for databib (2 watchers)

  • django-sugar

    Curated collection of all the sweet Django helpers/utilities developers create, and sometimes recreate too often. (2 watchers)

  • twitterator

    iterator functions for twitter api (2 watchers)

  • mediatypes

    A project that harvests media type information from the IANA registry, and publishes information as linked data using the Google App Engine. (2 watchers)

  • oai2xmpp

    oai-pmh -> xmpp (2 watchers)

  • versioning-metrics

    little utility to compare approaches to version control (2 watchers)

  • ohh

    share content, have fun, make friends (2 watchers)

  • lldvis

    LLD Visualiser (1 watcher)

  • redis

    Redis key-value store (1 watcher)

  • webarchives

    see if a URL is available in a web archive somewhere on the web (1 watcher)

  • marc-detrans

    Perl de-transliteration engine for converting romanized text in bibliographic data to native scripts. (1 watcher)

  • bagit-ruby

    Ruby Library and Command Line tools for bagit (1 watcher)

  • marc-subjectmap

    perl framework for translating subject headings in MARC data (1 watcher)

  • Socket.IO

    Sockets for the rest of us (1 watcher)

  • muldicat

    tool to generate SKOS for the Multilingual Dictionary of Cataloging Terms and Concepts (1 watcher)

  • django-pagination

    A set of utilities for creating robust pagination tools throughout a django application. (1 watcher)

  • beat

    little experiment to look at links in LC bibliographic data (1 watcher)

  • lcco

    Converts a textual representation of the Library of Congress Classification Outline into SKOS/RDF and makes it available on the Web in a hierarchical viewer. (1 watcher)

  • bootstrap

    CSS toolkit from Twitter (1 watcher)

  • toxic-bags

    a collection of BagIt test data (1 watcher)

  • south-test

    just a throwaway demo app (1 watcher)

  • requests

    Python HTTP Requests for Humans. (1 watcher)

  • NativeImaging

    Experimental PIL-like interface for basic functionality using platform native libraries such as GraphicsMagick (1 watcher)

  • python-sitemap

    Python library for parsing & generating sitemaps (1 watcher)

  • alto-words

    simplistic calculation of the ratio of dictionary words to all words in a METS Alto OCR file (1 watcher)

  • django-tastypie

    Creating delicious APIs for Django apps since 2010. v1.0.0-beta (1 watcher)

  • fastcat

    navigate wikipedia categories quickly in a local redis instance (1 watcher)

  • python-oauth2

    A fully tested, abstract interface to creating OAuth clients and servers. (1 watcher)

  • microdata_schemaorg_example

    Step by step example of applying Microdata and Schema.org vocabularies to a digital collections site. (1 watcher)

  • aotycmp

    hack to see what well reviewed albums-of-the-year are available on Spotify and Rdio (1 watcher)

  • wordpressure

    realtime view of new items posted to WordPress sites (1 watcher)

  • id

    LCSH SKOS webapp (1 watcher)

  • ckanext-storage

    CKAN storage extension. (1 watcher)

  • inkdroid-proxy

    my node.js proxy server (1 watcher)

  • inkdroid-apache

    my config files for apache (1 watcher)

  • collection

    Cooper-Hewitt's Collection Database (1 watcher)

  • node

    evented I/O for v8 javascript (1 watcher)

  • semantictweet

    A simple Sinatra application that provides a FOAF semantic web feed of your twitter friends and followers. Forked from sinatra-template. (1 watcher)
