I write software for libraries and archives. I try to live up to at least the former in Release Early, Release Often, so most of my code musings can be found on GitHub. The idea is that things incubate there, and if they ever become more than a plaything they migrate to a place like CPAN, RubyForge, or the Python Cheeseshop.
Here are some things I’ve worked on in the past that are on GitHub (courtesy of Kenny Katzgrau’s handy GitHub/BitBucket Project Lister for WordPress):
-
wikistream
displays edit activity on wikipedia (156 watchers)
-
pymarc
process MARC records from Python (51 watchers)
-
microdata
python library for extracting html5 microdata (25 watchers)
-
linkypedia
a web based tool to monitor how your website content is used in wikipedia (24 watchers)
-
rdflib-microdata
an rdflib plugin to parse html5 microdata (21 watchers)
-
bagit
create BagIt style packages of digital content (21 watchers)
-
ruby-oai
a Ruby library for building OAI-PMH clients and servers (15 watchers)
-
dflat
an implementation of the dflat and redd specifications from CDL for versioning of digital objects (15 watchers)
-
opensearch
A python opensearch client (13 watchers)
-
wikipulse
a gauge widget to display wikipedia activity (11 watchers)
-
lod-graph
A protovis visualization of the linked open data cloud. (10 watchers)
-
ptree
minimal PairTree implementation (9 watchers)
-
empirical-cloud
a little demo visualization of owl:sameAs links in billion triple challenge data (8 watchers)
-
nytimestream
NYTimes Newswire API as a stream using node.js (7 watchers)
-
geonames-localsolr
A little project to help bootstrap a local-solr instance with geonames data. (7 watchers)
-
dev8d-linked-data
some experiments with linked data available from the dev8d conference (6 watchers)
-
wikitweets
see tweets that reference wikipedia articles (6 watchers)
-
www-wikipedia
Simple Perl client for grabbing content out of Wikipedia (6 watchers)
-
dewey-crawler
simplistic crawler and serializer for linked data at dewey.info (5 watchers)
-
lcsh-subset
create a subset view of LCSH (5 watchers)
-
wikitrends
see most viewed wikipedia articles (5 watchers)
-
paperbot
twitter bot for Chronicling America (5 watchers)
-
chronam-widget
view on NDNP content using just HTML/JavaScript and the Chronicling America API (5 watchers)
-
cql-parser
A Perl module for working with the Common Query Language (4 watchers)
-
skosdict
turn a SKOS concept scheme into a simple JSON dictionary (4 watchers)
-
libweb
extract library homepage urls from LIBWEB (4 watchers)
-
bisac
top level BISAC subject vocabulary (4 watchers)
-
lochief
A linked-data version of kochief (4 watchers)
-
europeana-crawler
a simple crawler of the RDFa in Europeana (4 watchers)
-
data-gov-uk-harvester
tiny little project to harvest rdfa metadata from data.gov.uk (3 watchers)
-
lcsh-index
a simple example of putting lcsh into a Solr index (3 watchers)
-
ocropy
minimalist wrapper around ocropus for generating hOCR documents from images (3 watchers)
-
restful-bag-server
Draft of proposed structure for serving BagIt repositories RESTfully (3 watchers)
-
sru-perl
A Perl module for interacting with Search and Retrieve by URL servers (3 watchers)
-
subjects-here
An HTML5 experiment that uses OCLC's mapFast to lookup subjects for your current location. (3 watchers)
-
oai2pairtree
command line utility to dump records in an oai-pmh repository as xml in a pairtree (3 watchers)
-
namaste
Python port of the Namaste Perl module, "which implements the Namaste (Name as Text) convention for containing a data element completely within the content of a file, using as filename an approximation of the value preceded by a numeric tag." (3 watchers)
-
google-count
hack to count google hits (3 watchers)
-
pairtree
Python Pairtree implementation (3 watchers)
-
worksvenn
generate a Venn diagram for LibraryThing, OCLC and OpenLibrary FRBRization services (3 watchers)
-
lc-findingaids
(3 watchers)
-
databib-metadata
html/metadata examples for databib (2 watchers)
-
django-sugar
Curated collection of all the sweet Django helpers/utilities developers create, and sometimes recreate too often. (2 watchers)
-
twitterator
iterator functions for twitter api (2 watchers)
-
mediatypes
A project that harvests media type information from the IANA registry, and publishes information as linked data using the Google App Engine. (2 watchers)
-
oai2xmpp
oai-pmh -> xmpp (2 watchers)
-
versioning-metrics
little utility to compare approaches to version control (2 watchers)
-
ohh
share content, have fun, make friends (2 watchers)
-
lldvis
LLD Visualiser (1 watcher)
-
redis
Redis key-value store (1 watcher)
-
webarchives
see if a URL is available in a web archive somewhere on the web (1 watcher)
-
marc-detrans
Perl de-transliteration engine for converting romanized text in bibliographic data to native scripts. (1 watcher)
-
bagit-ruby
Ruby Library and Command Line tools for bagit (1 watcher)
-
marc-subjectmap
perl framework for translating subject headings in MARC data (1 watcher)
-
Socket.IO
Sockets for the rest of us (1 watcher)
-
muldicat
tool to generate SKOS for the Multilingual Dictionary of Cataloging Terms and Concepts (1 watcher)
-
django-pagination
A set of utilities for creating robust pagination tools throughout a django application. (1 watcher)
-
beat
little experiment to look at links in LC bibliographic data (1 watcher)
-
lcco
Converts a textual representation of the Library of Congress Classification Outline into SKOS/RDF and makes it available on the Web in a hierarchical viewer. (1 watcher)
-
bootstrap
CSS toolkit from Twitter (1 watcher)
-
toxic-bags
a collection of BagIt test data (1 watcher)
-
south-test
just a throw away demo app (1 watcher)
-
requests
Python HTTP Requests for Humans. (1 watcher)
-
NativeImaging
Experimental PIL-like interface for basic functionality using platform native libraries such as GraphicsMagick (1 watcher)
-
python-sitemap
Python library for parsing & generating sitemaps (1 watcher)
-
alto-words
simplistic calculation of the ratio of dictionary words to all words in a METS Alto OCR file (1 watcher)
-
django-tastypie
Creating delicious APIs for Django apps since 2010. v1.0.0-beta (1 watcher)
-
fastcat
navigate wikipedia categories quickly in a local redis instance (1 watcher)
-
python-oauth2
A fully tested, abstract interface to creating OAuth clients and servers. (1 watcher)
-
microdata_schemaorg_example
Step by step example of applying Microdata and Schema.org vocabularies to a digital collections site. (1 watcher)
-
aotycmp
hack to see what well reviewed albums-of-the-year are available on Spotify and Rdio (1 watcher)
-
wordpressure
realtime view of new items posted to WordPress sites (1 watcher)
-
id
LCSH SKOS webapp (1 watcher)
-
ckanext-storage
CKAN storage extension. (1 watcher)
-
inkdroid-proxy
my node.js proxy server (1 watcher)
-
inkdroid-apache
my config files for apache (1 watcher)
-
collection
Cooper-Hewitt's Collection Database (1 watcher)
-
node
evented I/O for v8 javascript (1 watcher)
-
semantictweet
A simple Sinatra application that provides a FOAF semantic web feed of your twitter friends and followers. Forked from sinatra-template. (1 watcher)