pypi over xmlrpc

It’s great to see that our ChiPy sprint bore some fruit for the PyPI service. There’s now decent XMLRPC support in PyPI for querying packages. Hopefully this will open the door for the kinds of utilities that abound in the Perl/CPAN world…like this very simple client for listing packages:

#!/usr/bin/env python

# list every package registered on PyPI via its XML-RPC interface
# (endpoint URL is an assumption; list_packages is part of PyPI's XML-RPC API)

import xmlrpclib

server = xmlrpclib.ServerProxy('http://www.python.org/pypi')
for package in server.list_packages():
    print package

pascal's triangle in python

I mentioned Pascal’s Triangle in the previous post, and after typing in the Oz code decided to make a Pascal’s Triangle pretty printer in python.

from sys import argv

def pascal(n):
    if n == 1:
        return [ [1] ]
    result = pascal(n-1)
    lastRow = result[-1]
    result.append( [ (a+b) for a,b in zip([0]+lastRow, lastRow+[0]) ] )
    return result

def pretty(tree):
    if len(tree) == 0: return ''
    line = '  ' * len(tree)
    for cell in tree[0]:
        line += '  %2i' % cell
    return line + "\n" + pretty(tree[1:])

if __name__ == '__main__':
    print pretty( pascal( int(argv[1]) ) ) 

Which, when run with an argument of 9, generates something like this:

biblio:~/Projects/bookclub ed$ python 9
                     1
                   1   1
                 1   2   1
               1   3   3   1
             1   4   6   4   1
           1   5  10  10   5   1
         1   6  15  20  15   6   1
       1   7  21  35  35  21   7   1
     1   8  28  56  70  56  28   8   1 

It’s been fun reading up on the uses for Pascal’s triangle, although I imagine this is old hat for people more familiar with math than I am. Still, I think getting through this tome will be time well spent in the long run.

chipy bookclub

So the Chicago Python Group started up a bookclub about a month ago. The first book we’re reading as a group is Concepts, Techniques, and Models of Computer Programming, which is fortunately available online for free. The aim of the bookclub (as with many bookclubs) is to work through a text together and hopefully hear different perspectives during discussions, which will happen online and after our monthly meetings. Also, a bit of peer pressure can help in making it through certain types of books…

And this first book is a doozy at 939 pages. It covers all sorts of computer science territory using a multi-paradigm language called Oz. I’ve made it through the preface, and into Chapter 1, which starts out teaching some fundamental concepts behind functional programming. The jury is still out, but so far I’m finding the content refreshingly clear and stimulating. I like the fact that mathematical notation (so far) is explained and not taken for granted. Calculating factorial with recursion is a bit predictable, but chapter 1 quickly moved on to an algorithm that calculates a given row of Pascal’s Triangle.
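The same zip trick from the pretty printer above can compute a single row directly, without building the whole triangle. A quick sketch (pascal_row is my name for it, not the book’s):

```python
def pascal_row(n):
    """Return row n (1-based) of Pascal's Triangle, built iteratively."""
    row = [1]
    for _ in range(n - 1):
        # pad each side with a zero and add the shifted copies pairwise
        row = [a + b for a, b in zip([0] + row, row + [0])]
    return row

print(pascal_row(5))  # [1, 4, 6, 4, 1]
```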

The sheer magnitude of the book is a bit intimidating; however, reading it on my iBook makes it easy to ignore that. I’m thinking of it as a sort of thematic encyclopedia of computer programming with handy illustrations. Hopefully I’ll find the time to drop my thoughts here as I work my way through each chapter. Please feel free to join us (whether you’re from Chicago or not) if you are interested.

Update to SRU and CQL::Parser

If you are tracking it, you might be interested to know that Brian Cassidy added a Catalyst plugin to the SRU CPAN module. Catalyst is an MVC framework that is getting quite a bit of mindshare in the Perl community (at least the small subset I hang out with in #code4lib). And if that wasn’t enough, Brian also committed some changes to CQL::Parser that provide toLucene() functionality for converting CQL queries into queries that can be passed off to Lucene. Thanks Brian!

Net::OAI::Harvester v1.0

I got an email from Thorsten Schwander at LANL about a bug in Net::OAI::Harvester when using a custom metadata handler with the auto-resumption token handling code. This was the first I’d heard about anyone using the custom metadata handling feature in N:O:H, so I was pleased to hear about it. Thorsten was kind enough to send a patch, so a new version is on its way around the CPAN mirrors. While it’s hardly a major change, this is bumping the version from 0.991 to 1.0. It’s been over 2 years since N:O:H was first released, and it’s been pretty stable for the past year.

google's map api

Adrian pointed out at the last chipy meeting that a formal API for GoogleMaps was in the works…but I had no idea it was this close.

After you’ve got an authentication key for your site’s directory, embedding a map in your page takes three steps: include a javascript library source URL directly from google, create a <div> tag with an id (say “map”), and add some javascript to your page.

    var map = new GMap( document.getElementById("map") );
    map.addControl( new GSmallMapControl() );
    map.centerAndZoom( new GPoint(-88.316385,42.247090), 4);                                                                    

This took literally 2 minutes to do, if that. It’s a bit tedious that the token is only good on a per-directory basis, but I guess this is because of partitioned blogging sites where different users have different directories with the same hostname.

update: I guess I’m not the only one who finds the per-directory limit to be kind of a hassle.


Well, I took the plunge and installed the latest version of OS X. I’m actually posting this blog entry with a WordPress dashboard plugin. I backed up my mail, addressbook and calendar and did a clean install. I was a bit nervous that I forgot to back up everything I needed…but it was also kind of refreshing starting with a clean slate. I’ve got the latest versions of Perl and Python building in the background now, and everything so far seems pretty smooth. I hope to take a closer look at dashboard widgets sometime soon.

lightning strikes

Chris has a nice writeup about last night’s ChiPy lightning talks. There were tons of interesting people there with very interesting projects. Apart from the announcement that we might be hosting PyCon next year in Chicago, the highlight of the evening for me was hearing about the amazing data hack that is ChicagoCrime. Adrian is a journalist/programmer who managed to glue GoogleMaps together with publicly available data from the Chicago Police Department. The main (perhaps unintended) things I took from his enthusiastic and humorous talk were:

  • screen scraping is fragile, but it’s an important lever for fostering more elegant/robust information sharing.
  • screen scraping is fragile, but it’s also important for building new public applications that aren’t run behind closed doors at the Department of Homeland Security.
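The kind of scrape Adrian described really can be that simple, and that brittle. Here’s a toy sketch over invented markup (no real Chicago Police Department page looks like this):

```python
import re

# invented HTML of the sort a public agency's report page might serve
html = """
<table>
<tr><td>THEFT</td><td>2200 N CLARK ST</td></tr>
<tr><td>BATTERY</td><td>100 W DIVISION ST</td></tr>
</table>
"""

# a deliberately naive scrape: pull each row apart with a regex; this is
# exactly the sort of pattern that breaks the moment the markup changes
rows = re.findall(r'<tr><td>(.*?)</td><td>(.*?)</td></tr>', html)
for crime, block in rows:
    print(crime, 'at', block)
```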

I really, really want to get going on the GovTrack data scraping now.


GovTrack has done some awesome work generating publicly available machine readable data for US government information. After the last election I decided that I really wanted to get involved in some sort of volunteer technology/political activity, so I started googling and found GovTrack pretty much just starting up. Now there is a loose affiliation of similar sites (including GovTrack) called Ogdex that is attempting to foster the collection of publicly available government information. In particular, there has been some talk on the govtrack discussion list about local efforts to add state data to the collection of federal data…and even bounties for getting state data collection going. I’m going to take a stab at writing some scraping utilities for gathering together Illinois data and will report back on how it goes. If you are interested in helping out, details are available.

Update: Joshua just set up a new drupal site for govtrack development.


I’m going to be doing a lightning talk tonight at the Chicago Python Group about pylucene, which essentially lets you use the popular Lucene indexing library (written in Java) from Python. No time limit has been set for the lightning talks (and mjd won’t be there with his gong), but I hope to cover how to index an mbox with pylucene in 5 minutes. There are slides, which are there mainly as cue cards.
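In that spirit, the mbox-walking half of the recipe can be sketched with the standard library’s mailbox module; the Lucene calls are left as comments, since the exact PyLucene class names are my assumption (cribbed from the Java Lucene API), not the talk’s actual code:

```python
import mailbox

def iter_messages(mbox_path):
    """Yield (subject, sender, body) for each message in a Unix mbox file."""
    for msg in mailbox.mbox(mbox_path):
        body = msg.get_payload()
        if isinstance(body, list):      # multipart: just take the first part
            body = body[0].get_payload()
        yield msg.get('Subject', ''), msg.get('From', ''), body

def index_mbox(mbox_path):
    # With PyLucene, each message would become a Lucene Document, roughly
    # (class and method names assumed from the Java Lucene API):
    #   writer = IndexWriter('index', StandardAnalyzer(), True)
    #   for subject, sender, body in iter_messages(mbox_path):
    #       doc = Document()
    #       doc.add(Field.Text('subject', subject))
    #       doc.add(Field.Text('from', sender))
    #       doc.add(Field.Text('body', body))
    #       writer.addDocument(doc)
    #   writer.close()
    return list(iter_messages(mbox_path))
```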