Archive for the ‘Uncategorized’ Category

perl is my mood ring

Wednesday, May 21st, 2008

Every day for the past 8 years (give or take), cron has run a little script to change my Desktop background image to the astronomy picture of the day.

I logged in today, and this is what I got:

[image: desktop 2008-05-21]

I realize it’s Gliese 876d, but I took it as a statement about the current state of my psyche. Some days are just like that…

WeibelNumber 3

Sunday, February 12th, 2006

So I have succumbed to infection (thanks Ross), but I’m not entirely sure how this is supposed to work…and Kesa said “Don’t Do It!”. After reading Morbus rail about it, I’m almost afraid that I won’t be able to lurk on #swhack anymore if I spread this any further. But it’s already gutted the interwebs pretty well by now, so here goes anyhow. I’m just pleased to have a WeibelNumber of 3.

Jobs I’ve had:
- pizza delivery man
- food coop worker
- used bookstore employee
- newspaper kiosk attendant

Movies I can watch over and over:
- Brazil
- Finding Nemo
- Eternal Sunshine of the Spotless Mind
- Lord of the Rings

TV shows I love to watch:
- Jim Lehrer News Hour
- Mystery
- Nova
- Frontline

Places I have lived:
- Princeton Jct, NJ
- New York, New York
- Brighton, England
- Urbana, IL

Places I have been on holiday:
- Cinque Terre
- Montreal
- Kalymnos
- Amsterdam

4 of my favorite dishes:
- Italian Wedding Soup
- Any kind of curry
- Cracklin’ Oat Bran
- Oranges in winter

4 books (just 4 that I’m currently reading–favorite pshaw!):
- Agile Web Development with Rails
- A Mathematical Mystery Tour: Discovering the Truth and Beauty of the Cosmos
- The Algebraist
- Data Mining

Websites I visit daily:
- gmail
- google
- delicious
- unalog

Places I would rather be right now:
- at home with kesa and chloe
- on vacation
- outside of the United States of America
- outside of the United States of America

Bloggers I am tagging (and hope that they miss this entry):
- Brian Cassidy
- Jason Gessner
- Mark Jordan
- Jeff Barry

testing, testing, is this thing on?

Thursday, January 26th, 2006

Marcel Molina on testing in Rails

Begins: Mon, 06 Feb 2006 at 6:30 PM

Ends: Mon, 06 Feb 2006 at 10:00 PM

Location:

Thoughtworks

651 W. Washington Blvd

Chicago, IL 60661

USA

Link: meetup page

I’m looking forward to attending this talk by Marcel Molina of 37Signals on testing in Rails. One of the things that has impressed me the most about Rails so far is how test stubs are automatically written out for you when you generate classes. I also really like how fixtures can load test data for each test…and the custom assertions rock. Anyhow, hopefully I will find the time to attend this the week before I head out to Oregon.

search @ delicious and the bbc

Wednesday, November 9th, 2005

I just noticed that del.icio.us now has full, fast search across all content (not just your own bookmarks). This is something that Dan’s unalog has had over delicious for a while (apart from the delightful content). Dan uses pylucene as his search engine, so unalog still has some interesting features of its own. It’s pretty wild being able to search across all the delicious content, given their volume.

When delicious was really ramping up I saw the occasional mason error page, so I know that they are (or were) using Perl. This makes me really curious to know what search technology they are using…but I couldn’t find any details in the announcement.

Likewise, the news about the BBC Programme Catalogue being built with RubyOnRails has me curious about what they use for search. I’ve really come to appreciate Lucene and PyLucene and am in search of similar search tools for Ruby. I’ve got an email out to Matt Biddulph to see if he can provide any details about the BBC effort.

File under m for megalomania

Thursday, September 1st, 2005

Google Announces Plan To Destroy All Information It Can’t Index.

Although Google executives are keeping many details about Google Purge under wraps, some analysts speculate that the categories of information Google will eventually index or destroy include handwritten correspondence, buried fossils, and private thoughts and feelings.

Seriously, many a truth is said in jest. With the news that Google is going to be selling another 4 billion dollars’ worth of shares, it makes sense that they would be thinking of a purging program to balance out their binging. What are they going to do with 4 billion dollars? I can’t even begin to imagine. It is, frankly, a bit frightening, and seems like behavior one might read about in the DSM IV.

MARC::Record v2.0 RC1

Friday, May 20th, 2005

Thanks to the support of Anne Highsmith at Texas A&M, MARC::Record v2.0 RC1 was released today to sourceforge. This new version of MARC::Record addresses the use of Unicode in MARC records. There has been a long-standing bug in MARC::Record which caused it to calculate record directories incorrectly when the records contained Unicode. This isn’t hitting CPAN yet so that the people who want Unicode handling can take it for a test drive first. As noted previously, this Perl/Unicode stuff is pretty tricky since most of the time the encoding of a scalar variable is sort of hidden from view. I’d much prefer to be in a situation like in Java where all strings are Unicode.

MARC, Perl and Unicode

Thursday, May 5th, 2005

I’ve been doing some work for Texas A&M, who need a MARC::Record module that is Unicode safe. Many ILS vendors are moving away from MARC-8 encoded records towards Unicode. No doubt this move is being spurred on by big players like OCLC who are moving (or have moved) their mammoth WorldCat database to Unicode.

At any rate, Texas A&M have workflows that use MARC::Record for transforming records in their catalog, and they need the Unicode support for their new Voyager system. Technically there were very few places where MARC::Record needed to be adjusted. The problem is that the antiquated transmission format for MARC records uses byte lengths in the so-called directory as offsets into the record. MARC::Record uses length() and substr() to create and work with the directory…which works fine when 1 character equals 1 byte. However, a Unicode character encoded as UTF-8 can take multiple bytes…so the character-oriented length() will create faulty record directories, and substr() will extract data from the rest of the record incorrectly.
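To make the directory problem concrete, here is a minimal sketch in Python (not the MARC::Record code itself). It assumes the standard 12-character directory entry (3-character tag, 4-digit field length, 5-digit starting position) and a field length that includes the 0x1E field terminator. The build_directory() helper and the sample fields are made up for illustration.

```python
FIELD_TERMINATOR = "\x1e"  # MARC field terminator


def build_directory(fields, use_bytes=True):
    """Build a MARC-style directory string for (tag, data) pairs.

    Hypothetical helper for illustration only. With use_bytes=False it
    measures characters, which mimics the faulty behavior described above.
    """
    entries = []
    offset = 0
    for tag, data in fields:
        raw = data + FIELD_TERMINATOR
        if use_bytes:
            length = len(raw.encode("utf-8"))  # length in bytes
        else:
            length = len(raw)  # length in characters
        # tag (3 chars) + field length (4 digits) + start position (5 digits)
        entries.append(f"{tag}{length:04d}{offset:05d}")
        offset += length
    return "".join(entries)


# "G\u00f6del" contains o-umlaut: one character, two bytes in UTF-8.
fields = [("100", "G\u00f6del, Kurt"), ("245", "Collected works")]

char_dir = build_directory(fields, use_bytes=False)
byte_dir = build_directory(fields, use_bytes=True)

print(char_dir)
print(byte_dir)
```

The two directories disagree on the length of the first field, and that one-byte error carries into the starting position of every field after it.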

Fortunately there is the bytes pragma, which provides byte-oriented versions of various character-oriented Perl functions. Unfortunately these functions were added to Perl relatively recently, so this new version of MARC::Record will require Perl >= v5.8.2. Technically it could run on 5.8.1; however, I found that the 5.8.1 that ships with OS X 10.3 lacks bytes::substr(). Not only that, but if you try to call a nonexistent function in the bytes namespace you’ll go into an infinite loop. This is the case even with Perl 5.8.6.
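As a rough analogy for the byte-oriented approach (in Python rather than Perl, so this is not bytes::substr() itself), the sketch below slices the encoded record data by byte offsets and only decodes afterwards. The extract_field() helper, the sample data, and the directory entries are assumptions for illustration.

```python
FIELD_TERMINATOR = b"\x1e"  # MARC field terminator, as a byte


def extract_field(data: bytes, entry: str) -> str:
    """Read one field out of encoded record data using a directory entry.

    Hypothetical helper: entry is a 12-character directory entry made of
    tag (3), field length in bytes (4), and starting byte position (5).
    """
    length = int(entry[3:7])
    start = int(entry[7:12])
    raw = data[start:start + length]  # byte-oriented slice
    return raw.rstrip(FIELD_TERMINATOR).decode("utf-8")


# Two fields, each followed by a terminator, encoded as UTF-8.
data = "G\u00f6del, Kurt\x1eCollected works\x1e".encode("utf-8")

# Entries computed from byte lengths.
author = extract_field(data, "100001300000")
title = extract_field(data, "245001600013")

# An offset computed from character counts (12 instead of 13) starts the
# slice one byte early, on the previous field's terminator.
misread = data[12:12 + 16]

print(author)
print(title)
print(misread)
```

Keeping the data as bytes until after the slice is what makes the directory offsets line up, which is the same idea as asking Perl for byte semantics.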

All in all, I really have come to dislike Perl’s Unicode support. The magical utf8 flag on scalars has a tendency to pop on and off for obscure reasons. And I’ve found the behavior of bytes::length() to be a bit unpredictable. Surely this is because I don’t fully understand the mechanics involved, but judging from the traffic on perl-unicode I’m not the only one who has struggled with it. My experience using Unicode in Java and Python has been much more pleasant, and really confirms my decision to move towards doing new work in these languages. Perl has served me well, and there are some things I really love about the language, but these nasty corners are a bit scary.

Hello WordPress

Sunday, April 24th, 2005

Hello WordPress, bye bye custom blog code written in Perl. Well, the old code is still running, but I’ve wanted to install WordPress for the past few months and finally got around to it this weekend. I had a little bit of trouble getting PHP installed, only because I decided to use the older php4 with the latest mysql, and php4 didn’t seem to want to configure itself using the latest mysql. Fortunately using php5 was a different story and WordPress was a breeze to install.

My reasons for switching from my homegrown code to WordPress are several.

  • there was really no way of commenting on stories, only adding them.
  • the old code didn’t really archive or categorize stories the way I wanted it to.
  • links to stories didn’t work, and I wanted to join dan’s Planet #code4lib.
  • I didn’t use the RSS aggregation features I wrote since I started using Bloglines.
  • I’ve been coding more in Python these days and don’t feel particularly tied to my Perl code base any longer. WordPress is PHP, which I’m not a huge fan of, but I think this has more to do with the PHP that I was exposed to than with the language itself. Installing WordPress and the various plugins like the audioscrobbler one you see to the right was very pleasant.
  • the WordPress community is extremely rich. I spent some time with Kesa looking at different themes, but in the end decided to stay with the default for now. There are tons of neat plugins to look at.

So what you can expect here is more of the same. I’m going to try to write more about my work as a programmer, mainly as a journal for myself to keep track of what I’m working on, where I’ve been, and where I’d like to go. Perhaps you are thinking spare me the details, where are the pictures of Chloe?! If this is the case you should see a link to the photos over on the right.