testing, testing, is this thing on?

Marcel Molina on testing in Rails

Begins: Mon, 06 Feb 2006 at 6:30 PM

Ends: Mon, 06 Feb 2006 at 10:00 PM



651 W. Washington Blvd

Chicago, IL 60661


Link: meetup page

I'm looking forward to attending this talk by Marcel Molina of 37Signals on testing in Rails. One of the things that has impressed me the most about Rails so far is how test stubs are automatically written out for you when you generate classes. I also really like how fixtures can load test data for each test...and the custom assertions rock. Anyhow, hopefully I will find the time to attend this the week before I head out to Oregon.

ical and outlook

I got to talking to Brian Suda about why his hCalendar extracting application x2v works like a dream with iCal but doesn’t seem to work with Microsoft Outlook 2002, which rejects the generated file with:

vCalendar/iCalendar Import failed. The input file may be corrupt.

Here’s the event that Outlook doesn’t like, but iCal does:

PRODID:-//suda.co.uk//X2V 0.6.7 (BETA)//EN
X-ORIGINAL-URL: http://www.code4lib.org/
SUMMARY;LANGUAGE=en:The Portland Jazz Festival

After quite a bit of experimentation we determined that Outlook demands that the METHOD, UID and DTSTAMP fields be defined, so the event needs to look something like this (the METHOD, UID and DTSTAMP values below are just placeholders):

METHOD:PUBLISH
PRODID:-//suda.co.uk//X2V 0.6.7 (BETA)//EN
X-ORIGINAL-URL: http://www.code4lib.org/
UID:unique-id@example.org
DTSTAMP:20060101T000000Z
SUMMARY;LANGUAGE=en:The Portland Jazz Festival

Just thought I’d mention it here in case someone ends up googling for that error. Brian said he’s going to look into providing this support for those of us who have Microsoft Outlook inflicted on us.
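If you want to sanity-check an event before feeding it to Outlook, the test boils down to looking for those three property names. Here’s a little sketch in Python (just an illustration, not part of x2v):

```python
# Sketch: check an iCalendar event for the properties Outlook 2002
# insists on (METHOD, UID and DTSTAMP). Not part of x2v -- just an
# illustration of the fix described above.

REQUIRED = ("METHOD", "UID", "DTSTAMP")

def outlook_missing_fields(ics_text):
    """Return the required property names missing from an iCalendar snippet."""
    names = set()
    for line in ics_text.splitlines():
        if ":" in line:
            # the property name is everything before the first ':',
            # minus any parameters after a ';' (e.g. SUMMARY;LANGUAGE=en)
            names.add(line.split(":", 1)[0].split(";", 1)[0].strip())
    return [field for field in REQUIRED if field not in names]

event = """PRODID:-//suda.co.uk//X2V 0.6.7 (BETA)//EN
X-ORIGINAL-URL: http://www.code4lib.org/
SUMMARY;LANGUAGE=en:The Portland Jazz Festival"""

print(outlook_missing_fields(event))  # → ['METHOD', 'UID', 'DTSTAMP']
```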

openurl as microformat

The Search

Author: John Battelle

Year: 2005

Publisher: Portfolio Hardcover

ISBN: 1591840880

Ok, so The Search is a great book so far...but I'm really just testing some local modifications I made to the structured blogging tool to use Book OpenURL KEV parameter names as a microformat. Take a look in the HTML and you should see them hiding there. Here's a somewhat prettified version as an image since I couldn't get my syntax highlighter plugin to do a nice enough job with the HTML. Pretty simple stuff right? Notice the COinS in there too? That's thanks to Dan's hacking at structured blogging. Actually getting openurl KEV support into structured blogging is another idea of Dan's. Go Chudnov. Update 01/19/2006 09:39 CST: Dan got similar support for journal articles. If this stuff caught on it could really revolutionize academic blogging...and more.
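For the curious, here’s a sketch of how the book metadata above could be serialized into a COinS span using Book OpenURL KEV parameter names. This is just an illustration in Python; the actual markup that the structured blogging patch emits may differ:

```python
# Sketch: build a COinS span for the book above using OpenURL KEV
# (key/encoded-value) parameter names from the book metadata format.
# The exact keys emitted by the structured blogging patch may differ.
from urllib.parse import urlencode

book = {
    "ctx_ver": "Z39.88-2004",                    # OpenURL ContextObject version
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:book",  # metadata format: book
    "rft.btitle": "The Search",
    "rft.au": "Battelle, John",
    "rft.date": "2005",
    "rft.pub": "Portfolio Hardcover",
    "rft.isbn": "1591840880",
}

# COinS is just an empty span whose class is Z3988 and whose
# title attribute holds the URL-encoded KEV string
coins = '<span class="Z3988" title="%s"></span>' % urlencode(book)
print(coins)
```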


Thanks to a ping from Dan I just finished listening to an interview with Derek Sivers of CDBaby over on Venture Voice. Sivers talks about how being a musician and working briefly in the music industry informed his decision to build CDBaby.

The interview spans a ton of subjects: what’s wrong with the music industry and what to do about it; how being in a circus informs your business acumen; why rejecting venture capital can help you focus on what really matters; and how a lot of heart and hubris can get you really far. Really refreshing (and funny) stuff.

CDBaby is now the single largest digital catalog in the world, and provides a gateway for independent musicians into distribution chains like iTunes. They started with three people sharing a 56k modem in Woodstock, NY. What a fun story.

code4lib conference shaping up

The votes are in, and a tentative schedule is up. There were a remarkable number of wonderful presentation ideas submitted, and unfortunately there wasn’t the time/space for all of them. Fortunately there will be lightning talks and breakout sessions that will hopefully pick up some of the slack.

The talks were voted on by anyone who planned on attending. That’s right, anyone. This was like a breath of fresh air for me. The voting mechanism was a genius JavaScript hack at the 11th hour by Ross Singer, which allowed Drupal users on code4lib.org to annotate a Backpack page and stored the results in a database at gatech.edu. We even hooked up our resident bot in #code4lib to be able to talk to the database and report up-to-the-minute polling results.

Anyhow, things are looking really good for the conference. If you were waiting for the presentations to firm up before registering, take a look at the schedule. And if you need any more convincing, check out Lorcan Dempsey’s blog, which says it all.

good fences and the frankenweb

Ian Bicking has some interesting notes about competing web development technologies–mainly in response to some posts from Ivan Krstić. The discussion is definitely recommended, especially if you find yourself looking at web application frameworks for Python and Ruby. I found the pivot point of the discussion to be around a new term (for me) – the “frankenweb”.

My understanding is that, like Frankenstein’s monster (a being created by stitching together random body parts from dead humans), the frankenweb is an unholy mixture of MVC components pulled from different projects that, when put together, results in an ugly, partially functional whole. I think this characterization of Ian’s work is really unpleasant, but strangely compelling–mainly because of Ian’s response:

The “Frankenweb” is a feature, and it describes the web we have, the software we have, and the future that is inevitable. The world was never all J2EE, or ASP(.NET), or PHP, and it won’t be all Rails either.

I think Ian is right on about this: “frankenweb” does describe the web we have, and hopefully the web we will continue to have–and the degree to which we can all interoperate is the degree to which the web will succeed. Perhaps I’m seeing the frankenweb through Weinberger-colored glasses, having just finished Small Pieces Loosely Joined (which I thoroughly enjoyed and plan to write about later if there is time). Weinberger does an excellent job of distilling the essence of the web, and of how its architecture enabled it to pull itself up by its own bootstraps, grow and adapt:

In the real world, I can’t just put in a door from my apartment to my neighbor’s so that anyone can go through. But that’s exactly how the web was built. Tim Berners-Lee originally created the web so that scientists could link to the work of other scientists without having to ask their permission. If I put a page into the public Web, you can link to it without having to ask to do anything special, without asking me if it’s alright with me, and without even letting me know that you’ve done it…The web couldn’t have been built if everyone had to ask permission first.

Of course I’m conflating links between pages, and API links between software components…but what Ian says about embracing the frankenweb seems to resonate with this somehow.

It’s also quite disorienting to hear Ivan and others lauding tight coupling:

You don’t see the Ruby on Rails guys modularizing Rails to the point of pain. You see them delivering a single, high-polish, tightly coupled product that does its job well.

Given the various pluggable modules that make up Rails I think “tightly coupled” is largely an overstatement. Granted, they live in the same code base, and I haven’t tried to use one of them in isolation–but I imagine it could be done if someone wanted to, say, use an ActiveRecord model in a standalone script. The Pragmatic Programmer has a really nice chapter on decoupling, and the authors are actually heavily involved in the Ruby/Rails community. The chapter starts out with a nice quote from Robert Frost’s poem “Mending Wall”:

Good fences make good neighbors.

It seems to me that Ian is doing the hard work of patching some of these fences and building a few new ones, and he deserves a lot of credit for the effort and the cat herding.

Fear Itself

I’m glad I’m not the only one who was immediately reminded of this when the NSA spying story broke.

If you are interested in the perspective of a computer security specialist, definitely take a look at what Bruce Schneier has been writing. Schneier’s theory on why Bush needed to bypass the Foreign Intelligence Surveillance Court is pretty harrowing.

The NSA’s ability to eavesdrop on communications is exemplified by a technological capability called Echelon. Echelon is the world’s largest information “vacuum cleaner,” sucking up a staggering amount of voice, fax, and data communications – satellite, microwave, fiber-optic, cellular and everything else – from all over the world: an estimated 3 billion communications per day. These communications are then processed through sophisticated data-mining technologies, which look for simple phrases like “assassinate the president” as well as more complicated communications patterns.

Supposedly Echelon only covers communications outside of the United States. Although there is no evidence that the Bush administration has employed Echelon to monitor communications to and from the U.S., this surveillance capability is probably exactly what the president wanted and may explain why the administration sought to bypass the FISA process of acquiring a warrant for searches.

Honestly, this kind of behavior from the Bush Administration isn’t at all surprising given their “go it alone” attitude. However I’m really disappointed that the ranking members of the House and Senate Intelligence Committees didn’t make noise–any noise. I imagine they are bound by some oath or whatnot…but what good are checks and balances if they don’t work properly?

Indeed, a recent article from the NYTimes indicates that Schneier’s theory may in fact be, umm, fact:

The National Security Agency has traced and analyzed large volumes of telephone and Internet communications flowing into and out of the United States as part of the eavesdropping program that President Bush approved after the Sept. 11, 2001, attacks to hunt for evidence of terrorist activity, according to current and former government officials.

The volume of information harvested from telecommunication data and voice networks, without court-approved warrants, is much larger than the White House has acknowledged, the officials said. It was collected by tapping directly into some of the American telecommunication system’s main arteries, they said.

As part of the program approved by President Bush for domestic surveillance without warrants, the N.S.A. has gained the cooperation of American telecommunications companies to obtain backdoor access to streams of domestic and international communications, the officials said.

I’m really worried that we’re not teetering on a slippery slope but are actually in free fall. It appears that telecommunications companies are helping feed data mining operations at the NSA in real time. Perhaps they have a googlish front end where ‘professionals’ can type in ‘keywords’ and hit “I’m feeling lucky” and get a list of phone conversations or emails.

The Bush Administration’s prolific use of “fear” as a policy wedge is extremely dangerous. As Roosevelt famously said in a time of national crisis:

So, first of all, let me assert my firm belief that the only thing we have to fear is fear itself: nameless, unreasoning, unjustified terror which paralyzes needed efforts to convert retreat into advance.

On a somewhat lighter note, Schneier linked to a little trick devised by Richard M. Smith which allows you to detect if the NSA is monitoring your email communications. As my friend Ed Silva pointed out in IM:

I wouldn’t try it if you are planning on flying.

Uh, yeah, I was planning on going to code4lib 2006 in a few months…maybe I’ll wait.

opensearch and autodiscovery

I just noticed that a9 has released a second draft of opensearch v1.1. This draft includes details on opensearch autodiscovery for providing a reference to the opensearch description file in an HTML page. This could have a lot of potential for browser plugins. Also, they’ve added a Query element that can be used for echoing back the query that was used to generate results…kinda like the echoedRequest in SRU. These are the things that popped out at me. Of course the big news in the first draft was that Atom can now be used in responses.
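For reference, the autodiscovery mechanism boils down to a link element in the page’s head pointing at the OpenSearch description document, along these lines (the href and title here are made up):

```html
<!-- OpenSearch 1.1 autodiscovery: a link in the HTML head pointing at
     the description document (the href and title are hypothetical) -->
<link rel="search"
      type="application/opensearchdescription+xml"
      href="http://example.com/opensearch.xml"
      title="Example Search" />
```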

At any rate it was nice to see that they link to my opensearch python library from their tools page. Once 1.1 moves from draft I’m going to work on upgrading it from 1.0 right away.


Thanks to Jessamyn I found Librarians 2.0 don’t need to be coders 2.0 where Richard Ackerman has an interesting take on just how important programming skills are for a library technologist. Richard cites a paper from IBM on Service Oriented Architectures to make a compelling point that there are many roles to play when building technology solutions, particularly the web services that comprise such a big part of “library 2.0” efforts…and that coding skills really aren’t that important when you can just get a student, consultant, or vendor (heh) to do it.

It’s unfortunate, I think, that the code4lib 2006 conference name seems to emphasize “coding” so much over the ideas. I totally agree that the most important aspect of our work as library technologists is the service ideas, and that the code is simply a machine-readable description of those ideas. Some high-level languages are actually really, really nice for expressing ideas, and I would argue that oftentimes learning a good computer language can help you express your technology ideas better. As Martin Fowler says:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

Let me go on record, as someone who has helped organize the code4lib conference, to say non-coders are more than welcome…there will be plenty of people who can program there…we want ideas, mindshare and collaboration. Please don’t let the computer programming jargon dissuade you from participating.

Also, a few things stood out to my eye:

If your goals and architecture are clear enough, the coders don’t need to be library experts in order to deliver the functions you need.

The coders don’t need to be experts, but wouldn’t it be nice if they were, and you didn’t have to go into great detail about certain things? Wouldn’t it also be nice if the coders didn’t start from scratch, and were aware of good reusable components from the library software community which could be leveraged to make the software construction phase that much faster? Indeed, architectural decisions often have a direct effect on the programming decisions that are made, and it helps if those who are architecting things have at least a general understanding of how software is built so that designs stay doable and sane…and so that they’ll know when things are drifting off course.

Also, don’t try to build big complex systems. Live in the beta world. Get some chunk of functionality out quickly so that people can play with it. The hardest part is having the initial idea, and the good news is I see lots of great ideas out in the library blogosphere. I can understand the frustrations in the gap between the idea and running code, but I hope I’ve presented a bunch of areas above in which you can work to turn the idea into the next hot beta, without necessarily needing to code it yourself.

The one danger in moving to a formal process like the one described in the IBM article is that it may encourage you to build big complex systems on a slow time scale. If you need to thoroughly describe a software solution before beginning to program (the so-called waterfall model) you will spend a lot of time trying to get the design right before even beginning to code to see what actually works. I’ve found that the more the design and the coding can be intermingled the better, since it lets them inform each other as they go. This intermingling is easy if you are a small shop with a handful of people (1-8) who need to communicate on a regular basis. I imagine most software development groups in libraries are around this size. That being said, I think Richard is right: it’s good to be aware of the different roles that are being played, perhaps all by one individual.

After seeing Adrian talk about software development and journalism at Snakes and Rubies I’ve been thinking off and on about the space between libraries and software development. I’m particularly interested in how one informs the other…and found Richard’s post to be a good catalyst.