federal register embraces the web and opensource

Tom Lee of the Sunlight Foundation blogged yesterday about the new Federal Register website. The facelift was also announced a few days earlier by the Archivist of the United States, David Ferriero. If you aren’t familiar with it already, the Federal Register is basically the daily newspaper of the United States Federal Government, which details all the rules and regulations of the federal agencies. It is compiled by the Office of the Federal Register located in the National Archives, and printed by the Government Printing Office. As the video describing the new site points out, the Federal Register began publication in 1936 in the depths of the Great Depression as a way to communicate in one place all that the agencies were doing to try to jump-start the economy. So it seems like a fitting time to be rethinking the role of the Federal Register.

I’m no usability expert, but just a few minutes browsing the new site and comparing it to the old one make it clear what a leap forward this is. Hopefully the legal status of the new site will be ironed out shortly.

Most of all it’s great to see that the Federal Register is now a single web application. The service it provides to the American public is important enough to deserve its own dedicated web presence. As the developers point out in their video describing the effort, they wanted to make the Federal Register a “first class citizen of the web”…and I think they are certainly helping do that. This might seem obvious, but often there is a temptation to jam publications from the print world (like the Federal Register) into dumbed-down monolithic repositories that treat all “objects” the same. Proponents of this approach tend to characterize one-off websites like Federal Register 2.0 as “yet another silo”. But I think it’s important to remember that the web was really created to break down the silo walls, and that every well-designed web site is actually the antithesis of a silo. In fact, monolithic repository systems that treat all publications as static documents to be uniformly managed are more like silos than these one-off dedicated web applications.

As a software developer working in the federal government there were a few things about the Federal Register 2.0 that I found really exciting:

Fruitful collaboration between federal employees and citizen activist/geeks initiated by a software development contest.
Extensive use of opensource technologies like Ruby, Ruby on Rails, MySQL, Sphinx, nginx, Varnish, Passenger, Apache2, Ubuntu Linux, Chef. Opensource technologies encourage collaboration by allowing citizen activists/technologists to participate without having to drop a princely sum.
Release of the source code for the website itself, using decentralized revision control (git) so that people can easily contribute changes, and see how the site was put together.
Extensive use of syndicated feeds to communicate how content is being added to the site, iCal feeds to keep on top of events going on in your area, and detailed XML for each entry.
The robots.txt file for the site makes the content fully crawlable by web indexers, except for search-related portions of the website. Excluding dynamic search results is often important for performance reasons, but much of the article content can still be discovered via links (see below about permalinks). They have also made a sitemap available so crawlers can efficiently discover URLs for the content.
Deployment of the web application to the cloud using Amazon’s EC2 and S3 services. Cloud computing allows computing resources to scale to meet demand. In effect this means that government IT shops don’t have to make big up-front investments in infrastructure to make new services available. I guess the jury is still out, but I think this will eventually prove to greatly lower the barrier to innovation in the egov sector. It also lets the more progressive developers in government leapfrog ancient technologies and bureaucracies to get things done in a timely manner.
And last, but certainly not least … now every entry in the Federal Register has a URL! Permalinks for the Federal Register are incredibly important for citability reasons. I predict that we’ll quickly see more and more people referencing specific parts of the Federal Register on social media sites like Facebook and Twitter, out on the open web in blogs, and in collaborative applications like Wikipedia.
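
The crawl policy described in the list above (search results excluded, everything else open, a sitemap advertised) is easy to check programmatically. Here is a minimal sketch using Python's standard urllib.robotparser. The rules, paths and the example.org host are all made up for illustration; they are not copied from the real site's robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules modeled on the policy described above.
# They are NOT the Federal Register's actual robots.txt.
ASSUMED_ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Sitemap: https://example.org/sitemap.xml
"""

BASE = "https://example.org"  # placeholder host, not the real site


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_crawlable(parser: RobotFileParser, path: str, agent: str = "*") -> bool:
    """Return True if the given user agent may fetch BASE + path."""
    return parser.can_fetch(agent, BASE + path)


parser = build_parser(ASSUMED_ROBOTS_TXT)

# Dynamic search results are excluded, article permalinks are not.
print(is_crawlable(parser, "/search?q=water"))
print(is_crawlable(parser, "/articles/example-entry"))

# Sitemap URLs declared in robots.txt (site_maps() needs Python 3.8+).
print(parser.site_maps())
```

Against a live site you would call set_url() and read() on the parser instead of parse(), so that the rules are fetched over HTTP.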

I would like to see more bulk access to XML data made available, for re-purposing on other websites, although I guess it might be possible to walk from the syndicated feeds to the detailed XML. Also, the search functionality is so rich it would be useful to have an OpenSearch description that documents it, and perhaps provides some hooks for getting back JSON and/or XML representations. Perhaps they could even follow the lead of the London Gazette and make some of the structured metadata available in the HTML using RDFa. It also looks like content is only available from 2008 on, so it might be interesting to see how easy it would be to make more of the historic content available.
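
To make the "walk from the feeds to the XML" idea concrete, here is a rough sketch using only Python's standard library. It assumes the feed is Atom and that each entry carries a link of type application/xml pointing at its detailed XML. I haven't verified that the real feeds look like this, so the sample feed, its URLs and the xml_links helper are all hypothetical.

```python
import xml.etree.ElementTree as ET

ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up Atom feed for illustration; the real feed layout may differ.
SAMPLE_FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example entries</title>
  <entry>
    <title>First example entry</title>
    <link rel="alternate" type="text/html" href="https://example.org/entries/1"/>
    <link rel="alternate" type="application/xml" href="https://example.org/entries/1.xml"/>
  </entry>
  <entry>
    <title>Second example entry</title>
    <link rel="alternate" type="text/html" href="https://example.org/entries/2"/>
    <link rel="alternate" type="application/xml" href="https://example.org/entries/2.xml"/>
  </entry>
</feed>
"""


def xml_links(feed_text: str) -> list[tuple[str, str]]:
    """Return (title, href) pairs for every entry link typed application/xml."""
    root = ET.fromstring(feed_text)
    found = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        for link in entry.findall("atom:link", ATOM_NS):
            if link.get("type") == "application/xml":
                found.append((title, link.get("href")))
    return found


for title, href in xml_links(SAMPLE_FEED):
    print(title, href)
```

A real harvester would fetch each href in turn and follow any paging links in the feed, which is exactly the kind of crawling a bulk download would make unnecessary.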

But the great thing about what these folks have done is that now I can fork the project on GitHub, see how easy it is to add the changes, and let the developers know about my updates to see if they are worth merging back into the production website. This is an incredible leap forward for egov efforts, so hats off to everyone who helped make this happen.