bagit and .deb
I’m just now (OK I’m slow) marveling at how similar BagIt turned out to be to the Debian Package Format. Given some of the folks involved, this synchronicity isn’t too surprising.
Both .deb and BagIt use a ‘data’ directory for bundling the files in the package (well, .deb has it as a compressed file, data.tar.gz). Both have md5sum-style checksum files for stating the fixity values of said files. Both have simple RFC 2822-style text files for expressing metadata. Both have files that contain the version number of the packaging format. One nice thing that .deb has, and which BagIt intentionally eschewed, is a serialization format. But no matter.
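To make the checksum-file parallel a bit more concrete, here is a rough Python sketch of what checking an md5sum-style file involves. It is not the LC library or anything dpkg does internally; the function names (parse_manifest, md5_of, verify) and the root argument are made up for illustration, and it assumes each line is a hex digest, whitespace, then a path relative to some root directory.

```python
import hashlib
from pathlib import Path


def parse_manifest(text):
    """Parse md5sum-style lines ("<hex digest>  <relative path>") into a dict."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        digest, path = line.split(None, 1)
        # md5sum marks binary-mode entries with a leading "*"; drop it if present.
        if path.startswith("*"):
            path = path[1:]
        entries[path] = digest.lower()
    return entries


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large payloads need not fit in memory."""
    md5 = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def verify(manifest_text, root):
    """Return the relative paths that are missing or whose digest differs."""
    failures = []
    for rel_path, expected in parse_manifest(manifest_text).items():
        target = Path(root) / rel_path
        if not target.is_file() or md5_of(target) != expected:
            failures.append(rel_path)
    return failures
```

An empty list back from verify means every listed file was found and matched. The same loop should work for either format as long as root points at the directory the listed paths are relative to.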
At LC we (a.k.a. coding machine Justin Littman) are working on a software library for creating and validating bags, as well as a shiny GUI that’ll sit on top of it to assist in bag creation for people who like shiny things.
It’s an interesting counterpoint to this process of creating BagIt tools to look at how a .deb can be downloaded and inspected. Here’s a sampling of a shell session where I downloaded and extracted the parts of the .deb for python-rdflib.
ed@curry:~/tmp$ aptitude download python-rdflib
Reading package lists... Done
Building dependency tree
Reading state information... Done
Reading extended state information
Initializing package states... Done
Building tag database... Done
Get:1 http://us.archive.ubuntu.com hardy/universe python-rdflib 2.4.0-4 [276kB]
Fetched 276kB in 0s (346kB/s)

ed@curry:~/tmp$ ar -xv python-rdflib_2.4.0-4_i386.deb
x - debian-binary
x - control.tar.gz
x - data.tar.gz

ed@curry:~/tmp$ tar xvfz control.tar.gz
./
./postinst
./prerm
./md5sums
./control

ed@curry:~/tmp$ cat control
Package: python-rdflib
Source: rdflib
Version: 2.4.0-4
Architecture: i386
Maintainer: Ubuntu MOTU Developers
Original-Maintainer: Nacho Barrientos Arias
Installed-Size: 1608
Depends: libc6 (>= 2.5-5), python-support (>= 0.3.4), python (<< 2.6), python (>= 2.4), python-setuptools
Provides: python2.4-rdflib, python2.5-rdflib
Section: python
Priority: optional
Description: RDF library containing an RDF triple store and RDF/XML parser/serializer
 RDFLib is a Python library for working with RDF, a simple yet powerful
 language for representing information. The library contains an RDF/XML
 parser/serializer that conforms to the RDF/XML Syntax Specification and
 both in-memory and persistent Graph backend.
 .
 This package also provides a serialization format converter called
 rdfpipe in order to deal with the different formats RDFLib works with.
 .
 Homepage: http://rdflib.net/

ed@curry:~/tmp$ cat md5sums
75af966e839159902537614e5815c415  usr/lib/python-support/python-rdflib/python2.5/rdflib/sparql/bison/SPARQLParserc.so
a33eb3985c6de5589cb723d03d2caeb1  usr/lib/python-support/python-rdflib/python2.4/rdflib/sparql/bison/SPARQLParserc.so
d1b5578dd1d64432684d86bbb816fafc  usr/bin/rdfpipe
0191b561e3efe1ceea7992e2c865949b  usr/share/doc/python-rdflib/changelog.gz
98a861211f3effe1e69d6148c1e31ab2  usr/share/doc/python-rdflib/copyright
d75c2ab05f3a4239963d8765c0e9e7c5  usr/share/doc/python-rdflib/examples/example.py
17b61c23d0600e6ce17471dc7216d3fa  usr/share/doc/python-rdflib/examples/swap_primer.py
3894fa16d075cf0eee1c36e6bcc043d8  usr/share/doc/python-rdflib/changelog.Debian.gz
15653f75f35120b16b1d8115e6b5a179  usr/share/man/man1/rdfpipe.1.gz
405cb531a83fd90356ef5c7113ecd774  usr/share/python-support/python-rdflib/rdflib/sparql/bison/CompositionalEvaluation.py
41e28217ddd2eb394017cd8f12b1dfd5  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Util.py
ec9ae5147463ed551d70947c2824bc82  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Resource.py
6e018a69ca242acb613effe420c2cdc7  usr/share/python-support/python-rdflib/rdflib/sparql/bison/SolutionModifier.py
7e72a08f29abc91faddb85e91f17e87c  usr/share/python-support/python-rdflib/rdflib/sparql/bison/FunctionLibrary.py
648384e5980ef39278466be38572523a  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Expression.py
494386730a6edf5c6caf7972ed0bf4ba  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Bindings.py
4513b2fdc116dc9ff02895222a81421d  usr/share/python-support/python-rdflib/rdflib/sparql/bison/IRIRef.py
a800bdac023ae0c02767ab623dffe67b  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Triples.py
6c31647f2b3be724bdfcc35f631162b1  usr/share/python-support/python-rdflib/rdflib/sparql/bison/SPARQLEvaluate.py
c158b3fb8fd66858f598180084f481c4  usr/share/python-support/python-rdflib/rdflib/sparql/bison/GraphPattern.py
bff095caa2db064cc2b1827c4b90a9e7  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Processor.py
2db0c4925d17b49f5bb355d7860150c2  usr/share/python-support/python-rdflib/rdflib/sparql/bison/QName.py
10e02ecf896d07c0546b791a450da633  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Query.py
eee29bb22b05b16da2a5e6552044bf22  usr/share/python-support/python-rdflib/rdflib/sparql/bison/__init__.py
a29a508631228f6674e11bb077c24afc  usr/share/python-support/python-rdflib/rdflib/sparql/bison/PreProcessor.py
479a4702ebee35f464055a554ebf5324  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Filter.py
d2fe75aa4394ec7d9106a1e02bb3015a  usr/share/python-support/python-rdflib/rdflib/sparql/bison/Operators.py
da186350e65c8e062887724b1758ef80  usr/share/python-support/python-rdflib/rdflib/sparql/Query.py
0130de0f5d28087d7c841e36d89714c4  usr/share/python-support/python-rdflib/rdflib/sparql/graphPattern.py
826ffe4c6b3f59a9635524f0746299fe  usr/share/python-support/python-rdflib/rdflib/sparql/sparqlOperators.py
...

ed@curry:~/tmp$ tar xvfz data.tar.gz
./
./usr/
./usr/lib/
./usr/lib/python-support/
./usr/lib/python-support/python-rdflib/
./usr/lib/python-support/python-rdflib/python2.5/
./usr/lib/python-support/python-rdflib/python2.5/rdflib/
./usr/lib/python-support/python-rdflib/python2.5/rdflib/sparql/
./usr/lib/python-support/python-rdflib/python2.5/rdflib/sparql/bison/
./usr/lib/python-support/python-rdflib/python2.5/rdflib/sparql/bison/SPARQLParserc.so
./usr/lib/python-support/python-rdflib/python2.4/
./usr/lib/python-support/python-rdflib/python2.4/rdflib/
./usr/lib/python-support/python-rdflib/python2.4/rdflib/sparql/
./usr/lib/python-support/python-rdflib/python2.4/rdflib/sparql/bison/
./usr/lib/python-support/python-rdflib/python2.4/rdflib/sparql/bison/SPARQLParserc.so
./usr/bin/
./usr/bin/rdfpipe
./usr/share/
./usr/share/doc/
./usr/share/doc/python-rdflib/
./usr/share/doc/python-rdflib/changelog.gz
./usr/share/doc/python-rdflib/copyright
./usr/share/doc/python-rdflib/examples/
./usr/share/doc/python-rdflib/examples/example.py
./usr/share/doc/python-rdflib/examples/swap_primer.py
./usr/share/doc/python-rdflib/changelog.Debian.gz
./usr/share/man/
./usr/share/man/man1/
./usr/share/man/man1/rdfpipe.1.gz
./usr/share/python-support/
./usr/share/python-support/python-rdflib/
./usr/share/python-support/python-rdflib/rdflib/
./usr/share/python-support/python-rdflib/rdflib/sparql/
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/CompositionalEvaluation.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Util.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Resource.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/SolutionModifier.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/FunctionLibrary.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Expression.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Bindings.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/IRIRef.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Triples.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/SPARQLEvaluate.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/GraphPattern.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Processor.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/QName.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Query.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/__init__.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/PreProcessor.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Filter.py
./usr/share/python-support/python-rdflib/rdflib/sparql/bison/Operators.py
./usr/share/python-support/python-rdflib/rdflib/sparql/Query.py
./usr/share/python-support/python-rdflib/rdflib/sparql/graphPattern.py
./usr/share/python-support/python-rdflib/rdflib/sparql/sparqlOperators.py
...
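The same inspection can be approximated without shelling out to ar and tar. Below is a minimal Python sketch, assuming the .deb is a common-format ar archive (an 8-byte magic string followed by 60-byte member headers) and that the control member is gzip-compressed, as in the session above; newer packages may use other compression. The helper names (iter_ar_members, read_control, parse_control) are hypothetical, and treating the control file as RFC 822-style headers with the standard email module is only an approximation.

```python
import email
import io
import posixpath
import tarfile

AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # name 16, mtime 12, uid 6, gid 6, mode 8, size 10, end marker 2


def iter_ar_members(stream):
    """Yield (name, data) for each member of a common-format ar archive."""
    if stream.read(len(AR_MAGIC)) != AR_MAGIC:
        raise ValueError("not an ar archive")
    while True:
        header = stream.read(HEADER_SIZE)
        if not header:
            return
        if len(header) < HEADER_SIZE or header[58:60] != b"`\n":
            raise ValueError("corrupt ar member header")
        # Names are space padded; the GNU variant also appends a "/".
        name = header[0:16].decode("ascii").rstrip(" ").rstrip("/")
        size = int(header[48:58].decode("ascii"))
        data = stream.read(size)
        if size % 2:
            stream.read(1)  # member data is padded to an even offset
        yield name, data


def read_control(deb_stream):
    """Return the text of the control file inside control.tar.gz."""
    members = dict(iter_ar_members(deb_stream))
    archive = io.BytesIO(members["control.tar.gz"])
    with tarfile.open(fileobj=archive, mode="r:gz") as tar:
        for info in tar:
            # The member is listed as "./control" in the session above.
            if posixpath.normpath(info.name) == "control":
                return tar.extractfile(info).read().decode("utf-8")
    raise KeyError("control file not found")


def parse_control(text):
    """Approximate the control file as RFC 822-style headers."""
    return dict(email.message_from_string(text).items())
```

With the package from the session, something like parse_control(read_control(open("python-rdflib_2.4.0-4_i386.deb", "rb"))) should give back a dict with the Package, Version and Depends fields shown by cat control.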
Here are some more useful notes on the structure of .deb files and how to create them. If you are interested in trying out the nascent-alpha BagIt tools give me a holler (ehs at pobox dot com) or just add a comment here…