Wednesday, January 7, 2009
Q: What do 100-year-old knitting patterns and a lost Robert Louis Stevenson story have in common?
A: A digitally preserved newspaper page.
Q: What about if you add:
URIs for knitting materials
William Blake’s Engravings
The similarities/differences between XMPP, HTTP and NNTP
Web crawling as data integration
Project coordination with rooms on FriendFeed
Brewing kombucha
A: Just a typical lunchtime conversation at [...]
Wednesday, November 26, 2008
Some folks at LC and CDL are trying to kick-start a new public discussion list for talking about digital curation in its many guises: repositories, tools, standards, techniques, practices, etc. The intuition is that there is a social component to the problems of digital preservation and repository interoperability.
Of course NDIIPP (the arena for the [...]
One little bit of goodness that has percolated out from my group at $work in collaboration with the California Digital Library is the BagIt spec (more readable version). BagIt is an IETF RFC for bundling up files for transfer over the network, or for shipping on physical media. Just yesterday a little article about BagIt [...]
Thanks to a tip from Ian, I’m looking forward to (hopefully) attending the Linked Data Planet conference in New York City as a volunteer. The idea is that I just have to pay for my hotel, and the cost of admission is waived. It seems my travel money is a bit limited at the moment [...]
Recently there was a bit of interesting news around MARBI Discussion Paper 2008-DP04 regarding semweb technologies at LC.
Related to this work are RDF/OWL representations and models for MODS and MARC, which we are also developing. Several representations of MODS in RDF/OWL, such as the one from the SIMILE project, have been made [...]
access accessible addition al american analysis application applications appropriate archives areas association authority available based benefit benefits bibliographic broad broader catalog catalogers cataloging catalogs cataloguing chain change changes classification code collaboration collections committee communities community congress consequences consider considered content continue control controlled cooperative cost costs create created creating creation current data databases dc description [...]
Thursday, January 10, 2008
word             count
library          263
bibliographic    236
data             170
libraries        144
lc               127
control          109
information      98
cataloging       91
records          88
subject          82
materials        81
standards        81
use              80
congress         79
work             76
record           73
community        67
users            61
working          59
group            58
access           57
recommendations  56
resources        53
authority        52
metadata         47
future           46
new              40
environment      37
development      37
web              36
collections      35
systems          35
available        35
creation         35
services         34
headings         32
national         31
findings         30
research         30
unique           29
sharing          29
oclc             28
model            28
catalog          28
international    27
develop          27
value            27
lcsh             26
pcc              26
user             26
need             26
report           25
make             25
practices        25
rda              25
used             25
time             24
needs            24
rare             24
including        24
provide          23
discovery        23
communities      23
special          23
frbr             23
current          22
resource         22
rules            22
digital          21
cooperative      21
program          21
participants     21
management       21
service          20
dc               20
programs         20
online           20
costs            20
washington       20
standard         19
support          19
knowledge        19
different        19
appropriate      19
effort           18
applications     18
marc             18
shared           18
exchange         18
process          18
changes          17
lcs              17
increase         16
public           16
search           16
creating         16
broader          16
catalogs         16
controlled       16
I converted the PDF to a text file called ‘lc’ with xpdf and then wrote a little Python:
#!/usr/bin/env python
# Python 2 script: count word frequencies in the text file 'lc'.
from urllib import urlopen
from re import sub

# Fetch a whitespace-separated stop word list.
stop_words = urlopen('http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words').read().split()
text = file('lc').read()
counts = {}
for word in text.split():
    word = word.lower()
    word = sub(r'\W', '', word)   # strip non-word characters
    word = sub(r'\d+', '', word)  # strip digits
[...]
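The script above is cut off before the counting step, so here is a rough sketch of how the whole thing might look in Python 3. This is an assumption about the missing part, not the original code: the `count_words` helper, the inline stop word set and the sample sentence are all made up for illustration, and the stop words are passed in rather than fetched from the URL.

```python
import re
from collections import Counter


def count_words(text, stop_words=frozenset()):
    """Count lowercased words in text, skipping stop words.

    Hypothetical helper: mirrors the cleanup steps in the script above
    (lowercase, strip non-word characters, strip digits).
    """
    counts = Counter()
    for word in text.split():
        word = word.lower()
        word = re.sub(r"\W", "", word)   # strip punctuation and other non-word characters
        word = re.sub(r"\d+", "", word)  # strip digits
        # Skip tokens that became empty after cleanup, and stop words.
        if word and word not in stop_words:
            counts[word] += 1
    return counts


# Made-up sample input; the real script read the converted report from 'lc'.
sample = "The Library of Congress: library data, bibliographic data 2008."
stops = {"the", "of"}
counts = count_words(sample, stops)

# Print the words from most to least frequent, like the table above.
for word, n in counts.most_common():
    print(word, n)
```

To run it against the real text, something like `count_words(open('lc').read(), stop_words)` would do, with `stop_words` loaded from whatever list you prefer.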
Monday, December 31, 2007
I opened the paper this morning to read a story of another person involved in the creation of MARC who has just died. I hadn’t realized before reading Henrietta Avram and Samuel Snyder’s obituaries that there was a bit of an NSA-LC connection when MARC was being created.
From 1964 to 1966, [Samuel Snyder] [...]