Terry’s analysis of the proposed changes to OCLC’s record policy is essential reading. I’m really concerned that these 996 fields will slip somewhat unnoticed into data that I use.
996 $aOCLCWCRUP $iUse and transfer of this record is governed by the OCLC® Policy for Use and Transfer of WorldCat® Records. $uhttp://purl.org/oclc/wcrup
This appears to be an engineered, [...]
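One way to keep such fields from slipping by unnoticed is to check for them when loading records. The following is only a minimal sketch: it assumes records are available as text lines in the display form shown above (tag, then `$`-prefixed subfields), that subfield values contain no literal `$`, and the helper names `parse_marc_line` and `is_oclc_policy_field` are hypothetical.

```python
import re


def parse_marc_line(line):
    """Split a text-rendered field such as '996 $aFOO $uBAR' into (tag, subfields)."""
    tag, _, rest = line.strip().partition(" ")
    # Each subfield is '$', a one-character code, then its value up to the next '$'.
    subfields = [(code, value.strip())
                 for code, value in re.findall(r"\$(\w)([^$]*)", rest)]
    return tag, subfields


def is_oclc_policy_field(line):
    """Return True for a 996 field whose $a is the OCLCWCRUP marker."""
    tag, subfields = parse_marc_line(line)
    return tag == "996" and ("a", "OCLCWCRUP") in subfields


if __name__ == "__main__":
    sample = "996 $aOCLCWCRUP $uhttp://purl.org/oclc/wcrup"
    print(is_oclc_policy_field(sample))
```

Real MARC data would normally be read with a MARC library rather than parsed from display text; the check itself (tag 996, `$a` equal to `OCLCWCRUP`) would be the same.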
Wednesday, April 30, 2008
If you happen to be in the DC area on May 8th and are interested in linked data and the practical application of semantic web technologies like RDF and OWL, please join us at the Library of Congress for a presentation by Alistair Miles, key developer of SKOS and semantic web practitioner at the University [...]
access accessible addition al american analysis application applications appropriate archives areas association authority available based benefit benefits bibliographic broad broader catalog catalogers cataloging catalogs cataloguing chain change changes classification code collaboration collections committee communities community congress consequences consider considered content continue control controlled cooperative cost costs create created creating creation current data databases dc description [...]
Thursday, January 10, 2008
word             count
library          263
bibliographic    236
data             170
libraries        144
lc               127
control          109
information      98
cataloging       91
records          88
subject          82
materials        81
standards        81
use              80
congress         79
work             76
record           73
community        67
users            61
working          59
group            58
access           57
recommendations  56
resources        53
authority        52
metadata         47
future           46
new              40
environment      37
development      37
web              36
collections      35
systems          35
available        35
creation         35
services         34
headings         32
national         31
findings         30
research         30
unique           29
sharing          29
oclc             28
model            28
catalog          28
international    27
develop          27
value            27
lcsh             26
pcc              26
user             26
need             26
report           25
make             25
practices        25
rda              25
used             25
time             24
needs            24
rare             24
including        24
provide          23
discovery        23
communities      23
special          23
frbr             23
current          22
resource         22
rules            22
digital          21
cooperative      21
program          21
participants     21
management       21
service          20
dc               20
programs         20
online           20
costs            20
washington       20
standard         19
support          19
knowledge        19
different        19
appropriate      19
effort           18
applications     18
marc             18
shared           18
exchange         18
process          18
changes          17
lcs              17
increase         16
public           16
search           16
creating         16
broader          16
catalogs         16
controlled       16
I converted the PDF to a text file called ‘lc’ with xpdf and then wrote a little Python:
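#!/usr/bin/env python
# Python 2 script: urllib.urlopen and file() do not exist in Python 3.
from urllib import urlopen
from re import sub

# Fetch a whitespace-separated list of English stop words.
stop_words = urlopen('http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words').read().split()
text = file('lc').read()
counts = {}
for word in text.split():
    # Normalize each token: lowercase, then strip punctuation and digits.
    word = word.lower()
    word = sub(r'\W', '', word)
    word = sub(r'\d+', '', word)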
[...]
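For anyone wanting to try the same thing on Python 3, here is a self-contained sketch of the counting approach. It is not the rest of the script above, which is elided. It assumes that tokens left empty after normalization and tokens in the stop list are skipped, and that the table is simply the most frequent words. The names `word_counts` and `top_words` are made up for illustration, and the stop list is passed in rather than fetched over the network.

```python
import re
from collections import Counter


def word_counts(text, stop_words=()):
    """Count normalized words in text, skipping stop words and empty tokens."""
    stop = set(stop_words)
    counts = Counter()
    for word in text.split():
        word = word.lower()
        word = re.sub(r"\W", "", word)   # drop punctuation
        word = re.sub(r"\d+", "", word)  # drop digits
        if word and word not in stop:
            counts[word] += 1
    return counts


def top_words(text, stop_words=(), n=100):
    """Return the n most frequent (word, count) pairs, highest count first."""
    return word_counts(text, stop_words).most_common(n)


if __name__ == "__main__":
    sample = "The library, the Library; data 2008 data library!"
    for word, count in top_words(sample, stop_words=["the"], n=2):
        print(word, count)
```

Using `collections.Counter` instead of a plain dict avoids the key-existence check and gives the sorted listing via `most_common`.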