WoGroFuBiCo wc

word	count
library	263
bibliographic	236
data	170
libraries	144
lc	127
control	109
information	98
cataloging	91
records	88
subject	82
materials	81
standards	81
use	80
congress	79
work	76
record	73
community	67
users	61
working	59
group	58
access	57
recommendations	56
resources	53
authority	52
metadata	47
future	46
new	40
environment	37
development	37
web	36
collections	35
systems	35
available	35
creation	35
services	34
headings	32
national	31
findings	30
research	30
unique	29
sharing	29
oclc	28
model	28
catalog	28
international	27
develop	27
value	27
lcsh	26
pcc	26
user	26
need	26
report	25
make	25
practices	25
rda	25
used	25
time	24
needs	24
rare	24
including	24
provide	23
discovery	23
communities	23
special	23
frbr	23
current	22
resource	22
rules	22
digital	21
cooperative	21
program	21
participants	21
management	21
service	20
dc	20
programs	20
online	20
costs	20
washington	20
standard	19
support	19
knowledge	19
different	19
appropriate	19
effort	18
applications	18
marc	18
shared	18
exchange	18
process	18
changes	17
lcs	17
increase	16
public	16
search	16
creating	16
broader	16
catalogs	16
controlled	16

I converted the PDF to a text file called ‘lc’ with xpdf and then wrote a little Python:

#!/usr/bin/env python

from urllib import urlopen
from re import sub

# Fetch a stop word list; a set makes the membership test below fast.
stop_words = set(urlopen('http://www.dcs.gla.ac.uk/idom/ir_resources/linguistic_utils/stop_words').read().split())
# 'lc' is the text file produced from the PDF.
text = open('lc').read()

counts = {}
for word in text.split():
    # Normalize: lowercase, then strip non-word characters and digits.
    word = word.lower()
    word = sub(r'\W', '', word)
    word = sub(r'\d+', '', word)
    if word == '' or word in stop_words:
        continue
    counts[word] = counts.get(word, 0) + 1

# Sort words by descending count and print the top 100.
words = sorted(counts, key=counts.get, reverse=True)
for word in words[:100]:
    print "%20s %i" % (word, counts[word])
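
The same counting can be sketched for Python 3 with collections.Counter. This is a rough port and not the script that produced the table above: the function name top_words, the sample sentence, and the three-word stop list are hypothetical stand-ins for the ‘lc’ file and the downloaded stop word list.

```python
import re
from collections import Counter


def top_words(text, stop_words, n=100):
    """Return the n most frequent words in text, skipping stop words."""
    counts = Counter()
    for word in text.split():
        # Same normalization as the script above: lowercase, then drop
        # non-word characters and digits.
        word = re.sub(r'\W', '', word.lower())
        word = re.sub(r'\d+', '', word)
        if word and word not in stop_words:
            counts[word] += 1
    return counts.most_common(n)


if __name__ == '__main__':
    # Hypothetical stand-ins for the 'lc' file and the downloaded stop list.
    sample = "The library's data: library records and bibliographic data in 2008."
    stop = {'the', 'and', 'in'}
    for word, count in top_words(sample, stop):
        print("%20s %i" % (word, count))
```

Note that stripping non-word characters turns "library's" into "librarys", which is presumably how "lcs" ends up in the table alongside "lc".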

Does me writing code to read the report count as reading the report? …