Skip to content

Tag Archives: beautifulsoup

New York Times Topics as SKOS

Serves 23,376 SKOS Concepts
INGREDIENTS

Text editor: Vim, Emacs, TextMate, etc
Python
BeautifulSoup
rdflib
Internet connection

DIRECTIONS

Open a new file using your favorite text editor.
Instantiate an RDF graph with a dash of rdflib.
Use python’s urllib to extract the HTML for each of the Times Topics Index Pages, e.g. for A.
Parse HTML into a fine, queryable data structure using BeautifulSoup.
Locate topic names and [...]