Archive for the ‘html’ Category

nekkid

Thursday, April 5th, 2007

Yeah, today is CSS Naked Day. I just hope I remember to re-enable CSS tomorrow :-)

exhibit

Friday, February 16th, 2007

If you haven’t tried Exhibit out yet, it’s worth a look. The simile folks have created a truly wonderful data publishing framework that runs entirely in your browser with a bit of javascript, html and css.

The remarkable part is that it requires no backend database; it simply operates on a stream of json. If you have a couple of minutes, take a look at their Getting Started Tutorial, which shows you how to create an exhibit of MIT-related Nobel laureates with a tiny bit of HTML, CSS and JavaScript.

Just as an experiment I tried pointing it at my delicious json feed for metadata. It turns out that exhibit wants json data to be a hash with a key ‘items’ that points to a list of items. In addition it also wants each item to have a ‘label’ key. I quickly reformatted the delicious json with simplejson, and got this.
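For the curious, the reformatting step can be sketched roughly like this. It uses Python's standard json module rather than simplejson, and it assumes the delicious feed is a list of posts with `u` (URL), `d` (description) and `t` (tags) keys. Those field names and the sample data are assumptions for illustration, not copied from the real feed.

```python
import json


def to_exhibit(posts):
    """Wrap a list of bookmark dicts in the {'items': [...]} hash Exhibit expects."""
    items = []
    for post in posts:
        items.append({
            # Exhibit wants every item to carry a 'label' key;
            # fall back to the URL when there is no description.
            "label": post.get("d") or post.get("u", ""),
            "url": post.get("u", ""),
            "tags": post.get("t", []),
        })
    return {"items": items}


# Hypothetical sample standing in for the delicious json feed.
sample_feed = json.loads(
    '[{"u": "http://example.org/", "d": "Example site", "t": ["demo", "html"]}]'
)
exhibit_data = to_exhibit(sample_feed)
print(json.dumps(exhibit_data, indent=2))
```

The only structural requirements mentioned above are the top-level `items` list and a `label` on each item; the `url` and `tags` keys are just extra properties chosen for this sketch.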

A few minutes later I prodded the simile folks to see if there is a way of filtering json data on the way into exhibit so that it can be normalized…time passes (like maybe an hour) and then I hear from Johan Sundström that the latest/greatest exhibit code has this sort of filtering built in!

Tangential to the exhibit code, there has been an interesting discussion recently about how to expose exhibit content to indexing services like google. Since exhibit content is generated with pure javascript, and google (as far as we know) primarily indexes html content, the exhibit content is effectively invisible to it. This is a problem that digital library applications and repositories have to deal with as well, so it may be of interest.

use the source luke

Friday, October 6th, 2006

Can you imagine (back in the day) going to a page like the one at O’Reilly’s Safari, doing a view-source in Mosaic, and trying to learn HTML and how the web works?

sigh…

set your data free … with unapi

Monday, August 28th, 2006

Dan, Jeremy, Peter, Michael, Mike, Ross and I wrote an article in the latest Ariadne introducing the lightweight web protocol unAPI. Essentially unAPI is an easy way to include references to digital objects in your HTML which can then be predictably retrieved by a machine…yes ‘machine’ includes JavaScript running in a browser :-) Dan and a really nice cross section of developers around the world have been working on this spec for over a year now and I think it could be poised to play an important role in the emerging open data movement.

Imagine you have a citation database which is searchable via the web. The search results include hits. Wouldn’t it be nice to align your human viewable results with machine readable representations so that people could write browser hacks and the like to remix your application data?

As far as I can tell there are a few options available to help you do this (apart from doing something ad-hoc).

  1. use a citation microformat and mark up your HTML predictably so that it can be recognized and parsed
  2. use GRDDL to map your HTML to RDF via an XSLT profile.
  3. embed RDF in your HTML essentially using an RDF microformat.
  4. use OpenURL and/or COinS to link in-page IDs to OpenURL servers.
  5. use unAPI and include a unapi server url (familiar autodiscovery like RSS/Atom), and identifiers (simple element attributes) and write a simple server side script that emits xml for a given identifier.
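To make option 5 a bit more concrete, here is a minimal sketch of the client side: finding the unAPI server link and the identifiers in a page. The `link rel="unapi-server"` and `class="unapi-id"` markup follows my reading of the unAPI spec; the page, URLs and identifiers are made up, and `UnapiFinder` is a hypothetical helper name.

```python
from html.parser import HTMLParser


class UnapiFinder(HTMLParser):
    """Collect the unAPI server URL and the identifiers advertised in a page."""

    def __init__(self):
        super().__init__()
        self.server = None
        self.identifiers = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "link" and attrs.get("rel") == "unapi-server":
            # Autodiscovery link, much like the ones used for RSS/Atom feeds.
            self.server = attrs.get("href")
        elif "unapi-id" in (attrs.get("class") or "").split():
            # The identifier lives in the element's title attribute.
            self.identifiers.append(attrs.get("title"))


# Hypothetical search-results page; URLs and identifiers are invented.
page = """
<html><head>
<link rel="unapi-server" type="application/xml" title="unAPI"
      href="http://example.org/unapi" />
</head><body>
<abbr class="unapi-id" title="record:1"></abbr> First hit
<abbr class="unapi-id" title="record:2"></abbr> Second hit
</body></html>
"""

finder = UnapiFinder()
finder.feed(page)
print(finder.server, finder.identifiers)
```

A browser hack would do the same thing with the DOM; the point is only that the server URL and the identifiers are predictable enough to find without knowing anything else about the page.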

I like microformats a lot and I think a citation format will eventually get done. But it’s been a long time coming and there’s no indication it’s going to get done any time soon. What’s more unAPI is bigger than just citation data–and it allows you to publish all kinds of rich data objects without waiting for a community to ratify a particular representation in HTML.

Options 2 and 3 use RDF which I actually like quite a bit as well. GRDDL implies a GRDDL aware browser which would be cool but is a bit heavyweight. XSLT will require clean XHTML–or pipelines to clean it. Embedding RDF in HTML using microformat techniques is compelling because you can theoretically process the RDF data similarly–whereas unAPI doesn’t require any particular kind of machine readable format (apart from HTML). Actually there’s nothing stopping you from using unAPI to link human viewable objects with RDF representations. The advantage unAPI has here is you can learn RDF if you want to, but you don’t have to learn RDF to get going with unAPI today.

Option 4 leverages work done in the library community on citation linking. OpenURL routers are widely deployed in libraries around the world and COinS is a quasi-microformat for putting OpenURL context objects into your HTML so that they can be extracted and fired off at an OpenURL server. OpenURL is a relatively complex and subtle standard which can do a lot more than just citation linking. Compared to OpenURL/COinS unAPI allows for ease of implementation in languages like JavaScript and provides a simple introspection mechanism for discovering what formats a particular resource is available in. AFAIK this can’t be done simply using OpenURL/COinS. If I’m wrong, comments should be open. I would argue that the sheer power and flexibility of OpenURL paradoxically make it hard to understand…and that unAPI in Dan’s adherence to a one-page-spec is more limited and simple. Less is more…
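The introspection mechanism mentioned above can be sketched as follows. As I understand the spec, asking the unAPI server for an `id` with no `format` returns a small XML list of available formats, and adding `format` returns the object itself. The server URL, identifier and sample response below are hypothetical, and nothing is actually fetched over the network.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def formats_url(server, identifier):
    """URL that asks which formats are available for one identifier."""
    return server + "?" + urlencode({"id": identifier})


def object_url(server, identifier, fmt):
    """URL that asks for the object itself in a chosen format."""
    return server + "?" + urlencode({"id": identifier, "format": fmt})


def parse_formats(xml_text):
    """Map each advertised format name to its media type."""
    root = ET.fromstring(xml_text)
    return {f.get("name"): f.get("type") for f in root.findall("format")}


# Hypothetical response from a unAPI server for the formats request.
sample_response = """<formats id="record:1">
  <format name="oai_dc" type="application/xml" />
  <format name="mods" type="application/xml" />
</formats>"""

server = "http://example.org/unapi"  # assumed server URL
available = parse_formats(sample_response)
print(formats_url(server, "record:1"))
print(sorted(available))
print(object_url(server, "record:1", "mods"))
```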

So if this piques your interest, read the article. It does a much better job of describing the origins of the work and where it’s headed, and it has examples and links out to sites/tools that use unAPI today. I must admit I wrote very little of the article, and mostly contributed text snippets and screenshots of the unAPI validator I wrote, which uses my unapi ruby gem.

hGoogle

Thursday, April 13th, 2006

So it’s been noted elsewhere that the latest ajaxy application out of google labs (Google Calendar) lacks support for the hCalendar microformat.

Perhaps it’s an oversight–but with all the high profile exposure microformats have been getting lately it’s kind of hard to imagine. But people have deadlines and some things just can’t make it into the first release–even at Google. The main thing, as Mark Pilgrim says, is:

Sniping from the sidelines makes us look petty and insular. Instead
of making assumptions about big bad evil Google ignoring open
standards and locking users in, have we tried opening a dialogue?

I don’t know anyone at google so I feel like I’m doing my part by just blogging about how *awesome* it would be if they marked up their calendar data using hCalendar. As a full featured calendaring application on the web, Google Calendar could really enable downstream applications like the LiveClipboard if they simply added some class attributes and spans to the data they are already displaying.

In the long run I imagine it’s in Google’s best interests to promote microformats since their infrastructure would allow them to take best advantage of a system of distributed metadata. Here’s to hoping that it’ll be layered in sometime soon. In the meantime Scott and Mark have the right idea!

By the way, being able to enter a quick event in free text and have the time/location/description parsed as opposed to tabbing around in a complicated form is very nice.

Translation and a Citation Microformat

Saturday, April 1st, 2006

I can think of only one company that has the resources to embed translation links into the world’s existing body of printed material. What’s more, while they are at it they are going to mark up the title page with a citation microformat…and get this…the microformat is based on an openurl XMDP profile so that it’ll interoperate with existing citation resolvers in use in libraries around the world…niiiice.

reading 2.0

Monday, March 20th, 2006

Reading 2.0 slipped under my radar, but I guess that was the idea: to let people from O’Reilly, Los Alamos National Labs, OCLC, The Internet Archive, Adobe, Yahoo, Harvard and Elsevier hobnob away from prying eyes. I haven’t seen any audio/video for the event but Tim O’Reilly has a nice fly on the wall summary of what went on.

It’s refreshing to see library technologies/concepts such as OpenURL, COinS, OAI-PMH, FRBR, METS and Dublin Core starting to be talked about in the context of a larger information environment. For example I had no idea that Yahoo is harvesting data from the Internet Archive using the OAI-PMH protocol. And I didn’t know Yahoo is starting to leverage microformats, but should’ve guessed considering the recent news about Flickr starting to use hCard.

All in all these are exciting “lowercase semantic web” times we’re living in. And it’s interesting to watch some of the things people you know have worked on starting to catch on. Hopefully Reading 2.0 was just the start of this ongoing collaboration. Case in point, I just heard Robert Sanderson say in #code4lib that he’s visiting the a9 folks to talk about opensearch and sru. This is just the sort of cross-fertilization we need going on in library land.

openurl as microformat

Wednesday, January 18th, 2006

The Search

Author: John Battelle

Year: 2005

Publisher: Portfolio Hardcover

ISBN: 1591840880

Ok, so The Search is a great book so far…but I’m really just testing some local modifications I made to the structured blogging tool to use Book OpenURL KEV parameter names as a microformat. Take a look in the HTML and you should see them hiding there.

Here’s a somewhat prettified version as an image since I couldn’t get my syntax highlighter plugin to do a nice enough job with the HTML.

Pretty simple stuff right? Notice the COinS in there too? That’s thanks to Dan’s hacking at structured blogging. Actually getting openurl KEV support into structured blogging is another idea of Dan’s. Go Chudnov.
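For anyone who wants to see roughly what a COinS span for this book looks like, here is a small sketch that builds one from the metadata above. It is not the markup the structured blogging plugin actually emits; the `coins_span` helper is hypothetical, and the choice of KEV keys (`rft.btitle`, `rft.au`, `rft.date`, `rft.pub`, `rft.isbn`) is my own pick from the OpenURL book format.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(book):
    """Build a COinS span: an empty span whose title holds an OpenURL KEV string."""
    kev = urlencode([
        ("ctx_ver", "Z39.88-2004"),
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),
        ("rft.btitle", book["title"]),
        ("rft.au", book["author"]),
        ("rft.date", book["year"]),
        ("rft.pub", book["publisher"]),
        ("rft.isbn", book["isbn"]),
    ])
    # The ampersands between KEV pairs must be escaped inside an HTML attribute.
    return '<span class="Z3988" title="%s"></span>' % escape(kev, quote=True)


book = {
    "title": "The Search",
    "author": "John Battelle",
    "year": "2005",
    "publisher": "Portfolio Hardcover",
    "isbn": "1591840880",
}
span = coins_span(book)
print(span)
```

A COinS-aware tool looks for spans with the `Z3988` class, pulls the title attribute back out, and hands the key/value string to an OpenURL resolver.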

Update 01/19/2006 09:39 CST: Dan got similar support for journal articles. If this stuff caught on it could really revolutionize academic blogging…and more.

trackbacks at arXiv

Thursday, September 22nd, 2005

I just read (thanks jeff) about how arXiv.org has implemented experimental trackback support. Essentially this allows researchers who maintain online journals to simply reference an abstract like File-based storage of Digital Objects and constituent datastreams: XMLtapes and Internet Archive ARC files (a great article by the way) and arXiv will receive a trackback ping at http://arxiv.org/trackback/cs/0503016 that lets them know someone referenced the abstract. If you’ve followed this so far you might be wondering how the blogging software (wordpress, movabletype, blosxom, etc) figures out where to ping arxiv.org. Take a look in the source code for the arXiv abstract and you’ll see a chunk of RDF:


<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  xmlns:dc='http://purl.org/dc/elements/1.1/'
  xmlns:trackback='http://madskills.com/public/xml/rss/module/trackback/'>
<rdf:Description
  rdf:about='http://arxiv.org/abs/cs/0503016'
  dc:identifier='http://arxiv.org/abs/cs/0503016'
  dc:title='File-based storage of Digital Objects and constituent datastreams: XMLtapes and Internet Archive ARC files'
  trackback:ping='http://arxiv.org/trackback/cs/0503016' />
</rdf:RDF>

So when you are finished composing a blog entry (like this one), wordpress will look at outbound links in blog entries, follow the URLs, look for trackback RDF in the HTML source, and then actually ping the respective trackback server. Pretty fancy to have all this stuff just happening automatically…and it’s great to see how arXiv is continuing to blur the lines between electronic and traditional publishing models.
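The autodiscovery steps described above can be sketched like this. It is a rough illustration of the idea, not what wordpress actually runs: `find_ping_url` and `build_ping` are hypothetical helpers, the blog entry details are invented, and the ping request is built but never sent. The form fields (`url`, `title`, `excerpt`, `blog_name`) are the ones I know from the trackback spec.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode
from urllib.request import Request

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
TB_NS = "http://madskills.com/public/xml/rss/module/trackback/"


def find_ping_url(html, target_url):
    """Return the trackback:ping URL advertised for target_url, or None."""
    for chunk in re.findall(r"<rdf:RDF.*?</rdf:RDF>", html, re.DOTALL):
        root = ET.fromstring(chunk)
        for desc in root.iter("{%s}Description" % RDF_NS):
            if desc.get("{%s}about" % RDF_NS) == target_url:
                return desc.get("{%s}ping" % TB_NS)
    return None


def build_ping(ping_url, entry_url, title, excerpt, blog_name):
    """Build (but do not send) the form-encoded POST used for a ping."""
    body = urlencode({
        "url": entry_url,
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    })
    return Request(
        ping_url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"},
    )


# The RDF from the arXiv abstract, wrapped in a stand-in HTML page.
page = """<html><body>
<!--
<rdf:RDF xmlns:rdf='http://www.w3.org/1999/02/22-rdf-syntax-ns#'
  xmlns:dc='http://purl.org/dc/elements/1.1/'
  xmlns:trackback='http://madskills.com/public/xml/rss/module/trackback/'>
<rdf:Description
  rdf:about='http://arxiv.org/abs/cs/0503016'
  dc:identifier='http://arxiv.org/abs/cs/0503016'
  dc:title='File-based storage of Digital Objects'
  trackback:ping='http://arxiv.org/trackback/cs/0503016' />
</rdf:RDF>
-->
</body></html>"""

ping_url = find_ping_url(page, "http://arxiv.org/abs/cs/0503016")
request = build_ping(
    ping_url,
    entry_url="http://example.org/blog/entry",  # hypothetical blog entry
    title="trackbacks at arXiv",
    excerpt="arXiv now accepts trackback pings",
    blog_name="example blog",
)
print(ping_url, request.get_method())
```

Passing the request to `urllib.request.urlopen` would actually send the ping; that step is left out here on purpose.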

The fact that trackback autodiscovery uses RDF is a nice illustration for folks who are skeptical about the semantic web (note case). I’m no expert, but I do think that the semantic web revolution will be a quiet one, and will not be televised (easily viewable).

HTML/HTTP

Friday, April 29th, 2005

RFC 2397 has been around since August 1998 and I’m just learning about the data URL scheme today. Perhaps browser support for it is new? Basically data URLs allow you to embed data, like images directly in an HTML page. Data URLs remind me of Fred Lindberg’s old idea (circa 2001) of “mailing the web” by freezing web pages as email with MIME attachments.
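As a quick illustration, a data URL is just the media type plus the (usually base64-encoded) bytes packed into the URL itself. The sketch below builds one for a tiny inline SVG image; `make_data_url` is a hypothetical helper and the image is made up for the example.

```python
import base64


def make_data_url(data, mime_type):
    """Pack raw bytes into an RFC 2397 data URL using base64 encoding."""
    encoded = base64.b64encode(data).decode("ascii")
    return "data:%s;base64,%s" % (mime_type, encoded)


# A made-up 10x10 red square, small enough to embed directly in a page.
svg = (
    b'<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10">'
    b'<rect width="10" height="10" fill="red"/></svg>'
)
url = make_data_url(svg, "image/svg+xml")
img_tag = '<img alt="red square" src="%s" />' % url
print(img_tag)
```

The trade-off is size: base64 adds roughly a third to the payload, so this suits small images better than large ones.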

It’s fun to be learning new things about HTML/HTTP: technologies that I thought I was familiar with already. Perhaps I’ve been out of web development for long enough to fall behind. The other day I learned about iframes from my friend and sometime coworker Jason and was similarly blown away by something new under the sun. iframes are essentially the same thing as regular frames, except that the browser user doesn’t see separate panes. Useful for scrolling panels inside of pages and other things I’m sure.

I guess this is all part of the web renaissance that is going on now, spurred on by Google’s forays and investment in javascript and xml. It’s really interesting to see how a big player like Google can redefine what is acceptable technology to rely on in web applications. For years I’ve avoided doing too much in javascript since it was a headache to get it working across different browsers, at least for this programmer. Now javascript is on my list of things to learn more about.