thoughts on SHARE

This is my response to Library Journal’s “ARL Launches Library-Led Solution to Federal Open Access Requirements”, which I’m posting here as well because I spent a bit of time on it. Thanks for the heads up, Dorothea:

https://twitter.com/LibSkrat/status/345148738488115201

In principle I like the approach that SHARE is taking, that of leveraging the existing network of institutional repositories, and the amazingly decentralized thing that is the Internet and the World Wide Web. Simply getting article content out on the Web, where it can be crawled, as Harnad suggests, has bootstrapped incredibly useful services like Google Scholar. Scholar works with the Web we have, not some future Web where we all share metadata perfectly using formats that will be preserved for the ages. They don’t use OpenURL, OAI-ORE, SWORD, etc. They do have lots o’ crawlers, and some magical PDF parsing code that can locate citations. I would like to see a plan that’s a bit scruffier and less neat.

Like Dorothea, I have big doubts about building what looks to be a centralized system that will then push out to IRs using SWORD and support some kind of federated search with OpenURL. Most IRs seem more like research experiments than real applications oriented around access that could sustain the kind of usage you might see if mainstream media or a MOOC happened to reference their content. Rather than a 4 phase plan with digital library acronym soup, I’d rather see some very simple things that could be done to make sure that federally funded research is deposited in an IR, and that it can be traced back to the grant that funded it. Of course, I can’t resist throwing out a straw man.

Requiring funding agencies to have a URL for each grant, which can then be used in IRs, seems like it would be the logical first step. Pinging that URL (kind of like a trackback) when there is a resource (article, dataset, etc.) associated with the grant would allow the granting organization to know when something was published that referenced that URL. The granting organization could then look at its grants, see which ones lacked a deposit, and follow up with the grantees. It could also examine pingbacks to see which ones are legit. Perhaps further on down the line these resources could be integrated into web archiving efforts, but I digress.
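
To make the straw man a bit more concrete, here is a minimal sketch in Python of the bookkeeping I have in mind. Everything in it is an assumption for illustration: the `GrantRegistry` class, its method names, and the `example.gov` / `example.edu` URLs are all hypothetical, and it keeps everything in memory rather than speaking any real trackback or pingback protocol.

```python
from dataclasses import dataclass, field


@dataclass
class Grant:
    url: str  # hypothetical stable URL the agency mints for the grant
    pingbacks: list = field(default_factory=list)  # resource URLs claiming this grant


class GrantRegistry:
    """In-memory stand-in for an agency's list of grants (illustrative only)."""

    def __init__(self, grant_urls):
        self.grants = {url: Grant(url) for url in grant_urls}

    def ping(self, grant_url, resource_url):
        """Record that resource_url says it was funded by grant_url.

        Returns False when the grant URL is unknown, so bogus pings are dropped.
        Accepted pings are still unverified claims that need curation.
        """
        grant = self.grants.get(grant_url)
        if grant is None:
            return False
        if resource_url not in grant.pingbacks:
            grant.pingbacks.append(resource_url)
        return True

    def missing_deposits(self):
        """Grant URLs that nobody has pinged yet, i.e. the ones to follow up on."""
        return sorted(url for url, g in self.grants.items() if not g.pingbacks)


registry = GrantRegistry([
    "https://grants.example.gov/grant/1",
    "https://grants.example.gov/grant/2",
])
# An IR deposits an article and pings the grant it was funded by.
registry.ping("https://grants.example.gov/grant/1", "https://ir.example.edu/item/42")
print(registry.missing_deposits())  # ['https://grants.example.gov/grant/2']
```

The point is only that the agency side needs very little: a URL per grant, somewhere to record pings, and a report of which grants have none.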

There would probably need to be a bit of curation of these pingbacks, but nothing a big Federal Agency can’t handle, right? I think putting data curation first, instead of last as the icing on the 4 phase cake, is important. I don’t underestimate the challenge in requiring a URL for every grant, though perhaps some agencies already have them. I think this would put the onus on the Federal agencies to make this work, rather than on the publishers (who, like it or not, have a commercial incentive to not make it too easy to provide open access) and universities (who must have a way of referencing grants if any of their plan is to work). This would be putting Linked Data first, rather than last, as rainbow sprinkles on the cake.

Sorry if this comes off as a bit ranty or incomprehensible. I wish Aaron were here to help guide us… It is truly remarkable that the OSTP memo was issued, and that we have seen responses from the ARL and the AAP. I hope we’ll see responses from the federal agencies that the memo was actually directed at.