Conclusions
For the EM-Loader project, we developed a new workflow and sample user interface that help researchers maintain their publications list web page and then deposit the whole list in a repository in batch mode by pressing a 'Deposit' button.
EM-Loader: click to deposit all your publications
This workflow addresses two of the main problems afflicting existing institutional repository deposit systems: researcher motivation and ease of use. The motivation issue is critical: researchers are rarely motivated to populate their local repository, but they usually do care about their public web page and about getting more citations. Starting with a tool that helps researchers maintain their own web page of publications integrates the deposit process into their normal activities in a way that filling out forms on a library web site does not.
The usability aspect is extremely important too; where possible we import metadata in batch mode from existing sources (PubMed, EndNote, Web of Knowledge, BibTeX etc.) rather than entering it by hand into a form. Attaching full text is then a matter of finding the appropriate PDF and uploading it; no manual form filling is needed.
The workflow reduces the effort required to send all publications to a repository to just a few clicks. However, researchers must still choose to press the 'Deposit' button to populate the repository with their latest publications.
The future: zero click deposit
The need for the researcher to press the Deposit button each time they add publications could be removed by switching away from a model where researchers PUSH their items to the repository (via SWORD) to one where the repository does an automatic periodic PULL from the researcher's publications list. The system could work as follows:
1. Researcher maintains a publications list web page as at present.
When publicationslist.org makes the HTML version, it also generates two Atom feed XML files: one which contains the metadata in HTML (for news readers), and one which contains the same information in JSON-Bibtex, essential for importing the metadata into other tools. Researchers could have a choice of providers of publications list management services; all each provider would have to do is maintain an Atom feed with structured bibliographic data and links to full text, a much simpler task than coding a SWORD client.
2. Researcher tells repository the URL of their publications list.
This could be done by pasting a link to the Atom feed, or using an API call similar to the 'create repository account' module we wrote for EM-Loader. The same process could also notify the publications list tool which repositories are reading the feed, so the entries in publicationslist could be updated with links to repository copies.
3. Repository polls the Atom feed periodically, and fetches any new items.
This would require a new repository module which runs as a regular cron task and which could poll the registered Atom feeds for new items and create new entries.
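The polling step can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not the EM-Loader implementation: the sample feed, the function names `new_entries` and `poll`, and the use of `rel="enclosure"` links to point at full text are all assumptions made for the example. A real repository module would fetch each registered feed over HTTP and store the set of seen entry ids persistently.

```python
import xml.etree.ElementTree as ET

# Namespace prefix for elements defined by the Atom format.
ATOM = "{http://www.w3.org/2005/Atom}"

# Hypothetical feed content; a real module would download this from the
# URL the researcher registered with the repository.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example publications list</title>
  <id>urn:example:feed</id>
  <updated>2009-06-01T00:00:00Z</updated>
  <entry>
    <id>urn:example:pub-1</id>
    <title>First paper</title>
    <updated>2009-05-01T00:00:00Z</updated>
    <link rel="enclosure" type="application/pdf"
          href="http://example.org/pub-1.pdf"/>
  </entry>
  <entry>
    <id>urn:example:pub-2</id>
    <title>Second paper</title>
    <updated>2009-06-01T00:00:00Z</updated>
  </entry>
</feed>
"""


def new_entries(feed_xml, seen_ids):
    """Return the feed entries whose Atom id is not in seen_ids."""
    root = ET.fromstring(feed_xml)
    found = []
    for entry in root.findall(ATOM + "entry"):
        entry_id = entry.findtext(ATOM + "id")
        if entry_id is None or entry_id in seen_ids:
            continue
        # Assumption: full text is advertised as an 'enclosure' link.
        full_text_url = None
        for link in entry.findall(ATOM + "link"):
            if link.get("rel") == "enclosure":
                full_text_url = link.get("href")
                break
        found.append({
            "id": entry_id,
            "title": entry.findtext(ATOM + "title"),
            "full_text_url": full_text_url,
        })
    return found


def poll(feed_xml, seen_ids):
    """One polling pass: collect new items and remember their ids."""
    items = new_entries(feed_xml, seen_ids)
    seen_ids.update(item["id"] for item in items)
    return items
```

Calling `poll(SAMPLE_FEED, seen)` from a scheduled task with a `seen` set that already contains `urn:example:pub-1` would return only the second entry; a second call with the same set would return nothing, so items are not deposited twice.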
Advantages of the proposed scheme include:
- Removes the need for researchers to do anything to deposit
Simply maintaining their publications list web page would be enough to ensure publications are available in the repository too. 'Opt out' checkboxes could be added to individual publications if they should not go in a repository for some reason, but the default 'do nothing' option for researchers could be to deposit everything.
- Much simpler for publications list providers to support
A push model like SWORD requires a number of additional user interface components for depositing. A pull model based on Atom simply requires making an XML file available on the web, making it more likely that researchers would get a choice of providers of publications list web pages. People who currently maintain their own HTML web page by hand would also be able to use the system, as generating an Atom XML file is not much harder than coding HTML.
- Much easier to send to multiple repositories
Researchers move between different universities during their career; some repositories only accept research which was done while working at that institution. With a PUSH model like SWORD the researcher has to choose which publications to send to which repositories, making the deposit process more complex. With a PULL model they could simply register their publications list feed with all relevant repositories, and maybe also specify the date ranges during which they worked at each institution.
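As a rough illustration of how a repository might apply the opt-out flag and the registered date ranges when pulling from a feed, consider the sketch below. The field names `published` and `opt_out`, the function `select_for_repository`, and the sample data are hypothetical; how such flags would actually be carried in the feed is one of the questions that would need investigation.

```python
from datetime import date


def select_for_repository(publications, employment_periods):
    """Pick the publications one repository should ingest.

    publications: list of dicts with a 'published' date and an
        optional 'opt_out' flag (both hypothetical field names).
    employment_periods: list of (start, end) dates during which the
        researcher worked at the repository's institution.
    """
    selected = []
    for pub in publications:
        # Default is to deposit everything unless explicitly opted out.
        if pub.get("opt_out", False):
            continue
        if any(start <= pub["published"] <= end
               for start, end in employment_periods):
            selected.append(pub)
    return selected


# Hypothetical example data.
publications = [
    {"title": "Paper A", "published": date(2005, 3, 1)},
    {"title": "Paper B", "published": date(2008, 9, 15)},
    {"title": "Paper C", "published": date(2009, 1, 10), "opt_out": True},
]
university_x = [(date(2007, 1, 1), date(2009, 12, 31))]
```

With this data, `select_for_repository(publications, university_x)` keeps only "Paper B": "Paper A" predates the registered period and "Paper C" has been opted out. The researcher registers the same feed everywhere and each repository applies its own date range.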
Extending EM-Loader to support a PULL model as an alternative to PUSH via SWORD would be a natural follow-on for the project; it would require some investigation and user trials to see which flags / features were needed to integrate with various institutional repository policies, but it offers the potential to make repository deposit require even less effort than the EM-Loader prototype.