1. Introduction
2. User Workflow
3. API Interfaces
4. Metadata
5. Experiences using SWORD
6. Usability
7. Conclusions
Links

Experiences using SWORD for batch deposit

SWORD is an interface based on the Atom Publishing Protocol, which provides scope for wrapping metadata in arbitrary XML formats. A deposit is an HTTP POST of a zip file containing the full text and an XML metadata file in METS format. We chose to embed JSON-BibTeX messages as the metadata payload of the METS file, because a SWORD module for ePrints already supported METS.

// file.zip: contents
   json.xml
   pdf1.pdf

// Sample 'json.xml' file:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<mets ID="sort-mets_mets" OBJID="sword-mets" LABEL="DSpace SWORD Item"
	PROFILE="DSpace METS SIP Profile 1.0" xmlns="http://www.loc.gov/METS/"
  xmlns:xlink="http://www.w3.org/1999/xlink"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/mets.xsd">
  <metsHdr CREATEDATE="2007-09-01T00:00:00">
    <agent ROLE="CUSTODIAN" TYPE="ORGANIZATION">
      <name>Textensor</name>
    </agent>
  </metsHdr>  
  <dmdSec ID="sword-mets-dmd-1" GROUPID="sword-mets-dmd-1_group-1">
    <mdWrap LABEL="JSON Metadata" MDTYPE="OTHER" OTHERMDTYPE="JSON-BIBTEX"
      MIMETYPE="text/xml">

      <jsonData>
  {"refid":"4",
  "type":"article",
  "title":"A large scale model of the cerebellar cortex using PGENESIS",
  "year":"2000",
  "author":"Howell, F; Dyhrfjeld-Johnsen, J",
  "journal":"Neurocomputing",
  "volume":"32",
  "number":"",
  "pages":"1041-1046",
  "month":"","doi":"","pubmed":"","pdflink":"","urllink":"",
  "abstract":"",
  "note":"",
  "keywords":""}
      </jsonData>

    </mdWrap>
  </dmdSec>
  <fileSec>
    <fileGrp ID="sword-mets-fgrp-1" USE="CONTENT">
      <file GROUPID="sword-mets-fgid-0" ID="sword-mets-file-1"
        MIMETYPE="application/pdf">
        <FLocat LOCTYPE="URL" xlink:href="pdf1.pdf" />
      </file>
    </fileGrp>
  </fileSec>  
  <structMap ID="sword-mets-struct-1" LABEL="structure"
    TYPE="LOGICAL">
    <div ID="sword-mets-div-1" DMDID="sword-mets-dmd-1" TYPE="SWORD Object">
      <div ID="sword-mets-div-2" TYPE="File">
        <fptr FILEID="sword-mets-file-1" />
      </div>
    </div>
  </structMap>
</mets>
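
The packaging step described above can be sketched in Python as follows. This is a minimal illustration, not the code used in the project: the helper names, the deposit URL, and the use of HTTP Basic authentication are assumptions, and a real SWORD server may require additional headers. The request is only constructed here, not sent.

```python
import base64
import io
import json
import urllib.request
import zipfile
from xml.sax.saxutils import escape


def json_data_element(record: dict) -> str:
    """Serialise a BibTeX-style record as a <jsonData> payload.

    The JSON text is XML-escaped so that characters such as '&' and '<'
    in titles or abstracts do not break the enclosing METS document.
    """
    return "<jsonData>" + escape(json.dumps(record)) + "</jsonData>"


def build_package(mets_xml: str, pdf_bytes: bytes) -> bytes:
    """Build file.zip in memory with the two entries shown above."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("json.xml", mets_xml)   # METS wrapper with the JSON payload
        zf.writestr("pdf1.pdf", pdf_bytes)  # full text referenced by <FLocat>
    return buf.getvalue()


def build_deposit_request(deposit_url: str, package: bytes,
                          user: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) the HTTP POST carrying the zip package.

    Basic authentication is an assumption made for this sketch.
    """
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(deposit_url, data=package, method="POST")
    req.add_header("Content-Type", "application/zip")
    req.add_header("Authorization", "Basic " + token)
    return req
```

Passing the returned request to `urllib.request.urlopen` would perform the deposit; an authentication failure then surfaces as a `urllib.error.HTTPError`.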

The response is an HTTP error code if something went wrong with authentication; otherwise it is an Atom XML document such as:

<?xml version="1.0" encoding="UTF-8"?>
<atom:entry xmlns:atom="http://www.w3.org/2005/Atom" 
     xmlns:sword="http://purl.org/net/sword/">
   <atom:id>162</atom:id>
   <atom:author>
      <atom:name>user123</atom:name>
      <atom:email>user123@example.com</atom:email>
   </atom:author>
   <atom:content type="text/xml" 
         src="http://repository.edina.ac.uk/app/collections/162/ServletClient-0"/>
   <atom:link href="http://repository.edina.ac.uk/app/collections/162"
   rel="edit-media"/>
   <atom:link href="http://repository.edina.ac.uk/app/collections/162.atom"
   rel="edit"/>
   <atom:summary type="text">The integrative ambitions
      of systems biology an...</atom:summary>
   <atom:title>Catalyzer: a novel tool for integrating,
      managing and publishing heterogeneous bioscience
      data</atom:title>
   <atom:source>
      <atom:generator uri="http://repository.edina.ac.uk">the
         Depot</atom:generator>
   </atom:source>
   <atom:updated>2008-06-03T15:01:23Z</atom:updated>
   <sword:treatment>Deposited items will remain in your
      work area until you log into the depot and deposit
      them.</sword:treatment>
   <sword:formatNamespace>JSON</sword:formatNamespace>
</atom:entry>

The useful return values are the <atom:id>, which is the submission ID, and the <atom:updated> timestamp. As the examples above show, the use of METS and Atom leads to a rather complex transfer format littered with XML namespace declarations; a lightweight JSON wrapper would remove the need for so much redundant information.
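
Extracting those two values from the Atom response might look like the sketch below. It uses only the Python standard library; the function name is hypothetical, and the timestamp format is assumed to match the sample response (UTC with a trailing `Z`).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Namespace URI taken from the sample response above.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def parse_deposit_response(body: bytes):
    """Return (submission_id, updated) from an Atom entry document."""
    root = ET.fromstring(body)
    submission_id = root.findtext("atom:id", namespaces=ATOM_NS)
    updated_text = root.findtext("atom:updated", namespaces=ATOM_NS)
    if submission_id is None or updated_text is None:
        raise ValueError("response lacks <atom:id> or <atom:updated>")
    # Assumes the 'YYYY-MM-DDTHH:MM:SSZ' form seen in the sample response.
    updated = datetime.strptime(updated_text.strip(), "%Y-%m-%dT%H:%M:%SZ")
    return submission_id.strip(), updated.replace(tzinfo=timezone.utc)
```

Everything else in the entry (author, links, summary, treatment text) can be ignored by a batch client that only needs to record which submission was created and when.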

Summary of experiences using SWORD

The main concept behind SWORD, a standard machine interface for sending items to repositories, is an extremely good idea, and an impressive number of repository developers have adopted it, especially considering that SWORD was developed as a small-scale project. In practice, we encountered a number of issues when trying to use it with the ePrints repository. We hope this experience will contribute to improvements in future versions of SWORD.

Issues using SWORD in practice

Issues with the SWORD module implementation for ePrints
