[Dspace-general]

Jason Simms jsimms at utk.edu
Thu May 13 09:34:59 EDT 2004


Hello Everyone,

From those of you who are using DSpace in any substantial capacity, I
would like to know how you are actually handling the process of
entering items into the repository.  For instance, we are creating a
digital collection of slides for a campus department.  Entering the
images into DSpace is laborious (not to mention the work of digitizing
and organizing the physical slides in the first place), and I cannot
think of any time-saving methods.

The batch import tools have usability problems and could be improved.
In any event, because this is not a legacy digital collection, none of
the images have metadata associated with them.  The XML files would
therefore have to be created manually, along with the directory
structure for the batch import, which to my mind seems more
time-consuming than simply entering the items individually through the
DSpace web interface.  On this note, how are people creating compliant
XML files for use with the batch importer, if indeed anyone is doing
so?  By hand?  With specialized Perl or shell tools?  Without some
working knowledge of XML, programming, UNIX commands, and related
technologies, entering items by this route is largely impossible,
meaning that a highly competent "technology" person probably must be
in charge of entering the data, or at least of creating the tools.
Even if a useful script is built that abstracts the data entry process
so that anyone can do it, the end result is a Perl or similar script
that basically mirrors the functionality of the web interface anyway.
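
To make the batch route concrete, here is a minimal sketch in Python of
generating the item directories from a spreadsheet export.  It assumes
the DSpace simple archive layout (one directory per item holding
dublin_core.xml, a contents file listing the bitstreams, and the files
themselves).  The CSV column names (filename, title, creator, date),
the function names, and the Dublin Core mapping are hypothetical and
should be checked against the documentation for your DSpace version.

```python
import csv
import shutil
from pathlib import Path
from xml.sax.saxutils import escape, quoteattr


def dublin_core_xml(fields):
    """Render (element, qualifier, value) tuples as a dublin_core.xml body."""
    lines = ["<dublin_core>"]
    for element, qualifier, value in fields:
        lines.append(
            f"  <dcvalue element={quoteattr(element)} "
            f"qualifier={quoteattr(qualifier or 'none')}>"
            f"{escape(value)}</dcvalue>"
        )
    lines.append("</dublin_core>")
    return "\n".join(lines) + "\n"


def rows_from_csv(csv_path):
    """Read the spreadsheet export; the column names are an assumption."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def build_archive(rows, image_dir, out_dir):
    """Write one item directory per row and return the created directories."""
    image_dir, out_dir = Path(image_dir), Path(out_dir)
    created = []
    for index, row in enumerate(rows, start=1):
        item_dir = out_dir / f"item_{index:04d}"
        item_dir.mkdir(parents=True, exist_ok=True)

        # Hypothetical mapping from spreadsheet columns to Dublin Core.
        fields = [("title", "none", row["title"])]
        if row.get("creator"):
            fields.append(("contributor", "author", row["creator"]))
        if row.get("date"):
            fields.append(("date", "issued", row["date"]))

        (item_dir / "dublin_core.xml").write_text(
            dublin_core_xml(fields), encoding="utf-8"
        )
        # Copy the image in and list it in the contents file.
        shutil.copy(image_dir / row["filename"], item_dir / row["filename"])
        (item_dir / "contents").write_text(
            row["filename"] + "\n", encoding="utf-8"
        )
        created.append(item_dir)
    return created
```

With something like this, the person cataloguing the slides would only
fill in a spreadsheet, and a call such as
build_archive(rows_from_csv("slides.csv"), "scans", "archive") would
produce a directory to hand to the batch importer.  It does not remove
the work of describing each slide; it only moves that work out of the
web forms.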

Of course, entering everything by hand through the web interface is an
exceptionally lengthy process, requiring several screens of clicking
and data entry.  Even for a fast worker, one slide every minute or so
is probably a good pace, and our collection is somewhere around 8,000
images.  Without a full-time worker dedicated to this one job, the
task quickly becomes almost impossible to finish in any reasonable
timeframe.
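
As a rough check on the scale, the following sketch works out the
total data entry time.  It assumes a steady one slide per minute with
no breaks, corrections, or digitizing time; the function name and the
40-hour week used for comparison are assumptions for illustration.

```python
def data_entry_hours(item_count, seconds_per_item):
    """Total hours of uninterrupted data entry at a constant pace."""
    return item_count * seconds_per_item / 3600


hours = data_entry_hours(8000, 60)
print(f"{hours:.0f} hours, or about {hours / 40:.1f} forty-hour weeks")
# prints "133 hours, or about 3.3 forty-hour weeks"
```

That is the best case for one dedicated person; spread across staff
who can spare only a few hours a week, the same total stretches over
many months.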

So, how are other institutions managing this troublesome process?

--
Jason Simms
Computer Programming and Design
University of Tennessee, Knoxville
865.974.8508


