[Dspace-general] DSpace: "digital" archive or "literature" archive?

Derek Hohls DHohls at csir.co.za
Fri Jun 1 05:38:23 EDT 2007


Richard
 
Thanks for sharing those ideas and thoughts.  
 
I looked at the Nuxeo site, and also read through the technical comparison
by Richard Wyles - very interesting.  I also looked at the Fedora case study
implementation by Richard Green.  [Sidebar - there do seem to be lots of
Richards here... is it just a coincidence that my middle name is Richard?]
 
In summary, I have gathered that:
 
* DSpace is less technically capable, does not scale as well, and does not
handle complex objects, a variety of objects, or mass-uploading of data,
but has an easy and simple front-end for users and administrators.  There
is also a wealth of start-up material and a good community.
 
* Fedora is more technically capable, scales well (within our likely limits
at least), and seems to handle complex objects with a variety of data types
(MIME-based).  There is no front-end that works on the web, and the Java
interface that is supplied looks absolutely barebones at best.  The concepts
and ideas of Fedora also seem quite complex and are not clearly explained in
the starting documentation.  User docs and tutorials seem minimal.  Community
support is unknown.
 
Richard Green's case study says:
"Fedora 'out of the box' was a software tool with an associated very steep
learning curve and a user had to rely heavily on documentation available on
the Fedora website... we came to realise that the documentation appeared to
lack some crucial elements and that, for a first time user, it was sometimes
not easy to follow."
 
* Nuxeo might be promising; it has lots of flash but the capabilities are
harder to discern.  The emphasis seems to be on CMS, which is not really what
we need; from the list of features on their website:
# Workspaces to create and work on documents
# Flexible versioning of documents
# Document Life Cycle Management
# Collaboration features such as comments, on-demand notifications, etc.
# Search / Query interface to the document repository
 
 
This leaves us in a difficult position, with two choices:
(a) to hold off and hope that Fedora significantly improves its front end
and user documentation... which might be problematic, as it's not clear
how their funding will continue after September this year (2007), and
there is no project roadmap, so it's not clear what they will actually
focus on.
(b) to go on with DSpace, and acknowledge that it's a temporary solution
which may not adequately address many of our use cases (although still a
step up from holding all research data on local drives or on a DMS).  If
we later decide to switch to Fedora, I hope it would be possible to
extract the content for the new system.  DSpace says this is possible:
http://wiki.dspace.org/index.php//EndUserFaq#Can_I_export_my_digital_material_out_of_DSpace.3F
 
 
Derek
 

>>> Richard MAHONEY <r.mahoney at iconz.co.nz> 2007/06/01 01:26:42 AM >>>

Dear Derek,

On Fri, 2007-06-01 at 00:20, Derek Hohls wrote:
> I have recently installed and started looking at DSpace as a "digital"
> repository.
>  
> Background:
> I work in a science research organisation.  We are clustered into
> hierarchical groups doing "similar" work, but this structure changes and
> evolves all the time.  Most of the work we do is in the form of
> projects.  Each project tackles a particular subject, with a start/end
> date.  As a result of this, any number of digital "objects" are
> generated: PDFs, images, presentations, reports, spreadsheets, data
> files, model run outputs, program code, spatial files etc.  Usually,
> such material is archived on CD and kept "somewhere".
>  
> The organisation does run a formal Document Management System (DMS);
> this is typically used for project reports and has the facilities of
> document security control, access, version tracking etc.  It's also
> integrated into other tools we use.
>  
> Problem Statement:
> I need to provision a system that can be used as a complete "digital"
> archive; one that stores *all* digital information in an accessible and
> easily retrievable manner, with easy uploading/downloading of material
> into the archive.
>  
> Impression of DSpace:
> My early, high expectations of DSpace have been tempered somewhat as I
> have started looking at the interface in more detail.  My impression so
> far is that DSpace seems designed primarily for occasional storage of
> literature-type material, within a stable organisational framework,
> whereas I am looking for frequent storage of widely varying material
> within a shifting organisational framework, accompanied by ongoing
> staff turnover.
>  
> I would really like some input from the existing community - especially
> those that may have similar experience in this kind of environment -
> on whether or not DSpace is the tool to use.  In particular, some of the
> worrying limits I have seen so far are ...

[snip]

I have been using DSpace for over a year now -- 1.3.x and 1.4.x -- on
Solaris 10 with Sun's Java System Web Server (6.1 and 7.0). I use
DSpace for Indica et Buddhica - Repositorium: a digital archive
designed to capture, store, index, preserve, and distribute materials
pertinent to Indology and South Asian Buddhology. While the aim is to
build an archive that enables Indologists and Buddhologists to
catalogue and store a variety of materials -- articles, books, images,
theses, software, working papers and so on -- the main concern at
present is to lay the foundation by filling the archive with relevant
bibliographical records. This is underway and almost 25,000 records are
available already; the same number again should be loaded within the
next few weeks. More details here:

http://indica-et-buddhica.org/sections/repositorium-preview 

You are at the critical stage of selecting and assessing an archival
platform so I will try to address your concerns candidly.

I am currently using DSpace only because -- for me -- there is presently
no suitable alternative. While I was impressed by the proven scalability
of Fedora, the lack of a decent Java web app. admin. and user interface
ruled it out. (I prefer to avoid PHP apps if possible, and last time I
tried Fez it consistently crashed Sun's Web Server -- completely
unacceptable on a test server, let alone in production.) Another suite
capable of scaling was CDS Invenio (a.k.a. CDSWare). Unfortunately it
is rather complex to compile, configure and maintain on Solaris so is
not currently an option. All that is really left is DSpace, with its
well known performance and scalability issues.

Although these shortcomings have been raised many times on the mailing
lists I have seen no evidence that they are being addressed with anything
but lip service. The discouraging findings of this technical
evaluation, I believe, still hold:

a.) Technical Evaluation of Research Repositories (Richard Wyles -
2006-09-14 16:49)
https://eduforge.org/docman/?group_id=131


From my own perspective, then, I see DSpace as nothing but a temporary
solution until a good Java web app. is developed for Fedora. Another
alternative, perhaps more likely in the short term, is Nuxeo's
soon-to-be-released Java app. Nuxeo 5. I am already using their Zope based
CPS 4 for the front end of my site and am very happy with it. Nuxeo claims
that CPS 4 has been tested and approved with more than 3TB of live data
(3 million documents). It is intended that version 5 will effortlessly
scale to over 5TB. This will need to be assessed, but early indications
are convincing. Below are a few references. You may like to note that
their current development is being driven by the needs of clients
perhaps not so very different from your own:


i.) Nuxeo Home Page:

http://www.nuxeo.com/ 

ii.) CPS Project Page:

http://www.cps-project.org/ 

iii.) About the Zope to Java technology switch (CPS 4 to Nuxeo 5):

http://www.nuxeo.com/en/java-switch/ 

iv.) Nuxeo 5 Project Page:

http://www.nuxeo.com/en/products/ 

http://www.nuxeo.org/static/snapshots/ (Download Daily Snapshots)

v.) Nuxeo 5 Roadmap

http://www.nuxeo.org/sections/about/roadmap/ 


vi.) Nuxeo Clients:

http://www.nuxeo.com/en/customers/ 

vii.) Mailing Lists (Nuxeo 5):

http://lists.nuxeo.com/mailman/listinfo/ecm 



Best regards,

Richard Mahoney


-- 
Richard MAHONEY | internet: http://indica-et-buddhica.org/ 
Littledene      | telephone/telefax (man.): +64 3 312 1699
Bay Road        | cellular: +64 27 482 9986
OXFORD, NZ      | email: r.mahoney at indica-et-buddhica.org 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Indica et Buddhica: Materials for Indology and Buddhology
Repositorium: http://indica-et-buddhica.org/repositorium/ 
Philologica: http://indica-et-buddhica.org/philologica/ 
Subscriptions: http://subscriptions.indica-et-buddhica.org/ 


