[Dspace-general] Week 3: Good Repository Software

Claudia Jürgen Claudia.Juergen at ub.uni-dortmund.de
Wed Sep 3 06:39:52 EDT 2008


Hi all,

just some general thoughts, as I can't participate in the weekly chat 
meeting.

A good repository should have a defined scope and concentrate on its 
core tasks,
i.e. to "capture, index, preserve and distribute" the material of its 
institution, and perform these well.
Furthermore, it should integrate with its "natural habitat". In my 
opinion it might be dangerous to try to convert it into an all-in-one 
device (a repo do-it-all) suitable for every purpose.

Taking a university as a use case (the most common one for DSpace), there 
are many business processes (e-learning, research evaluation, 
collaborative work, information management ...) for which appropriate 
tools exist and form part of the institution's working environment. A 
repository should be able to interact with these tools and thus form a 
part of the institution's infrastructure.


Sunny greetings

Claudia Jürgen

Dorothea Salo wrote:
> BLUNTNESS ALERT. This is not a happy email. I have tried to make it a
> reasonably diplomatic one, but I may well have failed, in which case I
> apologize in advance.
> 
> I'm going to tip my hand here. I am speaking from the perspective of a
> repository manager, not a software architect or developer. From that
> perspective, I don't think DSpace is good repository software. From
> where I'm sitting -- on top of a repository that has not met its
> library system's hopes and expectations *and is currently being called
> to account for that* -- it's pretty terrible repository software.
> 
> Let's start from two axioms:
> 
> Axiom 1. The most elegant, stable, reliable, preservation-ready,
> standards-compliant, easy-to-install, easy-to-maintain repository
> software package in the WORLD is completely useless -- worse,
> indefensible -- if it does not attract deposits.
> 
> Do not tell me "that's a political problem" or "that's a
> marketing/outreach problem." In part it is, but the politics at my
> institutions are not lining up behind it, and marketing and outreach
> are useless without a compelling value proposition. It certainly
> doesn't help matters that the repository I run has been an albatross
> thus far, in considerable part due to DSpace's limitations (usage
> reports, anyone?). My local credibility and my ability to argue
> convincingly for resources and support for the repository have been
> heavily tarnished by that. The conclusion is simple: DSpace software
> must present a compelling value proposition *all by itself* -- without
> hacks, without outside software, without local developer support,
> without guarantee of deposit support -- if the repository I run is to
> survive at all. I do not believe that this repository is alone in that
> respect; far from it. I believe that most people in my shoes in the
> United States are living professional lives of (too-)quiet
> desperation.
> 
> Axiom 2. The abovementioned ideal repository software package will not
> attract deposits if it does not solve a problem that depositors (not
> libraries, not IT, not administrators, DEPOSITORS) perceive that they
> have.
> 
> Ergo my answer to this week's question is: Good repository software
> solves depositors' data- and publication-management problems. Let me
> point out, because I've seen this consistently missed, that if I don't
> or can't solve data- and publication-management problems as they come
> to me, the people with those problems won't come to me with their
> other problems *even if I can actually solve them*, nor will they
> recommend the repository to their colleagues. This has happened to me
> over and over and *over* again in the three years I've been running
> DSpace repositories. I am Sisyphus, and that infernal stone keeps
> rolling down the mountain.
> 
> This answer, of course, begs a question: what *are* faculty's data-
> and publication-management problems? Here are some problems that
> faculty at the institutions I serve have admitted to:
> 
> - collaborating on unfinished work across institutional boundaries,
> securely and easily
> 
> - storing and maintaining substantial amounts of data (in highly
> heterogeneous forms) and writing, both while projects are underway and
> afterwards
> 
> - safely storing data that cannot be shown or even hinted at (for a
> wide variety of reasons) to people outside a certain group (often the
> campus or the university system, but sometimes an ad-hoc group)
> 
> - loading their data from their software and their servers into a safe
> storage place, with as little manual intervention as possible,
> preferably none
> 
> - (in some disciplines) coming up with a sustainable data-management
> plan to satisfy grant requirements
> 
> - dealing with electronic works that they want to save, often works by
> third parties such as students; ETDs, of course, but also honors
> projects, graduate/undergraduate research journals, and local
> publications such as newsletters and working-papers series
> 
> - managing their publication record, irrespective of whether they are
> permitted to self-archive some or all of it; use cases include annual
> reviews, tenure-and-promotion packages, and online presence
> 
> - (in some disciplines) coping with funder-mandated requirements for
> open access to published work arising from a grant
> 
> - dealing with electronic materials requiring preservation arising
> from faculty retirements
> 
> None of this should be surprising; the data-curation literature is
> full of these and similar problems. I am leaving digitization support
> out of the picture, not because it isn't important (it is!), but
> because it's a problem DSpace can't feasibly solve -- it's *genuinely*
> a political problem. Still and all, it's worlds harder to solve this
> political problem when DSpace's limitations leave me with a
> credibility deficit to overcome.
> 
> The problem that DSpace was designed for -- self-archiving of
> peer-reviewed journal articles in an institution-based repository for
> purposes of open access -- does not appear on the above list. Bluntly,
> this is because faculty do not perceive self-archiving as a problem
> they have or wish to solve. At the moment, there are two institutions
> in the United States that are entitled to say that some of their
> faculty think otherwise: Harvard and Stanford. DSpace presumably
> wishes to appeal to more than two institutions!
> 
> DSpace can go on being an elegant solution to a nonexistent problem,
> in which case I believe it is doomed, or it can solve problems that
> potential depositors have. Those are its only two choices from where
> I'm sitting. Continuing to proclaim "problems that people actually
> have are out of scope!" is not a viable option. The agreed-upon scope
> has heretofore been hopelessly misdefined. This is not DSpace
> developers' fault, I hasten to say; developers didn't come up with the
> open-access "build it and they will come" ideology which has foundered
> on the rock of faculty apathy.
> 
> This has led to the vast majority of DSpace repositories in the United
> States becoming white elephants. (Wikipedia definition of a white
> elephant: "a valuable possession which its owner cannot dispose of and
> whose cost [particularly cost of upkeep] exceeds its usefulness."
> Right on, Wikipedia. Right on.) I'm sorry if that's unwelcome news.
> It's my daily reality. My career and the repository's continuance are
> riding on me being able to turn that around, and frankly, the odds are
> not presently in my favor -- and I own a lot of that, I willingly
> grant, but DSpace owns some of it too.
> 
> It's worth noting that solving some of the above problems would create
> fertile ground for acquiring appropriate versions of the eventual
> published literature based on the research projects served. Even those
> who are unilaterally committed to open access should support an
> expansion of DSpace's problem-space, because open access gained as a
> byproduct of other solved problems *is still open access*.
> 
> In short, I believe DSpace could do far worse than take on "solving
> depositors' data- and publication-management problems" as its new
> scope, since remaining committed to the old one will mire DSpace in
> irrelevance.
> 
> So. There are a few relatively easy changes that would help me a great
> deal in answering some of the above challenges:
> 
> - True dark archiving: fix the OAI-PMH hole, please! Some collection
> owners do not *want* or *cannot legally leave* collection metadata
> hanging in the breeze. They need to have the option of hiding it
> *completely*, or they walk away from the repository.
> - Embargoes.
> - Bitstream-less items.
> 
> Somewhat more difficult fixes that would have great impact:
> 
> - File versioning.
> 
> - User- and depositor-facing usage reports, as discussed last week.
> 
> - Elimination of per-item licensing, replaced with a single Terms of
> Service click-through. (I can elaborate on this if desired.)
> 
> - Streamlining and simplification of the deposit process, including
> accepting incomplete deposits (even just a file!) for later
> inspection/revision/management by a third party.
> 
> - Better display options for a broader variety of content. I need a
> page-turner, an image browser, and a journal browser that behaves like
> a journal browser -- and that's just for the content I *currently
> have*, not the content I foresee wanting to cope with in future.
> 
> - Easier machine-to-machine deposit. SWORD is good, but frankly, it's
> too hard or out-of-reach for most of the data sources I can imagine. I
> need DSpace to deal with crappy RSS feeds, because crappy RSS feeds
> are what by-author searches of literature databases can produce, and
> local IT folks can usually hack together crappy RSS feeds. I also need
> DSpace to cope with watch folders, because "put it here on the server
> and I'll deal with it" is a value proposition I can sell. So is
> "DSpace will watch your page and ingest any new issues of your
> publication automagically." So is "DSpace can serve as the
> preservation datastore for your
> OJS/OCS/Omeka/ContentDM/Greenstone/Kete/whatever installation."
> 
> - Better hooks for transcluding metadata in other contexts. I want
> one- or no-click publication histories by author, in HTML and RTF at a
> minimum. I want prettily-formatted, logically-organized lists of
> publications in a given collection via a single line of JavaScript. I
> want Researcher Pages. (One of the campuses I serve is seriously
> threatening to defect to BePress because of the Selected Works
> feature. This is what I mean when I say that value propositions for
> depositors are *not optional*, *not frills*, in DSpace.) I want COinS
> and RefWorks export.
> 
> I do believe that technological integration with Fedora Commons will
> go a considerable distance toward escaping the shackles bolted on by
> DSpace's too-narrow conception of its mission, and I wholeheartedly
> endorse motion in that direction.
> 
> Whew. Sorry about this; it's a bit of a broadside. All I can say in my
> own defense is that I wouldn't bother if I didn't care as deeply as I
> do. I whinge because I love!
> 
> Repository managers: If any of this rings a bell with you, I need you
> to stand up and say so publicly. "The lurkers support me in email"
> (see <http://www.collectableboard.com/forums/books/44988-hoppys-poisoned-sanctimony.html>)
> is no more going to get these problems solved in future than it has in
> the past.
> 
> Dorothea
