[Dspace-general] Week 3: Good Repository Software
Graham Triggs
graham at biomedcentral.com
Thu Sep 4 11:33:14 EDT 2008
Dorothea Salo wrote:
> BLUNTNESS ALERT. This is not a happy email. I have tried to make it a
> reasonably diplomatic one, but I may well have failed,
At least I know it's going to be entertaining! Well, my warning shall be
that this isn't intended to be a rebuttal, more counterpoint. It's great
that all this has been said, but if you take [another] step back, does
it still look the same?
> Axiom 1. The most elegant, stable, reliable, preservation-ready,
> standards-compliant, easy-to-install, easy-to-maintain repository
> software package in the WORLD is completely useless -- worse,
> indefensible -- if it does not attract deposits.
>
> Do not tell me "that's a political problem" or "that's a
> marketing/outreach problem." In part it is, but the politics at my
> institutions are not lining up behind it, and marketing and outreach
> are useless without a compelling value proposition. It certainly
> doesn't help matters that the repository I run has been an albatross
> thus far, in considerable part due to DSpace's limitations (usage
> reports, anyone?).
I'm going to bite on this one, as I want to ask a serious question -
should usage reports play a part in encouraging deposits?
Would seeing low usage reports DIScourage people that would otherwise
choose to submit? If you are looking at counts of accesses and
downloads, how reliable can they ever be?
My personal opinion is that usage reports at an item/bitstream level are
something of an itch you can't scratch - if you try, it might ease an
initial sore point, only for it to recur in another guise later on.
Those kinds of usage reports are better suited to a high-level 'is this
repository wanted' question, and things like Google Analytics answer
that far better than anything we could ever build in.
Value at a finer level would (again, imho) be better accounted for in
measures like citation counts. Is that really something that can or
should be built into DSpace? (it could just as easily be an entirely
separate system that a DSpace installation could query to obtain the
relevant information for display).
As I say, the above is just a personal take, so people are free to
disagree. I just want to shake things up a little and see if people have
thought about the problem from different angles [than just saying
'statistics in DSpace'].
> This answer, of course, begs a question: what *are* faculty's data-
> and publication-management problems? Here are some problems that
> faculty at the institutions I serve have admitted to:
>
> - collaborating on unfinished work across institutional boundaries,
> securely and easily
An interesting use case, but not necessarily one that can be solved by
an institutional repository, or ever should be solved by something that
is set up for preservation. (I'm deliberately leaving that point for
further expansion later).
If you are wanting to collaboratively edit a document, for example,
would the better answer be to use Google Docs? *That* level of
collaborative ability is way outside the scope of what we could ever
hope to put into a repository. Rather, the question should be how we can
better support external collaboration tools - i.e. easy ingesting of a
Google Doc into a repository.
> - storing and maintaining substantial amounts of data (in highly
> heterogeneous forms) and writing, both while projects are underway and
> afterwards
I rather covered the editing side above, but there may be a lot of
[relatively] static data that is associated with a paper. It may not
need to be updated, but it does need to be stored somewhere - and if it
is eventually going to be in the repository, why not have it there from
the start rather than managing it until the point of submission? It's an
interesting point.
> - loading their data from their software and their servers into a safe
> storage place, with as little manual intervention as possible,
> preferably none
I see that as being quite related to the above - as much of a user's
output as possible should be captured [seamlessly] as part of their
ongoing work.
> None of this should be surprising; the data-curation literature is
> full of these and similar problems. I am leaving digitization support
> out of the picture, not because it isn't important (it is!), but
> because it's a problem DSpace can't feasibly solve -- it's *genuinely*
> a political problem. Still and all, it's worlds harder to solve this
> political problem when DSpace's limitations leave me with a
> credibility deficit to overcome.
>
> The problem that DSpace was designed for -- self-archiving of
> peer-reviewed journal articles in an institution-based repository for
> purposes of open access -- does not appear on the above list. Bluntly,
> this is because faculty do not perceive self-archiving as a problem
> they have or wish to solve. At the moment, there are two institutions
> in the United States that are entitled to say that some of their
> faculty think otherwise: Harvard and Stanford. DSpace presumably
> wishes to appeal to more than two institutions!
I have sympathy for your plight. But I think (not necessarily in you, I
hasten to add ;) that there may be some element of not accepting
problems as political ones, because frankly there are only so many of
these battles that we can take on (and win), and because we can point to
there possibly being a technical solution to a limited selection of
cases that have been encountered to date.
I'm not saying that to have an argument, but from the other side there
are only so many technical problems that can be taken on and solved in a
certain period of time. If one of these problems could be tackled
politically, would that mean our time would be better spent solving
other issues?
> - Bitstream-less items.
that's already been done ;) (all developers down the pub then, right?)
> Somewhat more difficult fixes that would have great impact:
>
> - File versioning.
I agree about it being difficult. There was GSoC code for this that will
make it into a future release.
Although I'll make my obligatory statement that the need for this may
be exaggerated - there are practices a repository can adopt for managing
its items that would alleviate a number of potential cases, and in
relation to the points above about collaboration - there are better ways
to address the problem *before* hitting the repository.
> - Elimination of per-item licensing, replaced with a single Terms of
> Service click-through. (I can elaborate on this if desired.)
I agree with the sentiment. Lots of issues to think about though in the
wider scope - i.e. how do you deal with updating the Terms of Service?
> - Streamlining and simplification of the deposit process, including
> accepting incomplete deposits (even just a file!) for later
> inspection/revision/management by a third party.
A lot of this could already be achieved solely through the configuration
files - and maybe this could be aided by one or two 'non-interactive'
submission steps being provided in the default DSpace, ready for
configuring.
> - Better display options for a broader variety of content. I need a
> page-turner, an image browser, and a journal browser that behaves like
> a journal browser -- and that's just for the content I *currently
> have*, not the content I foresee wanting to cope with in future.
Isn't this about 90% outside of the scope of DSpace? A page turner? -
that could just be a Flash object that you give the URL of a PDF to. If
there is something out there already that offers that, it's fairly
minimal effort for someone to customise their interface to use it - you
don't need to get 'inside' DSpace.
I can see why visualizations are useful, but that isn't a reason for
DSpace itself to do anything more than make it possible to easily
integrate third party objects. If someone finds or wants to provide such
objects that can be redistributed with DSpace, that's a bonus.
G