[Dspace-general] Week 3: Good Repository Software

Graham Triggs graham at biomedcentral.com
Fri Sep 5 12:34:55 EDT 2008


Dorothea Salo wrote:
> We also need to look at the repository-software market. EPrints does
> this already. I'm pretty sure ContentDM and BePress do as well. How
> long does DSpace remain the solution of choice if it doesn't?

That's a good question. You could also ask how long it would be the 
solution of choice if we simply chase other implementations?

By that I mean: could we be leapfrogging other 
solutions, rather than simply following what they've done?

> Sure, a lot of counts are going to be low. That doesn't bother me.
> Some won't be -- and just *one* case of "look! fifteen hundred
> downloads in a month!" will make faculty heads pop up. (That's not an
> unreasonable number; it's about what Roach Motel did.) Why would I
> fight that?

Dorothea, you can't simply go around uploading inflammatory material 
into your repositories to get the numbers up ;)

> They can't -- but see all the ongoing arguments about journal impact
> factors. Numbers are stupid, inaccurate, and deceptive. Granted. That
> doesn't stop people wanting numbers, and not just any numbers, not
> just whole-repository numbers, but THEIR numbers.

Numbers that we know are wrong. That probably can't even be compared 
across DSpace implementations, let alone across different platforms. 
Which don't take into account that the same material may be available in 
multiple places. And what's worse, numbers that can't be verified.

If they want these numbers in anything approaching a formal capacity, I 
think we both realise that we might as well just put a random number 
generator in there rather than bother trying to actually count anything!!

At least if they could get those numbers from something like a Google 
Analytics report, there would be some kind of independent validity to 
them. That's not entirely unfeasible, and there is also an upcoming 
competitor - Woopra - that will have an API, potentially allowing for 
integration of those numbers into the repository itself.

I'm not against statistics, but I'm interested to find out if we can / 
should be spending our time on delivering something with fewer flaws.

> No, we can't recreate Google Docs, but we might be able to do
> something closer to SVN-for-documents. I could sell that... and then
> I'd have access to preprints and postprints that I don't now.

Or you could just use SVN ;) Having all of this inside your 
'preservation' repository is rather sub-optimal - both for the purposes 
of the workspace, and for the long term sustainability of the repository.

Maybe a neater solution would be a workspace / collaboration type 
service that enables all of the gathering of data and people working 
together, with the end result a SWORD submission to the final repository.

(One advantage of that route is that such a service can be applicable to 
EPrints, Fedora, etc. - not just DSpace. Which means you've got a 
potentially wider pool of interested parties to make it happen)

Yes, it's a selling point of DSpace that it's an 'out-of-the-box' 
solution - but that doesn't mean it (or anyone's customised 
implementation of it) should incorporate every aspect of anything that 
touches on the content finding its way into the repository. The 
repository has a job to do, and there is nothing wrong - in fact, there 
is quite a lot of right - in having a suite of distinct, but integrated, 
services.

> We're getting there, I agree... but we're a long way from sensible
> defaults and sufficient flexibility still. Mediated deposit is a pain
> in the posterior (especially as regards licensing), and it needs badly
> not to be, because it's the standard case (not the edge!) in most
> repositories.

Ahh... the sensible defaults argument. Oh, I quite agree that we could 
come up with something that is more suitable for the 80% case. Right 
now, we're kind of limited in actioning such a change - we have a 
general rule to keep the default configurations the same as far as 
possible, which is really meant to aid with upgrading.

For example, although the submission code has changed dramatically to be 
more flexible in 1.5, the defaults provided mimic the 'out-of-the-box' 
experience of 1.4 as much as possible.

Now, can any one of the developers / committers unilaterally decide - 
actually, let's have the submission process look like this?

If there was a consensus as to what the sensible defaults should be, 
then there could be a commitment to delivering them in future release(s).

G

 
 