[Dspace-general] Consistency - slight authority control?

Suzanne Pilsk PilskS at si.edu
Mon Mar 27 11:36:01 EST 2006


Thank you for pointing me to the RePEc project. My take is that it follows
the option MacKenzie described: requiring every author to be a registered
name.

When I poked around a bit, I saw some authors with fairly full information
and others with only the last name and first initial. My guess is that this
is fine as long as the names are consistent in the database.

How does your name registry link to the actual data submission in DSpace?
Is each author actually given an id code that is the link to the pdf or
whatever it is that is submitted?

Suzanne



Suzanne C. Pilsk
PilskS at si.edu
Direct number: 
202-633-1646

Cataloging Services 
Smithsonian Institution Libraries
PO Box 37012
Natural History Building, Room 30- MRC 0154
Washington, DC 20013-7012
Unit number:
202-633-1668


>>> "David Goodman" <David.Goodman at liu.edu> 03/26 10:10 PM >>>
There's even an example of doing author names right,
albeit in a single subject field --
Thomas Krichel's RePEc: Research Papers in Economics  <http://repec.org/>
Like all reliable methods for coping with the vagaries of 
human authors, it requires both proper systems 
design, and continuing human input.
 
Dr. David Goodman
Associate Professor
Palmer School of Library and Information Science
Long Island University
dgoodman at liu.edu 



-----Original Message-----
From: dspace-general-bounces at mit.edu on behalf of MacKenzie Smith
Sent: Sun 3/26/2006 5:28 PM
To: Suzanne Pilsk; dspace-general at mit.edu 
Subject: Re: [Dspace-general] Consistency - slight authority control?
 
Hi Suzanne,

I hope the lack of response to your question is due to people's busy-ness 
and not just lack of interest...
variations of this question have come up so often that I think you're in 
good, if quiet, company.

>I am interested in finding out how DSpace sites are working on consistency
>of entry for things like names and keywords/subjects. Without a full blown
>authority control module, we are trying to figure out a way to have
>depositors record names consistently. Has anyone figured out a way or
>workflow that would help?

As you said, there's a bit of support for controlled vocabularies in DSpace
now  http://dspace.org/technology/system-docs/submission.html which is ok
when the number of terms is reasonably small and fits in a drop-down box on
the Web submission screen. That feature is normally used for subjects and
other fields with small vocabularies, but doesn't scale to large
vocabularies like LCSH or AAT.

>Currently the standard input form accepts the last name in one box and
>first name and any other parts in another box. This will make sure we have
>the forms of the name in the right order. Our next goal would be to avoid
>this:
>Doe, J.A.
>Doe, Jane A.
>Doe, J Ann
>Doe, Jane Ann
>
>- especially when Jane Ann Doe is the person submitting the material to
>DSpace!

I know this isn't your particular problem, but I'd like to point out that 
there may be two competing needs, which are ably demonstrated in Google 
Scholar since it combines metadata from multiple scholarly sources, like 
journals, that use different conventions for personal names in their 
publications.

-- consistency of name representation is good to help cluster or co-locate
a lot of items by the same person, but
-- using a form of the name that is different from the one used in the
published work can make it harder for a user to find an item, if that's the
only form of the name they know.

In practice, the form of the author's name that is supplied to DSpace is
usually the one that was used in the publication, in the convention of the
particular publisher. That's very convenient for people searching for a
copy of that article from a citation they've found somewhere. It's less
good for finding everything by that author in the repository.

In this situation what we *really* need is the ability to have multiple
representations of the author's name, including a standardized one for
clustering and all the variants that have appeared in publications... which
is pretty complicated to implement of course.
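
To make the idea concrete, here is a minimal sketch of such a record in
Python. It is only an illustration: the AuthorRecord class, its fields, and
build_index are hypothetical names, not part of DSpace.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorRecord:
    """Hypothetical author record: one standardized name plus published variants."""
    author_id: str
    preferred_name: str                       # standardized form, used for clustering
    variants: set[str] = field(default_factory=set)  # forms seen in publications

    def all_names(self) -> set[str]:
        return {self.preferred_name} | self.variants


def build_index(records):
    """Map every known name form (case-insensitive) to the ids of authors using it."""
    index = {}
    for rec in records:
        for name in rec.all_names():
            index.setdefault(name.casefold(), set()).add(rec.author_id)
    return index


doe = AuthorRecord(
    "a1",
    "Doe, Jane Ann",
    {"Doe, J.A.", "Doe, Jane A.", "Doe, J Ann"},
)
index = build_index([doe])
```

A search on any of the four forms then reaches the same author id, while
display and clustering can use preferred_name.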

That said, there are many situations where DSpace isn't dealing with formal
publications and it's more desirable to standardize the form of the name so
search results appear together. The two approaches that have been discussed
before are

-- change the submission workflow code to check for the author's name in 
DSpace's e-person database table, and force the submitter to select a 
registered name. That works as long as all the potential authors are 
pre-registered and can be differentiated (i.e. there might still be 
multiple John Smiths so the e-person records for each of them have to 
contain some value that clearly differentiates them, like a middle name, 
birth date, etc.)

-- change the submission workflow code to check for the name in a national
authority file, e.g. using OCLC's name authority Web Service. We tested
that at MIT and it worked great as long as the author was in a national
authority file... very often not the case, e.g. a student who has only
published a thesis. A combination approach of checking the national
authority file followed by a local, institutional authority file (say in
LDAP) could be devised and would cover most cases.
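
The combination approach could be sketched roughly as follows. This is a
sketch under assumptions: the two lookups are plain dictionaries standing in
for the national and local authority files, and none of the names below
correspond to a real OCLC, LDAP, or DSpace API.

```python
def resolve_name(name, national_lookup, local_lookup):
    """Try the national authority file first, then the local one.

    Both lookups are hypothetical callables that take a submitted name and
    return a list of candidate authorized forms (empty if nothing is found).
    """
    for source, lookup in (("national", national_lookup), ("local", local_lookup)):
        candidates = lookup(name)
        if candidates:
            return source, candidates
    return None, []


# Stand-in data; a real deployment would query a web service and a directory.
NATIONAL = {"doe, jane ann": ["Doe, Jane Ann"]}
LOCAL = {"roe, richard": ["Roe, Richard (graduate student)"]}


def national_lookup(name):
    return NATIONAL.get(name.casefold(), [])


def local_lookup(name):
    return LOCAL.get(name.casefold(), [])


found_national = resolve_name("Doe, Jane Ann", national_lookup, local_lookup)
found_local = resolve_name("Roe, Richard", national_lookup, local_lookup)
not_found = resolve_name("Poe, Pat", national_lookup, local_lookup)
```

Returning the source alongside the candidates lets the submission screen tell
the depositor which file the suggested form came from.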

The first approach would be much, much easier of course, so that's probably
the place to start if you can afford to register all your institution's
authors in DSpace.
The programming changes to check for the author would be pretty minor
except for thinking through how to handle cases where there's no match,
more than one matches, etc.
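
As a rough illustration of those cases, the check might look something like
this in Python. The registry here is just a list of (last name, first name)
pairs and the matching rule (same last name, same first initial) is an
assumption made for the example; it is not how DSpace's e-person table works.

```python
from enum import Enum


class Match(Enum):
    NONE = "none"            # nobody registered under this name
    UNIQUE = "unique"        # exactly one registered author fits
    AMBIGUOUS = "ambiguous"  # several fit; the submitter must choose


def check_author(last, first, registry):
    """Compare a submitted name against hypothetical registered names.

    Assumed rule: last names are equal ignoring case, and the first
    initials agree. Returns the match status and the list of hits.
    """
    wanted_last = last.strip().casefold()
    wanted_initial = first.strip()[:1].casefold()
    hits = [
        (reg_last, reg_first)
        for reg_last, reg_first in registry
        if reg_last.casefold() == wanted_last
        and reg_first[:1].casefold() == wanted_initial
    ]
    if not hits:
        return Match.NONE, hits
    if len(hits) == 1:
        return Match.UNIQUE, hits
    return Match.AMBIGUOUS, hits


registry = [("Doe", "Jane Ann"), ("Smith", "John A."), ("Smith", "John B.")]

status_doe, hits_doe = check_author("Doe", "J.A.", registry)
status_smith, hits_smith = check_author("Smith", "John", registry)
status_roe, hits_roe = check_author("Roe", "R.", registry)
```

In the ambiguous case the submission form would need to show the candidates
with whatever differentiating value the registry holds; in the no-match case
it has to either refuse the name or offer a way to register it.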

MacKenzie


MacKenzie Smith
Associate Director for Technology
MIT Libraries
Building E25-131d
77 Massachusetts Avenue
Cambridge, MA  02139
(617)253-8184
kenzie at mit.edu  
