[Dspace-general] Consistency - slight authority control?
Suzanne Pilsk
PilskS at si.edu
Mon Mar 27 11:36:01 EST 2006
Thank you for pointing me to the RePEc project. My take is that it follows
the option MacKenzie described: requiring every author to be a registered
name.
When I poked around a bit, I saw some authors that have pretty full
information and others that have only the last name and first initial. All
of which is fine as long as the names are consistent in the database - would
be my guess.
How does your name registry link to the actual data submission in DSpace?
Is each author actually given an id code that is the link to the pdf or
whatever it is that is submitted?
Suzanne
Suzanne C. Pilsk
PilskS at si.edu
Direct number:
202-633-1646
Cataloging Services
Smithsonian Institution Libraries
PO Box 37012
Natural History Building, Room 30- MRC 0154
Washington, DC 20013-7012
Unit number:
202-633-1668
>>> "David Goodman" <David.Goodman at liu.edu> 03/26 10:10 PM >>>
There's even an example of doing author names right,
albeit in a single subject field --
Thomas Krichel's RePEc: Research Papers in Economics <http://repec.org/>
Like all reliable methods for coping with the vagaries of
human authors, it requires both proper systems
design, and continuing human input.
Dr. David Goodman
Associate Professor
Palmer School of Library and Information Science
Long Island University
dgoodman at liu.edu
-----Original Message-----
From: dspace-general-bounces at mit.edu on behalf of MacKenzie Smith
Sent: Sun 3/26/2006 5:28 PM
To: Suzanne Pilsk; dspace-general at mit.edu
Subject: Re: [Dspace-general] Consistency - slight authority control?
Hi Suzanne,
I hope the lack of response to your question is due to people's busyness
and not just lack of interest. Variations of this question have come up so
often that I think you're in good, if quiet, company.
>I am interested in finding out how DSpace sites are working on consistency
>of entry for things like names and keywords/subjects. Without a full blown
>authority control module, we are trying to figure out a way to have
>depositors record names consistently. Has anyone figured out a way or
>workflow that would help?
As you said, there's a bit of support for controlled vocabularies in DSpace
now (http://dspace.org/technology/system-docs/submission.html), which is OK
when the number of terms is reasonably small and fits in a drop-down box on
the Web submission screen. That feature is normally used for subjects and
other fields with small vocabularies, but it doesn't scale to large
vocabularies like LCSH or AAT.
>Currently the standard input form accepts the last name in one box and
>first name and any other parts in another box. This will make sure we have
>the forms of the name in the right order. Our next goal would be to avoid
>this:
>Doe, J.A.
>Doe, Jane A.
>Doe, J Ann
>Doe, Jane Ann
>
>- especially when Jane Ann Doe is the person submitting the material to
>DSpace!
I know this isn't your particular problem, but I'd like to point out that
there may be two competing needs, which are ably demonstrated in Google
Scholar since it combines metadata from multiple scholarly sources, like
journals, that use different conventions for personal names in their
publications.
-- consistency of name representation is good to help cluster or co-locate
a lot of items by the same person, but
-- using a form of the name that is different from the one used in the
published work can make it harder for a user to find an item, if that's the
only form of the name they know.
In practice, the form of the author's name that is supplied to DSpace is
usually the one that was used in the publication, in the convention of the
particular publisher. That's very convenient for people searching for a
copy of that article from a citation they've found somewhere. It's less
good for finding everything by that author in the repository.
In this situation what we *really* need is the ability to have multiple
representations of the author's name, including a standardized one for
clustering and all the variants that have appeared in publications... which
is pretty complicated to implement of course.
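To make the "multiple representations" idea a little more concrete, here is
a minimal sketch in Python. It is not how DSpace stores authors; the
AuthorRecord class, build_index and find_authors are hypothetical names
invented for illustration. Each person has one standardized form for
clustering plus the set of variants that appeared in publications, and a
lookup by any of those forms leads back to the same record.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorRecord:
    """One person: a standardized name plus published variants (hypothetical)."""
    standard_name: str
    variants: set[str] = field(default_factory=set)

    def all_forms(self) -> set[str]:
        # The standardized form is searchable too, not only the variants.
        return {self.standard_name} | self.variants


def build_index(records):
    """Map each case-folded name form to the records that use it."""
    index = {}
    for record in records:
        for form in record.all_forms():
            index.setdefault(form.casefold(), []).append(record)
    return index


def find_authors(index, name):
    """Return every record that carries this form of the name."""
    return index.get(name.strip().casefold(), [])


jane = AuthorRecord(
    "Doe, Jane Ann",
    {"Doe, J.A.", "Doe, Jane A.", "Doe, J Ann"},
)
index = build_index([jane])

# A user who only knows the form from a citation still reaches the record,
# and the record's standard_name can be used to cluster search results.
for hit in find_authors(index, "Doe, J.A."):
    print(hit.standard_name)
```

The index maps a form to a list because two different people can share a
variant such as "Doe, J.A."; telling them apart is a separate problem.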
That said, there are many situations where DSpace isn't dealing with formal
publications and it's more desirable to standardize the form of the name so
search results appear together. The two approaches that have been discussed
before are
-- change the submission workflow code to check for the author's name in
DSpace's e-person database table, and force the submitter to select a
registered name. That works as long as all the potential authors are
pre-registered and can be differentiated (i.e. there might still be
multiple John Smiths so the e-person records for each of them have to
contain some value that clearly differentiates them, like a middle name,
birth date, etc.)
-- change the submission workflow code to check for the name in a national
authority file, e.g. using OCLC's name authority Web Service. We tested
that at MIT and it worked great as long as the author was in a national
authority file... very often not the case, e.g. a student who has only
published a thesis. A combination approach of checking the national
authority file followed by a local, institutional authority file (say in
LDAP) could be devised and would cover most cases.
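The combination approach can be sketched roughly as below, in Python. This
is only an illustration of the fallback order: the two dictionaries stand
in for a national authority file and a local institutional file, and
nothing here calls OCLC's service or LDAP. The function name
lookup_with_fallback and the sample entries are assumptions.

```python
def lookup_with_fallback(name, sources):
    """Try each (label, search) pair in order; return the first that matches.

    `search` is any callable taking a name and returning a list of
    authorized forms. Returns (label, matches), or (None, []) if no
    source knows the name.
    """
    for label, search in sources:
        matches = search(name)
        if matches:
            return label, matches
    return None, []


# Stand-in data only; a real check would query the actual services.
NATIONAL_FILE = {"Smith, John": ["Smith, John, 1950-"]}
LOCAL_FILE = {"Doe, Jane Ann": ["Doe, Jane Ann (graduate student)"]}

sources = [
    ("national", lambda name: NATIONAL_FILE.get(name, [])),
    ("local", lambda name: LOCAL_FILE.get(name, [])),
]

print(lookup_with_fallback("Smith, John", sources))
print(lookup_with_fallback("Doe, Jane Ann", sources))
print(lookup_with_fallback("Nobody, Known", sources))
```

Keeping the sources as an ordered list means the national file is always
consulted first and the local file only fills the gaps.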
The first approach would be much, much easier of course, so that's probably
the place to start if you can afford to register all your institution's
authors in DSpace.
The programming changes to check for the author would be pretty minor
except for thinking through how to handle cases where there's no match,
more than one matches, etc.
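As a rough illustration of those cases, here is a small Python sketch of
the registered-name check. It does not use DSpace's actual e-person table
or API; the EPerson class, its birth_year field and resolve_author are
hypothetical, and the registry is just a list in memory. The point is only
that the submission step has three outcomes to handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EPerson:
    """Simplified stand-in for a registered person (hypothetical fields)."""
    eperson_id: int
    last_name: str
    first_name: str
    birth_year: int  # assumed differentiator for people sharing a name


def resolve_author(last_name, first_name, registry):
    """Return (status, candidates) for a submitted author name.

    status is "no_match", "matched" or "ambiguous". An ambiguous result
    means the submitter has to choose among the candidates.
    """
    key = (last_name.strip().casefold(), first_name.strip().casefold())
    candidates = [
        person for person in registry
        if (person.last_name.casefold(), person.first_name.casefold()) == key
    ]
    if not candidates:
        return "no_match", []
    if len(candidates) == 1:
        return "matched", candidates
    return "ambiguous", candidates


registry = [
    EPerson(1, "Doe", "Jane Ann", 1970),
    EPerson(2, "Smith", "John", 1950),
    EPerson(3, "Smith", "John", 1982),
]

print(resolve_author("Doe", "Jane Ann", registry)[0])
print(resolve_author("Smith", "John", registry)[0])
print(resolve_author("Roe", "Richard", registry)[0])
```

What to do on "no_match" (reject, or let the submitter register a new
name) is a policy decision the sketch leaves open.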
MacKenzie
MacKenzie Smith
Associate Director for Technology
MIT Libraries
Building E25-131d
77 Massachusetts Avenue
Cambridge, MA 02139
(617)253-8184
kenzie at mit.edu
_______________________________________________
Dspace-general mailing list
Dspace-general at mit.edu
http://mailman.mit.edu/mailman/listinfo/dspace-general