[Dspace-general] Indexing - automatic mapping of plurals and alternate endings?

Vanessa Barrett vanessa.barrett at adelaide.edu.au
Fri Sep 18 00:29:48 EDT 2009


Can anyone confirm my understanding of how DSpace performs keyword
indexing/searching? I suspect that it is doing automatic mapping of singular
and plural forms of words.

I came to this understanding as follows.

I was searching for an item authored by Alys (alternate spelling of Alice)
Clark.

I retrieved three items, none of which had the word alys in the metadata or
bitstream.  When I searched for alys on its own I got 168 hits, and a cursory
glance at the results list showed that they all had an author with ali as
some part of their name.

I then tried searching for each of the following forms:

aly, ali, alis, alys, alies

Each of these, as a single search term, retrieved exactly the same number of
records: 168.  The results included items with the following strings in the
Abstract:

- ALIS (Advanced Landmine Imaging System), which is a novel landmine
detection sensor system

- Current ventilatory practices for the management of ALI favor low tidal
volumes

- Current Trends in Periodontal Diagnosis & Disease Recognition in Malaysia
/ T.B. Taiyeb Ali

- Radiology in the acute abdomen / P.G. Devitt, A. Aly, M. Thomas

My conclusion is that DSpace is mapping plural forms of words to singular
forms, including allowing for alternate endings.  If so, it is very clever,
but just a little annoying, as Alys is not the plural of Ali.
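
One possible explanation for the identical hit counts is suffix-stripping
stemming of the kind used by the Porter algorithm, rather than true
plural-to-singular mapping. The sketch below is only an illustration under
that assumption: it is not DSpace's code, the function name `toy_stem` is
hypothetical, and it implements just two simplified Porter-style rules (strip
a plural-looking ending, then turn a final y into i).

```python
def toy_stem(word: str) -> str:
    """Apply two simplified Porter-style rules to a single word."""
    w = word.lower()
    # Rule 1 (plural-style endings): sses -> ss, ies -> i, ss stays, s is dropped.
    if w.endswith("sses"):
        w = w[:-2]
    elif w.endswith("ies"):
        w = w[:-2]
    elif w.endswith("ss"):
        pass
    elif w.endswith("s"):
        w = w[:-1]
    # Rule 2: a final y becomes i when the rest of the word contains a vowel.
    # (Simplification: only a, e, i, o, u are treated as vowels here.)
    if w.endswith("y") and any(ch in "aeiou" for ch in w[:-1]):
        w = w[:-1] + "i"
    return w


# The five search terms tried above.
forms = ["aly", "ali", "alis", "alys", "alies"]
stems = {form: toy_stem(form) for form in forms}
print(stems)
```

Under these two rules all five forms reduce to the same stem, ali, so an index
built on stems could not tell them apart. The same rules leave fiber and fibre
as two different stems, since neither has an ending the rules touch.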

Also, if it is clever enough to do this, why can't it map fiber to fibre and
color to colour?  That would have much greater benefits when searching a
database that includes North American and European data.
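
Mapping spelling variants is a different kind of rule from stripping endings:
it needs a lookup table rather than a suffix pattern. A minimal sketch of the
idea follows, assuming the same normalisation is applied both when indexing
and when searching. The `VARIANTS` table, the function names and the sample
documents are all hypothetical, and none of this reflects how DSpace is
actually configured.

```python
# Hypothetical variant table; a real one would be far larger and curated.
VARIANTS = {"fibre": "fiber", "colour": "color"}


def normalise(term: str) -> str:
    """Lower-case a term and map a known variant spelling to one chosen form."""
    t = term.lower()
    return VARIANTS.get(t, t)


def build_index(docs: dict) -> dict:
    """Build a tiny inverted index: normalised token -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            index.setdefault(normalise(token), set()).add(doc_id)
    return index


def search(index: dict, term: str) -> list:
    """Look up a term after applying the same normalisation used for indexing."""
    return sorted(index.get(normalise(term), set()))


# Made-up sample records mixing both spellings.
docs = {
    1: "optical fibre sensors",
    2: "carbon fiber composites",
    3: "colour vision",
}
index = build_index(docs)
print(search(index, "fibre"))   # finds documents 1 and 2
print(search(index, "color"))   # finds document 3
```

Because both spellings are stored under one key, a search for either form
retrieves records that use the other.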

Cheers, 

Vanessa Barrett
Digital Services Librarian
The University of Adelaide, AUSTRALIA 5005
Ph    : +61 8 8303 4625
e-mail: vanessa.barrett at adelaide.edu.au

CRICOS Provider Number 00123M
