[Dspace-general] Google search results bypass metadata records
MacKenzie Smith
kenzie at MIT.EDU
Fri Jun 22 17:12:43 EDT 2007
Hi Christophe,
> Would you support the idea that if the user "messes up" the end of the
> bitstream URL, he/she is redirected to the metadata display?
I'm not sure what you mean... if the user gets a valid bitstream URL
from Google there's not much we can do to intervene. The user can
already examine the bitstream URL to figure out the item record Handle,
but it's hard to educate random users how to do that. Do you have a
particular scenario in mind?
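
To illustrate what "examining the bitstream URL" amounts to, here is a minimal sketch. It assumes bitstream URLs look like http://host/dspace/bitstream/<prefix>/<suffix>/<sequence>/<filename> and item records live at http://host/dspace/handle/<prefix>/<suffix>. That layout, the function name, and the example Handle are assumptions for illustration, not something confirmed in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def item_url_from_bitstream_url(bitstream_url):
    """Guess the item record URL from a bitstream URL.

    Assumes the path contains .../bitstream/<prefix>/<suffix>/...;
    returns None when the URL does not follow that (assumed) layout.
    """
    parts = urlsplit(bitstream_url)
    segments = parts.path.split("/")
    if "bitstream" not in segments:
        return None
    i = segments.index("bitstream")
    # The two segments after "bitstream" are taken to be the Handle.
    if len(segments) < i + 3:
        return None
    prefix, suffix = segments[i + 1], segments[i + 2]
    if not prefix or not suffix:
        return None
    # Keep everything before "bitstream" and swap in "handle".
    new_path = "/".join(segments[:i] + ["handle", prefix, suffix])
    return urlunsplit((parts.scheme, parts.netloc, new_path, "", ""))


# Hypothetical example URL:
print(item_url_from_bitstream_url(
    "http://host/dspace/bitstream/1721.1/12345/1/thesis.pdf"))
```

This is roughly the manual step a user would have to perform by hand, which is why it is hard to expect from someone arriving straight from a search result.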
> My experience with sitemaps (in an application other than DSpace) has
> been very positive.
> I do not remember whether something like http://host/dspace/sitemap
> returns a sitemap to Google. Does it deal with the limit of
> 50 (or so) thousand URLs / 10 megabytes per sitemap file?
Sitemaps will help with reducing the strain on DSpace sites when Google
harvests them. Rob Tansley submitted a patch to support Google sitemaps
a while ago that should be in the next release. The size of the DSpace
repository should not be a problem at all... but I'm not sure how this
would help with the general problem of navigating users from bitstreams
back to item records...
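
For what it's worth, the per-file limit is usually handled by splitting the URL list across several sitemap files and listing those files in a sitemap index. The sketch below is not Rob's patch, just a generic illustration. The function names and example URLs are made up, the 50,000 figure is the one from the question, and the 10 megabyte limit is not checked here.

```python
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 50000  # per-file URL limit mentioned in the question
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def chunk_urls(urls, size=MAX_URLS_PER_SITEMAP):
    """Yield successive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return the XML text of one sitemap file listing `urls`."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<urlset xmlns="%s">' % SITEMAP_NS]
    lines += ["  <url><loc>%s</loc></url>" % escape(u) for u in urls]
    lines.append("</urlset>")
    return "\n".join(lines)


def build_sitemap_index(sitemap_urls):
    """Return the XML text of an index pointing at each sitemap file."""
    lines = ['<?xml version="1.0" encoding="UTF-8"?>',
             '<sitemapindex xmlns="%s">' % SITEMAP_NS]
    lines += ["  <sitemap><loc>%s</loc></sitemap>" % escape(u)
              for u in sitemap_urls]
    lines.append("</sitemapindex>")
    return "\n".join(lines)


# Hypothetical item record URLs, one per item in the repository.
item_urls = ["http://host/dspace/handle/1721.1/%d" % n for n in range(120001)]
sitemaps = [build_sitemap(chunk) for chunk in chunk_urls(item_urls)]
index = build_sitemap_index(
    "http://host/dspace/sitemap?map=%d" % n for n in range(len(sitemaps)))
print(len(sitemaps))
```

Splitting this way is why repository size should not matter: a larger repository just produces more sitemap files behind the same index.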
The only ways I can think of to do that are:
-- alter the bitstream to contain a back link (almost certainly
unacceptable for a preservation archive)
-- prevent Google from harvesting the bitstreams at all (e.g. via the
sitemap) which isn't going to make users very happy... most of the hits
from Google were on keywords from the full-text content files.
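
To make the second option concrete: one common way to keep a crawler away from a URL prefix is a robots.txt rule, which is a different mechanism from the sitemap mentioned above. The sketch below uses Python's standard urllib.robotparser to show which URLs such a rule would block. The /dspace/bitstream/ path and the example URLs are assumptions about the site layout, and this is not a recommendation given the trade-off just described.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the path prefix is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dspace/bitstream/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

bitstream_url = "http://host/dspace/bitstream/1721.1/12345/1/thesis.pdf"
item_url = "http://host/dspace/handle/1721.1/12345"

# Content files are blocked for a compliant crawler; item records are not.
bitstream_allowed = parser.can_fetch("Googlebot", bitstream_url)
item_allowed = parser.can_fetch("Googlebot", item_url)
print(bitstream_allowed, item_allowed)
```

With a rule like this the full text would no longer be indexed, so keyword searches on the content would stop leading to the repository.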
But if you have a different problem in mind, or some idea you want to
try, I'm all ears!
MacKenzie