[Dspace-general] Google search results bypass metadata records

Jodi Schneider jodi.a.schneider at gmail.com
Tue Jun 19 18:43:42 EDT 2007


Julie,

Do you know if users were doing *Google* searches (as opposed to Google
Scholar searches)?
I didn't turn up any bitstream results in Google Scholar, but there are
plenty in Google proper.

For instance, I get "about 908" results by searching Google for
site://scholarworks.iu.edu/dspace/bitstream/
For comparison, a search that retrieves both bitstreams and metadata records,
site://scholarworks.iu.edu/dspace/
yields "about 6190" results.

Here are two examples of a bitstream appearing with higher rank than its
metadata record. In these cases I couldn't find the metadata record in the
search results. YMMV.

-Jodi

1) The 14th result in a Google search for
"indiana geological survey" site://scholarworks.iu.edu/dspace/
is a bitstream:
https://scholarworks.iu.edu/dspace/bitstream/2022/432/1/Steinmetz+et+al+Data+Preservation+at+GSA+23Oct06.ppt
<http://www.google.com/search?hl=en&lr=&client=firefox-a&channel=s&rls=org.mozilla:en-US:official&as_qdr=all&q=related:https://scholarworks.iu.edu/dspace/bitstream/2022/432/1/Steinmetz%2Bet%2Bal%2BData%2BPreservation%2Bat%2BGSA%2B23Oct06.ppt>

I could not find the metadata record ( http://hdl.handle.net/2022/432 aka
https://scholarworks.iu.edu/dspace/handle/2022/432 ) in the (about 187)
search results.

Results in Google Scholar are, of course, different. (No site search is
possible there.)
"indiana geological survey" author:j-steinmetz
gives the metadata record as the third result.
----------------------------------------------------------------------
2) A Google search for
site://scholarworks.iu.edu/dspace/
gives a bitstream as the sixth result:
https://scholarworks.iu.edu/dspace/bitstream/2022/226/1/B42D.pdf

Metadata record is at http://hdl.handle.net/2022/226 AKA
https://scholarworks.iu.edu/dspace/handle/2022/226/
AKA https://scholarworks.iu.edu/dspace/handle/2022/226/1
I didn't turn up the metadata record in the first 1000 (of "about 6,190")
Google results.
I did not find it in Google Scholar by (for instance)
author:Carr,DD sand indiana
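
For anyone who wants to check a list of Google hits against their own
records, the bitstream-to-record mapping used in the two examples above can
be sketched in Python. This is only a rough sketch: the function name is
made up for illustration, and it assumes the URL layout seen in the examples
(.../bitstream/<prefix>/<item>/<sequence>/<filename>), which may not hold for
every DSpace installation.

```python
from urllib.parse import urlsplit


def metadata_urls(bitstream_url):
    """Guess the metadata record URLs for a DSpace bitstream URL.

    Hypothetical helper. Assumes the path looks like
    <base>/bitstream/<prefix>/<item>/<sequence>/<filename>.
    Returns (handle.net URL, local record URL), or None if the URL
    does not look like a bitstream URL.
    """
    parts = urlsplit(bitstream_url)
    segments = parts.path.split("/")
    if "bitstream" not in segments:
        return None
    i = segments.index("bitstream")
    # Need at least the handle prefix and item number after "bitstream".
    if len(segments) < i + 3 or not segments[i + 1] or not segments[i + 2]:
        return None
    prefix, item = segments[i + 1], segments[i + 2]
    base = "/".join(segments[:i])  # e.g. "/dspace"
    local = f"{parts.scheme}://{parts.netloc}{base}/handle/{prefix}/{item}"
    return f"http://hdl.handle.net/{prefix}/{item}", local


if __name__ == "__main__":
    url = "https://scholarworks.iu.edu/dspace/bitstream/2022/226/1/B42D.pdf"
    print(metadata_urls(url))
```

Run against the bitstream in example 2, this should give the same two record
URLs listed above (the hdl.handle.net form and the local /dspace/handle/
form), which can then be looked for in the search results.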




On 6/19/07, MacKenzie Smith <kenzie at mit.edu> wrote:
>
> Hi Julie,
>
> This is very interesting... Google Scholar's indexing of item metadata
> vs bitstreams has evolved over the years. Early on they thought users
> were absolutely not interested in seeing item records -- they want to go
> directly to the bitstreams -- but I think that belief has changed and
> now they are definitely indexing both item metadata and bitstreams.
>
> In this case, I wasn't able to reproduce the problem in Google Scholar.
> I searched by terms that were only in the metadata and only in the PDF,
> and in all cases GS took me to the item record first... I couldn't find
> a search that took me directly to the bitstream (although I've seen that
> behavior in the past). Do you know how your user got to that result?
>
> Anyway, assuming there are still cases where GS takes the user directly
> to the bitstream, I think they'd be interested in your feedback. The
> particular bitstream you provided is a great example of why a user might
> need to go through an item record... the bitstream has absolutely no
> embedded metadata or other context to help the user figure out what
> they're looking at.
>
> Maybe Rob Tansley can also comment, or we can take this up again with
> Anurag at Google Scholar.
>
> MacKenzie
> >
> > We've had complaints from users who have found out-of-context DSpace
> > documents with Google searches, such as
> > https://scholarworks.iu.edu/dspace/bitstream/2022/1333/1/7(1)81-82.pdf
> >
> > There is, of course, a DSpace metadata record, but the Google search
> > retrieves the document itself with no link to the metadata record.
> > We're considering disallowing access to the bitstream path so that
> > users will not be confronted with raw documents with no metadata
> > context, but do not want to lose the full-text searching
> > functionality. Has anyone found a solution to this problem?
> >
> > Thanks in advance for help.
> >
> > Julie Bobay, Director for Scholarly Communication Initiatives
> > Indiana University Bloomington Libraries
> > Wells Library E1060
> > Bloomington, IN 47405
> > 812-855-7743
> >
>
>
> --
> MacKenzie Smith
> Associate Director for Technology
> MIT Libraries
>
> _______________________________________________
> Dspace-general mailing list
> Dspace-general at mit.edu
> http://mailman.mit.edu/mailman/listinfo/dspace-general
>