[Dspace-general] Wordle visualization of DSpace content

Robin Taylor robin.taylor at ed.ac.uk
Fri Jul 17 10:04:04 EDT 2009


If I understand correctly, no, it won't register Google searches. It's just a wee bit of code added to the search function to store the searches. There is spam protection in the sense that I filter out common words and some offensive stuff. It would be easy to restrict it to searches from specific IP addresses or ranges in order to filter out federated searches or spammers, but I haven't done so as yet... I probably shouldn't mention that on a public mailing list :)
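
A minimal sketch of the kind of filtering described above, in Python rather than the actual DSpace code, which is not shown in this thread. The word lists, the network range and the names record_search and is_allowed are all hypothetical, chosen only for illustration.

```python
import ipaddress
from collections import Counter

# Hypothetical word lists and network range, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "of", "in", "for", "to"}
BLOCKED_WORDS = {"badword"}
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]  # documentation range

# Running tally of stored search terms, the input a word cloud would need.
term_counts = Counter()


def is_allowed(ip):
    """Return True if the client address falls inside an allowed network."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in ALLOWED_NETWORKS)


def record_search(query, ip, restrict_by_ip=False):
    """Store the terms of one search, skipping common and blocked words.

    With restrict_by_ip=True, searches from outside ALLOWED_NETWORKS
    (for example federated search engines) are ignored entirely.
    """
    if restrict_by_ip and not is_allowed(ip):
        return []
    kept = [
        word
        for word in query.lower().split()
        if word not in STOP_WORDS and word not in BLOCKED_WORDS
    ]
    term_counts.update(kept)
    return kept
```

Calling record_search once per query and reading term_counts.most_common() would give the term frequencies to display. The IP check is off by default here, matching the note above that no such restriction is in place yet.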

Cheers, Robin. 

Robin Taylor
Main Library
University of Edinburgh
Tel. 0131 6513808  

> -----Original Message-----
> From: bluyten at gmail.com [mailto:bluyten at gmail.com] On Behalf 
> Of Bram Luyten
> Sent: 17 July 2009 14:53
> To: Robin Taylor
> Cc: dspace-general at mit.edu
> Subject: Re: [Dspace-general] Wordle visualization of DSpace content
> 
> Hi Robin,
> 
> That's very cool, and real-time as well: my search terms 
> appeared almost instantly.
> Does it also register keywords entered in Google that led to the 
> repository?
> And do you do any spam protection?
> 
> Innovative visualization can both increase the exposure of 
> the repository's contents and get people enthusiastic about it.
> 
> regards
> 
> Bram
> 
> @mire - http://www.atmire.com
> 
> Technologielaan 9 - 3001 Heverlee - Belgium
> 533 2nd Street - Encinitas, CA 92024 - USA
> 
> http://www.togather.eu - Before getting together, get Tog@ther
> 
> 
> 
> On Fri, Jul 17, 2009 at 3:40 PM, Robin Taylor 
> <robin.taylor at ed.ac.uk> wrote:
> 
> 
> 	Hi Bram,
> 	
> 	More fluff for the 'fun on Friday' category - I was 
> asked to generate a dynamic Wordcloud of search terms entered 
> into our IR to be flashed up on a big screen in our library. 
> If you are interested you can see it at 
> http://www.era.lib.ed.ac.uk/searchQuery (** please use 
> Mozilla as that's what it's designed for). As a piece of 'art' 
> it's rubbish in comparison with what Wordle can produce; the 
> only interesting thing to come out of the exercise for me was 
> the discovery that 99% of our searches come from federated 
> search engines rather than being entered directly via the UI.
> 	
> 	Cheers, Robin.
> 	
> 	
> 	Robin Taylor
> 	Main Library
> 	University of Edinburgh
> 	Tel. 0131 6513808
> 	
> 
> 	> -----Original Message-----
> 	> From: dspace-general-bounces at mit.edu
> 	> [mailto:dspace-general-bounces at mit.edu] On Behalf Of 
> Bram Luyten
> 	> Sent: 17 July 2009 14:00
> 	> To: dspace-general at mit.edu
> 	
> 	> Subject: [Dspace-general] Wordle visualization of 
> DSpace content
> 	>
> 	> Hello,
> 	>
> 	> In the category "fun on Friday", I was curious to investigate
> 	> the results of feeding DSpace item titles into Wordle (
> 	> http://www.wordle.net ), and see what would come up.
> 	>
> 	> Wordle visualizes the occurrence of words for any amount of
> 	> text you feed it. Basically, Wordle counts the number of times
> 	> each word occurs, and draws words that occur many times
> 	> large and words that occur only a few times smaller,
> 	> in one resulting picture.
> 	>
> 	> As a data source, I used K.U. Leuven's LIRIAS repository (
> 	> http://lirias.kuleuven.be ), a large and rapidly growing
> 	> repository. This DSpace's hierarchy is subject oriented, as
> 	> the communities and collections are organized according to
> 	> the institution's organizational structure. For this
> 	> experiment, I took three top level communities: the
> 	> Biomedical Sciences group, the Humanities and Social Sciences
> 	> group and last (but not least) the Sciences, Engineering and
> 	> Technology group.
> 	>
> 	> Using @mire's reporting suite (
> 	> http://atmire.com/USB/resources/reporting_suite.html ) it
> 	> took me five minutes to generate a clean list of the item
> 	> titles of International Publications (a small subset of the
> 	> content) for each of these top level communities, that were
> 	> submitted in 2009 (500+ for each of these groups).
> 	>
> 	> These lists were used to create the following Wordles:
> 	> Humanities and Social Sciences -
> 	> http://www.wordle.net/gallery/wrdl/1003572/K.U._Leuven_Humanities_and_Social_Sciences_publications_2009
> 	> Biomedical Sciences -
> 	> http://www.wordle.net/gallery/wrdl/1003562/K.U._Leuven_Biomed_Publications_2009
> 	> Science, Engineering and Technology -
> 	> http://www.wordle.net/gallery/wrdl/1003577/K.U._Leuven_Science%2C_Engineering_and_Technology_publications_2009
> 	>
> 	> It was funny to see that almost all titles were in English
> 	> for the Biomed and SE&T groups. For Humanities and Social
> 	> Sciences, there was a mix of English and Dutch titles.
> 	> Wordle allows you to filter the most common words (the, an,
> 	> a, ...) for one particular language. So to clean both English
> 	> and Dutch stop-words out of the Humanities & Social Sciences
> 	> Wordle, I had to do some manual work on the list.
> 	>
> 	> Although already a sub-selection of three groups was made,
> 	> you still see a lot of "generic" scientific terms, and not so
> 	> many interesting subject keywords. That's quite logical,
> 	> because although the scientists belong to the same group,
> 	> they're still dealing with a variety of subjects.
> 	>
> 	> When zooming in on more specific subjects, here's the Wordle
> 	> from the Computer Science department 2009 publications (one
> 	> subcommunity level below the Groups):
> 	> http://www.wordle.net/gallery/wrdl/1003647/K.U._Leuven_Computer_Science_publications_2009
> 	>
> 	> And even more specific, here's the one for the research group
> 	> of Experimental Radiotherapy, under the Department of
> 	> Oncology in the group of Biomedical sciences. For this one, I
> 	> took all of the publications from 2000-2009 to get a relevant
> 	> selection.
> 	> http://www.wordle.net/gallery/wrdl/1003638/K.U._Leuven_Experimental_Radiotherapy_Publications_2000-2009
> 	>
> 	> best regards,
> 	>
> 	> Bram Luyten
> 	>
> 	> @mire - http://www.atmire.com
> 	>
> 	> Technologielaan 9 - 3001 Heverlee - Belgium
> 	> 533 2nd Street - Encinitas, CA 92024 - USA
> 	>
> 	> http://www.togather.eu - Before getting together, get Tog@ther
> 	>
> 	>
> 	
> 	
> 	
> 	--
> 	The University of Edinburgh is a charitable body, registered in
> 	Scotland, with registration number SC005336.
> 	
> 	
> 
> 
>


-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.




