[Dspace-general] Statistics
Tim Donohue
tdonohue at illinois.edu
Tue Aug 26 16:29:23 EDT 2008
Dorothea Salo wrote:
> 2008/8/26 Mark H. Wood <mwood at iupui.edu>:
>> On Tue, Aug 26, 2008 at 10:07:43AM -0500, Tim Donohue wrote:
>>> So, although I think it was already mentioned, I'd add as a requirement
>>> for a good Statistics Package:
>>>
>>> * Must filter out web-crawlers in a semi-automated fashion!
>> +1! Suggestions as to how?
>
> The site <http://www.user-agents.org/> maintains a list of
> user-agents, classified by type. They have an XML-downloadable version
> at <http://www.user-agents.org/allagents.xml>, as well as an RSS-feed
> updater. Perhaps polling this would be a useful starting point?
>
> Dorothea
>
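Polling that XML file could look something like the sketch below. The element names (`user-agent`, `String`, `Type`) and the "R"-means-robot type code are my assumptions about the feed's schema, not something I've verified against the live file -- check the actual XML before relying on them:

```python
# Sketch: pull robot user-agent strings out of an allagents.xml-style
# feed.  Element names and the "R" robot type code are assumptions
# about user-agents.org's schema, not verified facts.
import xml.etree.ElementTree as ET

def extract_robot_agents(xml_text):
    """Return the agent strings whose Type field marks them as robots."""
    root = ET.fromstring(xml_text)
    robots = []
    for agent in root.iter("user-agent"):
        type_el = agent.find("Type")
        string_el = agent.find("String")
        if type_el is None or string_el is None:
            continue
        # "R" = robot in the list's type codes (assumed).
        if "R" in (type_el.text or ""):
            robots.append(string_el.text)
    return robots

# Tiny inline sample standing in for the real downloaded feed:
sample = """<user-agents>
  <user-agent><String>Googlebot/2.1</String><Type>R</Type></user-agent>
  <user-agent><String>Mozilla/5.0</String><Type>B</Type></user-agent>
</user-agents>"""
print(extract_robot_agents(sample))  # ['Googlebot/2.1']
```

A stats package could re-fetch the feed on a schedule and diff it against its stored list, so new crawlers get filtered without manual edits.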
The University of Minho's Statistics Add-On for DSpace can do some basic
automated filtering of web crawlers.
See its list of main features on the DSpace Wiki:
http://wiki.dspace.org/index.php//StatisticsAddOn
(It looks like they detect spiders by the way spiders tend to identify
themselves. Most "nice" spiders, like Google's, identify themselves in a
common fashion in the User-Agent header, e.g. "Googlebot".)
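That kind of check is essentially a substring match against the logged User-Agent. A minimal sketch (the token list here is illustrative, not the add-on's actual rules):

```python
# Sketch: the sort of substring check a stats package could apply to
# each logged hit's User-Agent header.  Token list is hypothetical.
CRAWLER_TOKENS = ("googlebot", "slurp", "msnbot", "crawler", "spider", "bot")

def is_crawler(user_agent):
    """True if the User-Agent looks like a self-identifying spider."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)

print(is_crawler("Googlebot/2.1 (+http://www.google.com/bot.html)"))  # True
print(is_crawler("Mozilla/4.0 (compatible; MSIE 6.0)"))               # False
```

The obvious limitation is that only "nice" spiders announce themselves; anything spoofing a browser User-Agent slips through, which is why this is only semi-automated filtering.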
Frankly, although our statistics for IDEALS are nice-looking, Minho's
work is much more extensive and offers a greater variety of features
(from what I've seen/heard of it). It's just missing our "Top 10
Downloads" list :)
- Tim
--
Tim Donohue
Research Programmer, Illinois Digital Environment for
Access to Learning and Scholarship (IDEALS)
University of Illinois at Urbana-Champaign
tdonohue at illinois.edu | (217) 333-4648