[Dspace-general] RE: Dspace-general Digest, Vol 3, Issue 13

Angela Ramnarine aramnarine at library.uwi.tt
Wed Oct 22 13:24:21 EDT 2003


Already done.

Angela

-----Original Message-----
From: dspace-general-request at mit.edu
[mailto:dspace-general-request at mit.edu]
Sent: Wednesday, October 22, 2003 12:01 PM
To: dspace-general at mit.edu
Subject: Dspace-general Digest, Vol 3, Issue 13


Send Dspace-general mailing list submissions to
	dspace-general at mit.edu

To subscribe or unsubscribe via the World Wide Web, visit
	http://mailman.mit.edu/mailman/listinfo/dspace-general
or, via email, send a message with subject or body 'help' to
	dspace-general-request at mit.edu

You can reach the person managing the list at
	dspace-general-owner at mit.edu

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Dspace-general digest..."


Today's Topics:

   1. DSpace 1.2 Feature Descriptions (Richard Rodgers)
   2. Re: spam alert (MacKenzie Smith)


----------------------------------------------------------------------

Date: 21 Oct 2003 14:40:46 -0400
From: Richard Rodgers <rrodgers at MIT.EDU>
To: dspace-general at MIT.EDU
Subject: [Dspace-general] DSpace 1.2 Feature Descriptions
Message-ID: <1066761646.25056.73.camel at dspace-03.mit.edu>
Content-Type: text/plain
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
Precedence: list
Message: 1


As promised, what follows are descriptions of several of the proposed
new features for DSpace 1.2. Please post comments and suggestions to
this list: we welcome your input!

Content Thumbnail Support

In DSpace's item display, graphical content currently has only text to
describe it - users would like to see thumbnails, a more intuitive
representation.  Clicking on the thumbnail would then deliver the
content to the user.  Users may want to supply their own thumbnails, but
the vast majority would probably prefer to have the system generate
thumbnails for them.

There will probably be four methods of adding thumbnails to an item: 
the system could generate them automatically during the submission
workflow, a batch process accessing the repository could add thumbnail
bundles to items lacking them, users could submit thumbnails along
with the item (highly unlikely!), or the system could generate
thumbnails dynamically.  To integrate thumbnail generation into a
workflow, DSpace needs to be able to take advantage of all of the tools
available to generate thumbnails, and many are not written in Java. 
Many of these tools are also specific to certain types of content - some
are image tools, some are PDF utilities, and some may even be a script
tying multiple tools together. 
A 'plug-in' architecture could handle these scenarios well:  a group of
classes wrap whatever tools are needed to generate thumbnails, and
register for certain types of submitted content. As content is
submitted, its type is checked and it is handed off to the
classes that want to process submitted content of that type.
A batch tool could also be created that used the same registry of
content handlers, looking for items without thumbnails, and then
attempting to create thumbnails when possible. 
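
The registry idea above can be sketched roughly as follows.  This is only
an illustration under assumptions: ThumbnailHandler and ThumbnailRegistry
are hypothetical names, not existing DSpace classes, and content types
are assumed to be plain MIME type strings.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: each implementation wraps whatever tool
// (Java or external) produces a thumbnail for some kind of content.
interface ThumbnailHandler {
    byte[] createThumbnail(byte[] content);
}

// Hypothetical registry mapping a content type to the handlers
// that registered for it.
class ThumbnailRegistry {
    private final Map<String, List<ThumbnailHandler>> handlers = new HashMap<>();

    // A handler registers for one content type (assumed: a MIME type string).
    void register(String contentType, ThumbnailHandler handler) {
        handlers.computeIfAbsent(contentType, k -> new ArrayList<>()).add(handler);
    }

    // Hands the content to every handler registered for its type and
    // collects the thumbnails they produce. Unknown types yield no results.
    List<byte[]> process(String contentType, byte[] content) {
        List<byte[]> thumbnails = new ArrayList<>();
        List<ThumbnailHandler> registered = handlers.get(contentType);
        if (registered == null) {
            return thumbnails;
        }
        for (ThumbnailHandler handler : registered) {
            thumbnails.add(handler.createThumbnail(content));
        }
        return thumbnails;
    }
}
```

Both the submission workflow and a batch tool could call process() on
the same registry, so handlers would only need to be written once.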

Storing thumbnails shouldn't involve too many changes to the system. 
(None, if the thumbnails are generated dynamically.)  DSpace was
architected with alternate views of content in mind, where items can
have multiple bundles, each containing a different representation of the
item's content.  A thumbnail could simply be another bundle within the
item.  The item display page could look for a thumbnail bundle
containing images for each bitstream in the primary bundle (we may need
a type field to distinguish the primary bundle from a PDF, thumbnail, or
extracted text bundle), and then display the thumbnail next to the file
name. 
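
A rough sketch of how the display page might pair files with thumbnails
is given here.  Everything in it is assumed for illustration: the
SketchBundle and SketchItem classes, the "primary" and "thumbnail" type
values, and the convention that a thumbnail carries the same name as the
bitstream it represents.  None of this is the existing DSpace API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: a bundle is a typed set of named bitstreams.
class SketchBundle {
    final String type; // assumed type field, e.g. "primary" or "thumbnail"
    final Map<String, byte[]> bitstreams = new LinkedHashMap<>();

    SketchBundle(String type) {
        this.type = type;
    }
}

// Hypothetical item holding at most one bundle per type.
class SketchItem {
    private final Map<String, SketchBundle> bundles = new LinkedHashMap<>();

    void addBundle(SketchBundle bundle) {
        bundles.put(bundle.type, bundle);
    }

    // Builds one display line per file in the primary bundle, noting
    // whether a same-named thumbnail exists in the thumbnail bundle.
    List<String> displayLines() {
        List<String> lines = new ArrayList<>();
        SketchBundle primary = bundles.get("primary");
        SketchBundle thumbs = bundles.get("thumbnail");
        if (primary == null) {
            return lines;
        }
        for (String name : primary.bitstreams.keySet()) {
            boolean hasThumb = thumbs != null && thumbs.bitstreams.containsKey(name);
            lines.add((hasThumb ? "[thumbnail] " : "[no thumbnail] ") + name);
        }
        return lines;
    }
}
```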

The thumbnail becomes an official part of the item, or a flag could be
used to indicate that the thumbnail is an annotation by the system
rather than a part of the original submission.


Full Text Searching

Currently DSpace users can only search the metadata for items - the text
that may be within the content is not searchable.  Users would like to
search the full text of items within DSpace.  It may also be handy for
users to have access to the extracted text for an item, possibly in the
'full' item display.

Our search engine Lucene can easily index the full text from items, so
the challenge is really extracting and storing the full text of
submitted items. This problem is remarkably similar to the generation of
thumbnails: generating a 'text' representation of an item's content is
very similar to generating a thumbnail representation of that content. 
DSpace's object model supports different representations of content with
bundles - each bundle stores a representation. 

As with thumbnail generation, there are many tools available to extract
text from content; many are not written in Java, and many are specific
to certain types of content.  Again, a plug-in architecture would handle
representation generation well - as content is submitted, classes
registered for the type of that content are invoked and annotate the
item with a full-text bundle, which would then be recognized and indexed
by the search system.  A plug-in architecture would be handy for
integrating with workflows, or as part of a batch process to be run as
part of regular content 'maintenance.'  Again, these bundles may need to
be typed; in this case a 'full text' bundle type would be a hint for the
indexer that it could index the contents of that bundle.
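
To illustrate the bundle-type hint, here is a toy indexer that only
looks at bundles typed 'full text'.  It is a stand-in written for this
note, not Lucene and not DSpace code: TextBundle, ToyFullTextIndex, and
the "full text" type string are all assumptions, and the tokenizing is
deliberately naive.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical bundle carrying a type hint and, for text bundles, the text.
class TextBundle {
    final String type;
    final String text;

    TextBundle(String type, String text) {
        this.type = type;
        this.text = text;
    }
}

// Toy inverted index standing in for the real search engine.
class ToyFullTextIndex {
    private final Map<String, Set<String>> postings = new HashMap<>();

    // Indexes only bundles whose type is "full text"; others are skipped.
    void indexItem(String itemId, List<TextBundle> bundles) {
        for (TextBundle bundle : bundles) {
            if (!"full text".equals(bundle.type)) {
                continue;
            }
            for (String token : bundle.text.toLowerCase().split("\\W+")) {
                if (!token.isEmpty()) {
                    postings.computeIfAbsent(token, k -> new TreeSet<>()).add(itemId);
                }
            }
        }
    }

    // Returns the ids of items whose extracted text contains the term.
    Set<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase());
        return hits == null ? Collections.<String>emptySet() : hits;
    }
}
```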

Also like thumbnails, the extracted text for content becomes an official
part of the item, perhaps with a flag to indicate that it is an
annotation by the system and was not part of the original submission.



Items Shared by Multiple Collections

Currently DSpace assumes that items are part of a single collection. 
Users would like to share items between collections, even generating
'virtual' collections that are groupings of items from other
collections.

DSpace's data model supports mapping items to multiple collections, but
the GUI tools do not.  If an item is shared between collections, then
the question arises over who controls it.  One solution is to assign an
owning collection to an item.  The administrator of the owning
collection can modify the item, and assign viewing permissions - other
collection administrators do not have such control - they can only place
a reference to the item in their collection.  Administrators who could
not access an item themselves would of course not be able to reference
the item in their collection.  Since items in multiple collections are
references and not copies, if an item is for some reason removed or
withdrawn, then the references will also appear to be removed or
withdrawn.  A possible problem in the future to watch out for will be
when collection administrators want to attach metadata to or annotate
these references.
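
The owning-collection rules described above might look something like
the following.  This is a sketch under assumptions only: SharedItemSketch
and its methods are hypothetical, collections are identified by plain
strings, and whether an administrator can access the item is passed in
as a boolean rather than looked up from a real authorization system.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical item that has one owning collection and may be
// referenced (not copied) by other collections.
class SharedItemSketch {
    private final String owningCollection;
    private final Set<String> referencingCollections = new HashSet<>();
    private boolean withdrawn;

    SharedItemSketch(String owningCollection) {
        this.owningCollection = owningCollection;
    }

    // Only the owning collection's administrator may modify the item.
    boolean canModify(String collection) {
        return owningCollection.equals(collection);
    }

    // Another collection may reference the item only if its administrator
    // can access the item. Returns true if the reference was added.
    boolean addReference(String collection, boolean adminCanAccessItem) {
        if (!adminCanAccessItem || owningCollection.equals(collection)) {
            return false;
        }
        return referencingCollections.add(collection);
    }

    void withdraw() {
        withdrawn = true;
    }

    // Because references are not copies, withdrawing the item hides it
    // in every collection that points to it.
    boolean visibleIn(String collection) {
        if (withdrawn) {
            return false;
        }
        return owningCollection.equals(collection)
                || referencingCollections.contains(collection);
    }
}
```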


------------------------------

Date: Tue, 21 Oct 2003 14:39:38 -0400
From: MacKenzie Smith <kenzie at MIT.EDU>
To: "JQ Johnson" <jqj at darkwing.uoregon.edu>
Cc: dspace-general at MIT.EDU
Subject: Re: [Dspace-general] spam alert
Message-ID: <5.2.1.1.2.20031021142524.01e3bdd8 at hesiod>
In-Reply-To: <MMENIDDNIPBIJLHLNFHBMECFCMAA.jqj at darkwing.uoregon.edu>
References: <5.2.1.1.2.20031020124444.02289e40 at po11.mit.edu>
Content-Type: text/plain; charset="us-ascii"; format=flowed
MIME-Version: 1.0
Precedence: list
Message: 2

Hi JQ,

dspace-general is using MIT's centrally managed listserv services
and mail relay, which are run by our central computer services group.
They've been working with AOL, SpamCop, etc. to demonstrate how
they're eliminating spam and to introduce SMTP authentication. As
far as we know, SpamCop should be removing MIT from their list soon.

Beyond this, there's not much we in the Libraries can do about it,
but if actual spam is becoming a problem then we'll convert the list
to a moderated one.

Thanks for the heads up,

MacKenzie/

At 10:30 AM 10/20/2003 -0700, JQ Johnson wrote:
>Our institutional spam filters have started to complain about mail to
>dspace-general at mit.edu.  Here's the report for email I got just now:
>
>  pts rule name              description
>---- ---------------------- --------------------------------------------------
>  1.9 WEIRD_PORT             URI: Uses non-standard port number for HTTP
>  2.2 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
>                             [Blocked - see <http://www.spamcop.net/bl.shtml?18.7.21.86>]
>  1.3 RCVD_IN_NJABL_RELAY    RBL: NJABL: sender is confirmed open relay
>                             [18.7.21.86 listed in dnsbl.njabl.org]
>  0.1 RCVD_IN_NJABL          RBL: Received via a relay in dnsbl.njabl.org
>                             [18.7.21.86 listed in dnsbl.njabl.org]
>
>We'd be very appreciative if MIT could clean up the software on
>melbourne-city-street.mit.edu to eliminate the potential for its use as an
>open relay.  If that can be done, then it will disappear from the NJABL and
>SPAMCOP blacklists.
>
>JQ Johnson                      Office: 115F Knight Library
>Academic Education Coordinator  mailto:jqj at darkwing.uoregon.edu
>1299 University of Oregon       phone: 1-541-346-1746; -3485 fax
>Eugene, OR  97403-1299          http://darkwing.uoregon.edu/~jqj/
>
>
>
>_______________________________________________
>Dspace-general mailing list
>Dspace-general at mit.edu
>http://mailman.mit.edu/mailman/listinfo/dspace-general

MacKenzie Smith
Associate Director for Technology
MIT Libraries
Building 14S-208
77 Massachusetts Avenue
Cambridge, MA  02139
(617)253-8184
kenzie at mit.edu 

------------------------------

_______________________________________________
Dspace-general mailing list
Dspace-general at mit.edu
http://mailman.mit.edu/mailman/listinfo/dspace-general


End of Dspace-general Digest, Vol 3, Issue 13
*********************************************

