[Dspace-general] [Dspace-tech] To Handle or not?
Brad Teale
teale003 at UMN.EDU
Fri Dec 15 16:19:17 EST 2006
Hi Sean,
I'm interested in identifier systems because using them is something
people do every day without thought. For example, if I had the title or
ISBN (URN) of a book and went to a library, I would search the stacks
using a binary-type search and the book's catalog number (URL). I find
the book, make a note of where I found it and leave. Now if I go back a
year later, would I expect the library to store the book in the exact
same location? If that book isn't in the same location, what would I
do... I'd run another search and go through the process again. If the
book has moved collections, or been dropped entirely, I would cast a
wider search net.
Now, in the digital world, why wouldn't we handle this in the same
manner? We just seem to be overcomplicating a situation that exists in
both worlds. Maybe it can be solved more easily in the digital world,
but I don't see that happening... At least not yet.
More comments in line below:
On 12/15/2006 02:12 PM, Sean Reilly wrote:
[snip]
>> - The idea of Handle is to use a URN, however, the URN RFC (2141) has
>> not gained traction since its creation in May 1997 and browsers don't
>> support URN for the most part.
>
>
> I think RFC 3986 (URI re-specification) attempts to define any URI that
> is used as a name (as opposed to a location) as a URN. I can't say if
> the URN crowd agrees with that, but that would mean that hdl: URI
> schemes would be classified as a URN even if it wasn't under the urn:
> namespace. I agree that URN hasn't really caught on. I only know of
> one native URN resolver and it has not been publicly released to my
> knowledge.
I haven't really looked at RFC 3986, but will take a look now.
[snip]
>> - The Handle server itself acts as a more complicated DNS server. Why
>> add an extra layer over a system that works well. When will we add a
>> system on top of Handle?
>
>
> The Handle service is a separate parallel system to DNS that was
> designed with different intentions, restrictions, and capabilities. It
> was initially designed to offer a (relatively) flat namespace, more
> flexible data types, extreme scalability, security, and the ability to
> administer handles on an individual basis (as opposed to a sysadmin
> updating a zone file and restarting the server).
>
> The handle system is designed to identify fine-grained digital objects
> and has a modern architecture appropriate to that usage.
What about communities, collections, etc.? I know that Handle is
supposed to map a handle to a digital object/item. However, can that
item be a DSpace community? Can a Handle server take this request?
I've run a few queries and haven't been able to get at anything other
than items. Everything else comes back with a Handle 404 error.
Based on how other resolver systems currently behave, I would expect
that stripping a trailing '/identifier' would take me up one level or to
the top, yet Handle quietly fails... Handle gets the error, not the
institution.
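To make the "up one level" behaviour concrete, here is a minimal sketch of the fallback order I have in mind. It only computes the candidate URLs. The function name fallback_urls is made up for illustration, and nothing here reflects how the Handle proxy actually works.

```python
from urllib.parse import urlsplit, urlunsplit


def fallback_urls(url):
    """Return the URL itself, then each parent path, ending at the site root.

    Hypothetical helper: it illustrates the order a resolver could try,
    not anything the Handle proxy is known to do.
    """
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    # Drop one trailing path segment at a time until only "/" is left.
    for depth in range(len(segments), -1, -1):
        path = "/" + "/".join(segments[:depth])
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates


for candidate in fallback_urls("http://hdl.handle.net/1721.1/34898"):
    print(candidate)
```

A resolver could walk that list and stop at the first candidate that resolves, instead of failing outright on the first miss.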
>> - DNS maps easy to remember names with hard to remember numbers. Handle
>> uses numbers to identify unique institutions. If people have a hard
>> time remembering numbers, why would I choose something like
>> http://hdl.handle.net/1721.1/34898 for my system? Or when will Handle
>> have a DNSish syntax like http://hdl.handle.net/mit.dspace/34898 or
>> something similar? If it already exists why not just use
>> http://dspace.mit.edu/34898?
>
>
> Part of the purpose of using numbers is to avoid embedding semantics in
> the identifier itself, such as the owner of an object or name of a
> collection. This is because owners and administration change (and
> change names). It's not likely that MIT will change their name anytime
> soon, but why would you put the name of the repository software
> (mit.dspace/...) in every document identifier? If that digital object
> were moved to another repository system or to another hosting
> organization the mit.dspace part would be a bit misleading.
>
> My argument for using numbers instead of more readable names is that
> people don't need to remember them - computers do. You are free to use
> readable names in the local part (after the slash) of handle
> identifiers, but issuing readable handle prefixes produces more
> problems (trademark, squatting, etc) than it solves.
I agree that numeric identifiers are better; however, people need to
remember URIs just as much as computers do. While I do have portable
computing devices, I don't carry them everywhere. If I'm somewhere and
need a source that isn't in my PDA, laptop or written down, I can
usually remember it because of the URI. I think that using numbers,
while good for machines, is going to hurt the real purpose of
Handle's mission... linking users to the digital objects they need.
BTW, I noticed many of CNRI's documents have easy-to-remember handles:
http://hdl.handle.net/cnri.dlib/tn95-01
Couldn't help myself... ;P
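For what it's worth, numeric and named prefixes have the same shape: everything before the first slash is the prefix and the rest is the local part. A minimal sketch of that split follows (split_handle is a name I made up, and this does no validation beyond checking that both halves are present):

```python
def split_handle(handle):
    """Split a handle into (prefix, local part) at the first slash."""
    prefix, sep, local = handle.partition("/")
    if not sep or not prefix or not local:
        raise ValueError("expected 'prefix/local', got %r" % handle)
    return prefix, local


print(split_handle("1721.1/34898"))
print(split_handle("cnri.dlib/tn95-01"))
```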
>> - When you go to a handle URL that doesn't exist (possibly moved or
>> removed), your system doesn't know. You get Handle's 404 page, not the
>> institution that hosts the data, so how are you informed of these
>> requests?
>
>
> We have a new mechanism (not yet fully documented/publicized) that
> allows namespace information to be associated with a handle prefix.
> This info includes contact email address and other bits that can direct
> users of the handle proxy (http://hdl.handle.net) to the person
> responsible for the namespace of the identifier that failed to
> resolve. For an example, try
> <http://hdl.handle.net/200/nonexistenthandle> and check the "contact us"
> address which has been changed for the 200 prefix.
This requires user interaction. Most users don't submit emails like
this. The failed request should be redirected to the institution so it
can use its existing Apache/Tomcat logs to find these errors. We should
not have to rely on a user telling us, separately from their request,
that there is a bad link out there. In many instances, I've written 404
error pages to give a best guess for the object they were looking for,
or sent them to a search page to find it themselves.
The Handle method leaves a disconnect between the user looking for the
item and the host who may have the item.
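A rough sketch of the kind of 404 handling I mean, assuming the failed request actually reaches the institution's server. The /handle/ and /search?query= paths, the 0.8 similarity cutoff and the function name are all assumptions for illustration, not DSpace or Handle behaviour.

```python
import logging
from difflib import get_close_matches
from urllib.parse import quote

log = logging.getLogger("notfound")


def not_found_target(requested, known_ids, search_url="/search?query="):
    """Pick a redirect target for an identifier that did not resolve.

    The miss is logged locally, so the host sees it without the user
    having to send an email.
    """
    log.warning("unresolved identifier: %s", requested)
    # Best guess: the closest known identifier, if one is similar enough.
    matches = get_close_matches(requested, known_ids, n=1, cutoff=0.8)
    if matches:
        return "/handle/" + matches[0]
    # Otherwise send the user to a search page seeded with what they typed.
    return search_url + quote(requested, safe="")


known = ["1721.1/34898", "1721.1/34900"]
print(not_found_target("1721.1/3489", known))
print(not_found_target("zzz", known))
```

In a real deployment known_ids would come from the repository's own index rather than a hard-coded list.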
[snip]
Thanks and hope to hear more.
-Brad
--
Brad Teale Web Application Developer
Digital Library Development Lab University of Minnesota Libraries
teale003 at umn.edu 612-625-0473