[Dspace-general] Re: [Dspace-tech] Structured metadata scheme "pretend" to be Dublin Core?

instituto A.C.Jobim iacj at terra.com.br
Fri Apr 29 23:40:52 EDT 2005


Dear Peter and Robert,
I have a similar problem at the Institute.
I migrated a database with several related tables into DSpace.
I would like to rebuild the relationships I had (authority records) within
the new Dublin Core schema.
There is a field in the dcvalue table that is never used (source_id?).
I am very tempted to use it to store a foreign key to an authority
record.
If we had such a column, we could hold related values together and so
create such a structure.
Let's say:

item_id  dc_type_id        text_value  text_lang  place  source_id
xx       contributor       joe         en         1      1   (joe's id in an authority record)
xx       contributor.role  painter     en         1      1
xx       contributor       mary        en         2      35
xx       contributor.role  dancer      en         2      35

In fact, I tried to create a new field just for that, but DSpace didn't
like it and complained.
For now we have created DC qualifiers for every kind of role, and we are
planning a way to fill the contributor fields by reading from another table.
The example would then be just:

contributor.painter                      joe
contributor.dancer                      mary

The other form would be more flexible, because today, if a new role
appears, I have to create a new DC qualifier and alter the input forms
and views.
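
As an illustration of the difference, a small Python sketch of the
qualifier-per-role form follows. The record layout and the function name are
hypothetical; the point is only that every distinct role turns into a distinct
qualifier that has to exist in the registry.

```python
def to_qualified_fields(records):
    """Turn {"name", "role"} records into ("contributor.<role>", name) pairs."""
    return [(f"contributor.{r['role']}", r["name"]) for r in records]


# Hypothetical records, matching the joe/mary example in this message.
RECORDS = [
    {"name": "joe", "role": "painter"},
    {"name": "mary", "role": "dancer"},
]

fields = to_qualified_fields(RECORDS)

# Every distinct role needs its own qualifier (plus form and view changes).
needed_qualifiers = sorted({field for field, _name in fields})

for field, name in fields:
    print(field, name)
```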
Are there any plans to include this kind of structure?
Thanks, Paulo Jobim



On 29 Apr 2005, at 17:43, Peter Urban wrote:

> I'm bringing this offline conversation to this list so that the information
> is available to anyone else facing similar challenges and to see what
> others' thoughts are. See original messages below for context. Sorry for the
> long post.
>
> Robert asked if there is any structure to the CDWA metadata fields. He says,
> "If the answer is no, CDWA is flat, you can just use and extend the current
> DSpace Dublin Core support to include CDWA elements." Based on his
> description, CDWA has structure. For example:
>
> CREATOR
> - IDENTITY (can be linked to an authority record)
> - ROLE
> DATE
> - EARLIEST DATE
> - LATEST DATE
>
> Do I understand correctly that a structured schema (as opposed to a flat
> one) cannot be added to the current metadata registry? If so, can you
> explain why? If not, and there is an option of "pretending" that CDWA is
> just an extension of Dublin Core, what are the pros and cons of doing so?
> Would we need to make a distinction between what is DC and what is CDWA
> pretending to be DC? Might there be downstream issues related to OAI-PMH
> support?
>
> Thanks!
>
> Peter Urban
> Kristine Fallon Associates, Inc.
> Digital Archive for Architecture System
> The Art Institute of Chicago
> purban at kfa-inc.com
>
>
>
>> -----Original Message-----
>> From: Tansley, Robert [mailto:robert.tansley at hp.com]
>> Sent: Tuesday, April 12, 2005 2:33 PM
>> To: Peter Urban
>> Subject: RE: DSpace for Art Institute of Chicago
>>
>>
>> Hello Peter,
>>
>> I'd recommend you widen this conversation to the dspace-devel
>> or -tech list as you'll get faster responses then!  The model
>> of interaction around the DSpace open source community
>> doesn't tend to be people asking specific individuals for
>> help, as no one actually has general "DSpace support" as a
>> full-time job...
>>
>>> Option 2. I'm not sure I understand what you mean by "flat schema".
>>> CDWA has categories and subcategories. If you'd like to take a look:
>>> http://www.getty.edu/research/conducting_research/standards/cdwa/index.html
>>> Are you saying that if CDWA is like DC, we could extend the DC schema
>>> within DSpace to include all of the additional CDWA fields
>>> and make them available via the customized UI in 1.2.2 beta
>>> 1? For submission and for search? What are the pros/cons?
>>
>> 'Categories and sub-categories' are orthogonal to whether a
>> schema is 'flat'.  Categories are just possible values that a
>> metadata field can have.
>> What I mean is, is there any structure to the metadata fields
>> themselves?
>> e.g.
>>
>> title:  Piece of work
>> author:   John Doe
>> author:   Jane Smith
>> abstract:   This piece of work blah blah...
>> category:  X123
>>
>> is flat.  The following has structure:
>>
>> title:   Piece of work
>> author:
>>     name:    John Doe
>>     organisation:   HP
>> author:
>>     name:    Jane Smith
>>     organisation:   MIT
>>
>> i.e. it's not just a list of name/value pairs.  "Author"
>> doesn't just have one value, it has a value comprised of two
>> 'sub-parts'.  My question is does CDWA contain any elements like that?
>>
>> If the answer is no, CDWA is flat, you can just use and
>> extend the current DSpace Dublin Core support to include CDWA
>> elements.  The categories are more a UI concern (drop-lists
>> and the like) and the custom submit form feature (I believe)
>> supports this.
>>
>>> Option 3. I am not aware of there being an XML schema
>>> for CDWA. If we were to generate the CDWA metadata as XML,
>>> and include that XML as a bitstream in the item, and we had a
>>> crosswalk to DC, are you saying that the CDWA metadata could
>>> then be searched using an external interface into DSpace? My
>>> previous understanding was that metadata stored as a
>>> bitstream could not be searched. If it can be searched (even
>>> externally to DSpace), this sounds like it could be worth
>>> further investigation.
>>>
>>> If my understanding of Option 3 is correct, could you
>>> provide a high-level description of how this would work? For
>>> example, would we have a package of bitstreams that include
>>> the CDWA metadata XML, then import (or upload) those
>>> bitstreams as an item? How would the DC
>>> metadata be pulled from the CDWA metadata XML? How would the
>>> CDWA metadata be searched? What would happen if any of the
>>> CDWA metadata needed to be updated after the item has been
>>> archived? Can the CDWA metadata XML bitstream be modified
>>> after the item has been archived?
>>
>>  If metadata is stored in an item as XML in a bitstream, it
>> means the core DSpace system itself cannot index that
>> metadata.  Other systems (search engines, or DSpace add-ons)
>> that understood that XML could peer inside those XML
>> bitstreams and index the relevant data, and allow it to be searched.
>>
>> There are different APIs and protocols that those search
>> engines or DSpace add-ons can use to do this.  This also
>> applies to getting the metadata into DSpace in the first
>> place, and modifying it further down the line (which is
>> possible).  Basically, option 3 means that you leave DSpace
>> as is, and implement various bolt-ons that do the work with
>> CDWA via the DSpace Java APIs or network protocols.  The
>> advantage of this over option 4 (hacking the code) is that
>> your code is separate from the core DSpace
>> code, and is easier to maintain; as long as the DSpace
>> APIs/protocols are backwards compatible your code will work.
>>  Robert Tansley / Digital Media Systems Programme / HP Labs
>>   http://www.hpl.hp.com/personal/Robert_Tansley/
>>
>>
>>
>>>> -----Original Message-----
>>>> From: Tansley, Robert [mailto:robert.tansley at hp.com]
>>>> Sent: Wednesday, April 06, 2005 10:46 AM
>>>> To: Peter Urban
>>>> Subject: RE: DSpace for Art Institute of Chicago
>>>>
>>>>
>>>> Hello Peter,
>>>>
>>>> Firstly, apologies for the delayed reply, your mail arrived just as I
>>>> left for a 2-week business trip which meant it got snowed under.
>>>>
>>>> I'm glad to see you hope you can do this in a way that can be
>>>> contributed back to open source DSpace!  This will benefit you and the
>>>> Art Institute, in that you won't have to maintain the code by
>>>> yourself, the rest of the open source community will help you.
>>>>
>>>> I'm not familiar with the CDWA standard, so I'm not 100% sure how
>>>> extensive the modifications would have to be.  In general, you have a
>>>> number of options:
>>>>
>>>> * Wait for another project to implement support for different
>>>>   metadata schemas, for example the SIMILE project (simile.mit.edu).  I
>>>>   suspect this won't be a good option for you, as there's no real time
>>>>   frame for when this might happen.  The SIMILE project hasn't engaged
>>>>   the DSpace community with any concrete plans yet, so we've no way of
>>>>   knowing when this might happen.  You could offer to support such an
>>>>   effort.
>>>> * If the metadata standard looks somewhat like Dublin Core
>>>>   (i.e. a flat schema), you can "pretend" the metadata is Dublin Core.
>>>>   As of DSpace 1.2.2 beta 1, most use of Dublin Core is configurable
>>>>   (e.g. which fields appear in the submission UI, advanced search etc).
>>>> * Another option is to store the CDWA in XML (does CDWA have
>>>>   an XML Schema)? as another bitstream in each DSpace item, and have a
>>>>   crosswalk that creates Dublin Core for DSpace.  Other systems
>>>>   interfacing with DSpace (via Java API, OAI-PMH, Web Services etc) will
>>>>   be able to use this metadata, though the DSpace search/retrieval
>>>>   functions will use the Dublin Core.  This could be treated as an
>>>>   interim measure, until DSpace properly supports different schemas.
>>>> * You could just do the code gruntwork to support CDWA in an
>>>>   expedient way.  This would make it difficult to fold your changes into
>>>>   open source DSpace though, so you'd probably end up with a "fork"
>>>>   of the code you'd have to maintain yourself.
>>>>
>>>> You certainly aren't the only project interested in alternative
>>>> metadata schemas, though, so I do hope it won't be long before a
>>>> concerted effort starts up to support this, and I hope you'd become
>>>> involved in such an effort as it would help you with your objectives!
>>>>
>>>> Does this help?
>>>>  Robert Tansley / Digital Media Systems Programme / HP Labs
>>>>   http://www.hpl.hp.com/personal/Robert_Tansley/
>
>
>



