[Dspace-general] Simple use case of DSpace -- can this work?

Sergio Trejo sergtrejo at gmail.com
Wed May 3 12:04:12 EDT 2006


Dear Don,

Thank you very much. I hadn't picked up on the Item Importer when
perusing the documentation previously, and after a quick scan of it
along with your helpful email, I think it will get me started on the right
track.

As for the kind of item that I want to import, it doesn't have a name yet
but for now let's call it a "bundle of web page resources" or a "bundle" for
short. The files comprising this "bundle" are a logically related collection
of files (images in formats such as PNG, unstructured UTF-8 encoded text
files, XHTML files, CSS style sheet files, etc.). When I looked at that
very nice diagram (the PDF depicting the DSpace system), the diagram said
verbatim:

An item is an "archival atom" consisting of grouped, related content and
associated descriptions (metadata).

Unless I have misinterpreted it, the DSpace definition of an item seems
appropriate to my "bundle" situation. It does look, however, as if
Dublin Core will be required for each file in this "bundle" item. I was
hoping that I could use one DC metadata file to describe the "bundle" as a
whole and not necessarily worry about describing each file or part of the
"bundle", but I can see how that would be too limiting: the option needs
to exist to describe each and every file in an item if need be.
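
For what it's worth, pulling the Simple Dublin Core out of a single RDF file
looks manageable from Python. The following is only a minimal sketch, assuming
the file is RDF/XML and uses the standard DC elements namespace
(http://purl.org/dc/elements/1.1/). The function name read_simple_dc and the
sample record are made up for illustration and are not part of DSpace.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the 15 Simple Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"


def read_simple_dc(rdf_xml):
    """Return (element, value) pairs for every dc:* element in an RDF/XML string."""
    root = ET.fromstring(rdf_xml)
    pairs = []
    for node in root.iter():
        # Comments and processing instructions have non-string tags; skip them.
        if isinstance(node.tag, str) and node.tag.startswith("{" + DC_NS + "}"):
            text = (node.text or "").strip()
            if text:
                # Strip the "{namespace}" prefix to get e.g. "title".
                pairs.append((node.tag.split("}", 1)[1], text))
    return pairs


# Hypothetical sample record, for illustration only.
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="urn:example:bundle-1">
    <dc:title>Example bundle</dc:title>
    <dc:creator>Example Author</dc:creator>
    <dc:date>2006-05-03</dc:date>
  </rdf:Description>
</rdf:RDF>"""

print(read_simple_dc(SAMPLE))
```

The sketch ignores RDF details such as values given as rdf:resource
attributes or nested descriptions, so a real RDF file may need more care.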

Thanks again for pointing me in the right direction. Most likely the
Importer will be fine to start with, and I'll do some additional
investigating along the lines you suggested, particularly with regard to
handling the item structure.
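
As a starting point, here is a rough sketch of turning one "bundle" directory
into an item folder for the Importer. It rests on several assumptions: that
the metadata file is named dublin_core.xml and holds <dcvalue> elements, that
the file listing the content files is named contents with one name per line,
and that sub-directories can be flattened into the item folder by joining
path parts with "__" (my own workaround, since it is unclear how sub-folders
are handled). The function name write_item is made up, and no mapping from
Simple DC names to DSpace's own element/qualifier scheme is attempted.

```python
import os
import shutil
from xml.sax.saxutils import escape, quoteattr


def write_item(bundle_dir, item_dir, dc_pairs):
    """Copy a bundle into item_dir and write its metadata and contents files.

    dc_pairs is a list of (element, value) tuples describing the whole bundle.
    item_dir must not be located inside bundle_dir.
    Returns the list of file names written to the contents file.
    """
    os.makedirs(item_dir, exist_ok=True)
    names = []
    # os.walk does not follow symbolic links to directories by default.
    for dirpath, dirnames, filenames in os.walk(bundle_dir):
        dirnames.sort()  # visit sub-directories in a predictable order
        for filename in sorted(filenames):
            src = os.path.join(dirpath, filename)
            rel = os.path.relpath(src, bundle_dir)
            flat = rel.replace(os.sep, "__")  # assumption: flatten sub-folders
            shutil.copyfile(src, os.path.join(item_dir, flat))
            names.append(flat)

    # Assumed metadata layout: one <dcvalue> per Dublin Core value.
    lines = ["<dublin_core>"]
    for element, value in dc_pairs:
        lines.append('  <dcvalue element=%s qualifier="none">%s</dcvalue>'
                     % (quoteattr(element), escape(value)))
    lines.append("</dublin_core>")
    with open(os.path.join(item_dir, "dublin_core.xml"), "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

    # Assumed listing file: one content file name per line.
    with open(os.path.join(item_dir, "contents"), "w", encoding="utf-8") as f:
        f.write("\n".join(names) + "\n")
    return names
```

A call such as write_item("/path/to/bundle", "/path/to/archive/item_000",
[("title", "My bundle")]) would produce one item folder, and the parent
archive directory could then be handed to the ItemImport program. The paths
here are placeholders.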

Cheers,

-Sergio

On 5/3/06, Don Gourley <gourley at wrlc.org> wrote:
>
> Sergio,
>
> I don't quite understand what kind of item you want to import into
> DSpace...and if it is just one big item I'm not sure what value
> DSpace would offer for managing it.  But in general for importing
> items you would use the ItemImport program:
>
> http://www.dspace.org/technology/system-docs/application.html#itemimporter
>
> This program reads the DSpace "simple archive format" which is a
> directory structure with folders for each item which contain a
> very simple XML encoding of Dublin Core, the content files and
> a file listing the content files.  I have written Perl scripts
> to create this directory structure and it is pretty easy.
>
> I think the content files can be structured in sub-folders but
> I've never tried that and don't know how (or if) that structure
> would be translated into DSpace's item structure.  The way I've
> dealt with structural relationships between files in an item is
> by including another file in each item which includes structural
> metadata.  Another option in your case might be to use DSpace's
> community/sub-community/collection/item hierarchy to map your
> directories and files to multiple items instead of a single one.
>
> In 1.4 you have another option which is to create a packager
> plugin to ingest your item into DSpace.  However, the plugin
> must be written in Java...I don't think there is any easy way
> to use a scripting language.
>
> -Don
>
> On Wed, May 3, 2006 9:26 am, Sergio Trejo wrote:
> > Hello All,
> >
> > I am about to install DSpace 1.4 alpha. I will gladly test it out and be
> > happy to provide feedback to the maintainers. I had started to look at
> > DSpace last year but was called to do work on a different project. Now
> > I am returning to DSpace and I am looking forward to it!
> >
> > I have a simple use case:
> >
> > * I have, on the file system of the server on which I plan to install
> > DSpace (Mac OS X Server), a top-level directory. This top-level
> > directory contains files, sub-directories, and a few symbolic links
> > (the links are to other directories within the top-level directory).
> > The files in this directory structure consist mostly of web-related
> > content (images in JPG and PNG), text, CSS, XHTML, etc. I also have
> > one and only one RDF file for the entire top-level directory, which
> > contains Simple Dublin Core (15 elements maximum) describing the
> > entire directory of content I just mentioned (DC: author, date,
> > identifier, publisher, etc.).
> >
> > * I want to turn the above-described directory (with all of its
> > content, its RDF metadata file, and its sub-directories) into a DSpace
> > "item" (a DSpace archival atom) as per the gorgeous diagram found at
> > http://www.dspace.org/introduction/dspace-diagram.pdf
> >
> > * I would like to write a shell script that can be run on the Mac OS X
> > Server machine that is also hosting the DSpace 1.4 alpha system. The
> > script would be run by a designated Collection Curator and used to
> > import the above-mentioned DSpace item. I would thus like to avoid or
> > greatly minimize the need for a person (curator) to use the DSpace Web
> > Interface and fill out web forms to manually enter metadata about the
> > "item". The motto I must adopt in my small and lean organization is
> > borrowed from the Ruby on Rails community, which espouses simplicity
> > and agile approaches: DRY (Don't Repeat Yourself)
> > < http://wiki.rubyonrails.com/rails/pages/DRY >
> >
> > Looking at the DSpace documentation, it is my understanding that in
> > order to import an "item" into a DSpace repository, I will need to
> > somehow create a SIP (Submission Information Package) file. A SIP
> > apparently is "an XML metadata document with some content files" but I
> > am having a hard time finding detailed documentation on how to create a
> > SIP and just what goes into this "XML metadata document" as well as
> > what "content files" are required.
> >
> > Could my proposed shell script, for example, parse the Simple Dublin
> > Core contained in the RDF document that both describes and is a part
> > of the item, to generate a machine-meaningful SIP? How complex a
> > process might it be to create a SIP? Will I need more than Simple
> > Dublin Core to achieve all of this? Has anyone done something similar?
> > My goal is to keep things as easy on people as possible. It is my job
> > to make other people's lives as easy as possible ... I am fluent in
> > scripting languages (Python works great, as does Ruby) and am looking
> > forward to creating SIPs for items.
> >
> > Thank you for any suggestions.
> >
> > -Sergio
> > _______________________________________________
> > Dspace-general mailing list
> > Dspace-general at mit.edu
> > http://mailman.mit.edu/mailman/listinfo/dspace-general
> >
>
>
>