[Dspace-general] Fwd: Filter media and deleted items

Louw Venter Louw.Venter at nwu.ac.za
Wed Nov 25 05:31:43 EST 2009


Does anyone have any ideas, please?
 


>>> On 03 November 2009 at 12:40 PM, "Louw Venter" <Louw.Venter at nwu.ac.za> wrote:
Hello all,
 
I made a bit of a mess. 
A while back I uploaded some PDF documents to DSpace and ran the filter-media process to extract the text. Recently the creators of the PDF files sent me a batch with updated volume numbers, etc., to replace the ones already on the server, so I simply removed the items and added new bitstreams.
Now when I run the filter-media process again, the text is not extracted. Could this be because the checksums don't match, or because the original was located in one asset store and the new one in another?
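
One way to rule out the checksum question is to compute the MD5 of the replacement PDF locally and compare it with the value DSpace prints in the error report (6de2597a7cabd6ca3a995c355d9301f1 for the bitstream in item 10394/1886). The following is only a minimal sketch using the Java standard library; the class name Md5Check and its command-line arguments are made up for illustration and are not part of DSpace.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not a DSpace class: prints whether a local file's
// MD5 matches the checksum copied from the filter-media error output.
public class Md5Check {

    // Render a digest as lowercase hex, two characters per byte.
    static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // MD5 of an in-memory byte array.
    static String md5Hex(byte[] data) {
        return toHex(newMd5().digest(data));
    }

    // MD5 of a file, read in chunks so large PDFs are not loaded at once.
    static String md5Hex(Path file) {
        MessageDigest md = newMd5();
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return toHex(md.digest());
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: java Md5Check <file> <expected-md5>");
            return;
        }
        String actual = md5Hex(Paths.get(args[0]));
        System.out.println(actual.equalsIgnoreCase(args[1])
                ? "match"
                : "MISMATCH: " + actual);
    }
}
```

If the two values match, that would suggest the bitstream in the asset store is byte-for-byte the file that was uploaded, which points away from a checksum mismatch as the cause.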
 
Thank you in advance for any help in this regard,
 
 
ERROR filtering, skipping bitstream:
 
        Item Handle: 10394/1886
        Bundle Name: ORIGINAL
        File Size: 287223
        Checksum: 6de2597a7cabd6ca3a995c355d9301f1 (MD5)
        Asset Store: 1
java.lang.NullPointerException
java.lang.NullPointerException
        at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
        at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
        at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:226)
        at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
        at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:141)
        at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:668)
        at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:570)
        at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:520)
        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:488)
        at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:427)
        at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:359)
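
The trace shows the NullPointerException being thrown inside PDFBox (PDPageNode.getAllKids) while PDFFilter collects the pages, so the file itself may be worth a look. A coarse first check, assuming the PDF is available locally, is whether it starts with the %PDF- header and carries an %%EOF marker near the end. This sketch uses only the Java standard library; PdfSanity and looksLikePdf are hypothetical names, and the 1024-byte tail window is an arbitrary choice. Passing this check does not prove that PDFBox can parse the page tree.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of DSpace or PDFBox: a coarse check that
// only catches non-PDF or truncated files, not a malformed page tree.
public class PdfSanity {

    // Assumed size of the region at the end of the file searched for %%EOF.
    private static final int TAIL_WINDOW = 1024;

    static boolean looksLikePdf(byte[] data) {
        // ISO-8859-1 maps each byte to exactly one char, so binary content
        // cannot break the decoding and offsets are preserved.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        boolean hasHeader = text.startsWith("%PDF-");
        String tail = text.substring(Math.max(0, text.length() - TAIL_WINDOW));
        boolean hasEof = tail.contains("%%EOF");
        return hasHeader && hasEof;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java PdfSanity <file.pdf>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(looksLikePdf(data)
                ? "header and %%EOF marker found"
                : "missing header or %%EOF marker; file may be truncated or not a PDF");
    }
}
```

If the file passes, opening it in another PDF reader or re-saving it from the original application may still be useful, since the failure occurs deeper in the document structure than this check looks.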
 
 
Louw Venter