[Dspace-general] Fwd: Filter media and deleted items
Louw Venter
Louw.Venter at nwu.ac.za
Wed Nov 25 05:31:43 EST 2009
Anyone have any ideas please?
>>> On 03 November 2009 at 12:40 PM, "Louw Venter" <Louw.Venter at nwu.ac.za> wrote:
Hello all,
I made a bit of a mess.
A while back I uploaded some PDF documents to DSpace and ran Filter media to extract the text. Recently the creators of the PDF files sent me a batch with updated volume numbers etc. to replace the existing ones already on the server. So I simply removed the items and added new bitstreams.
Now when I run the Filter media process again, the text doesn't get extracted. Could this be because the checksums don't match, or because the original was located in one assetstore and the new one in another?
Thank you in advance for any help in this regard,
ERROR filtering, skipping bitstream:
Item Handle: 10394/1886
Bundle Name: ORIGINAL
File Size: 287223
Checksum: 6de2597a7cabd6ca3a995c355d9301f1 (MD5)
Asset Store: 1
java.lang.NullPointerException
java.lang.NullPointerException
at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:226)
at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:216)
at org.dspace.app.mediafilter.PDFFilter.getDestinationStream(PDFFilter.java:141)
at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:668)
at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:570)
at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:520)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:488)
at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersAllItems(MediaFilterManager.java:427)
at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:359)
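One way to test the checksum theory is to compute the MD5 of the local copy of the replacement PDF and compare it with the value in the error output above. The following is only a minimal sketch using the Java standard library; the class name Md5Check and the method md5Hex are hypothetical names chosen for illustration and are not part of DSpace or PDFBox.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of DSpace: computes the MD5 of a local file
// so it can be compared with the "Checksum:" line printed by Filter media.
public class Md5Check {

    // Streams the input through an MD5 digest and returns lowercase hex.
    public static String md5Hex(InputStream in)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            md.update(buf, 0, n);
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            // Mask to 0..255 so negative bytes still print as two hex digits.
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java Md5Check <file> <expected-md5>");
            System.exit(2);
        }
        String actual;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            actual = md5Hex(in);
        }
        System.out.println("computed: " + actual);
        System.out.println(actual.equalsIgnoreCase(args[1]) ? "match" : "MISMATCH");
    }
}
```

It could be run as `java Md5Check <local copy of the PDF> 6de2597a7cabd6ca3a995c355d9301f1`, where the second argument is the checksum reported for item 10394/1886. If the values match, the stored bitstream is probably identical to the uploaded file, and since the stack trace shows the NullPointerException being thrown inside org.pdfbox.pdmodel.PDPageNode.getAllKids, the PDF itself (rather than the checksum or the assetstore) would be the more likely place to look.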
Louw Venter