[Netarchivesuite-users] Question about a problem with NAS QA Viewer

Navarro Guillén, Soledad soledad.navarro at bne.es
Mon Dec 21 12:53:06 CET 2015


Dear all,

At the National Library of Spain Web Archive we recently upgraded from NAS 4.2 to NAS 4.4, and we now have a problem with the NAS QA Viewer.

When we enable compression in the NAS 4.4 templates (changing only what is highlighted below, and only in the WARC section, not the ARC section), the NAS QA Viewer does not work. The files generated by the harvest are of type warc.gz.


        <newObject name="WARCArchiver#decide-rules" class="org.archive.crawler.deciderules.DecideRuleSequence">
          <map name="rules">
          </map>
        </newObject>
        <boolean name="compress">true</boolean>
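One way to confirm what the `compress` setting actually produced on disk is to look at the first two bytes of each file: gzip streams begin with the magic bytes 0x1f 0x8b, whereas an uncompressed WARC file begins with the text "WARC/". The following is only a minimal sketch of such a check; the class and method names are hypothetical and it is not part of NetarchiveSuite.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of NetarchiveSuite.
final class GzipCheckSketch {

    // gzip streams start with the two magic bytes 0x1f 0x8b (RFC 1952).
    static boolean hasGzipMagic(byte[] header) {
        return header != null
                && header.length >= 2
                && (header[0] & 0xff) == 0x1f
                && (header[1] & 0xff) == 0x8b;
    }

    // Reads the first two bytes of a file and tests them for the gzip magic.
    static boolean isGzipFile(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            byte[] header = new byte[2];
            int read = 0;
            while (read < header.length) {
                int n = in.read(header, read, header.length - read);
                if (n < 0) {
                    return false; // file shorter than two bytes
                }
                read += n;
            }
            return hasGzipMagic(header);
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more file paths on the command line.
        for (String arg : args) {
            System.out.println(arg + " gzip=" + isGzipFile(Paths.get(arg)));
        }
    }
}
```

Running it against both the harvested warc.gz files and the metadata file named in the log would show whether the two are stored in the same form.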

This is the error that appears in the graphical interface:

[Screenshot: image001.gif, see attachment]


And this is the error that appears in the logs:

DETALLADO: Caught exception while running batch job on file /netarchive/WARC_Archive/filedir/5-metadata-1.warc, position 4232857:
null
java.lang.NullPointerException
at java.util.regex.Matcher.getTextLength(Matcher.java:1234)
at java.util.regex.Matcher.reset(Matcher.java:308)
at java.util.regex.Matcher.<init>(Matcher.java:228)
at java.util.regex.Pattern.matcher(Pattern.java:1088)
at dk.netarkivet.harvester.indexserver.GetMetadataArchiveBatchJob.processRecord(GetMetadataArchiveBatchJob.java:95)
at dk.netarkivet.common.utils.archive.ArchiveBatchJob.processFile(ArchiveBatchJob.java:124)
at dk.netarkivet.common.utils.batch.BatchLocalFiles.processFile(BatchLocalFiles.java:168)
at dk.netarkivet.common.utils.batch.BatchLocalFiles.run(BatchLocalFiles.java:115)
at dk.netarkivet.archive.bitarchive.Bitarchive.batch(Bitarchive.java:246)
at dk.netarkivet.archive.bitarchive.distribute.BitarchiveServer$1.run(BitarchiveServer.java:428)

dic 14, 2015 11:04:50 AM dk.netarkivet.archive.bitarchive.Bitarchive batch
DETALLADO: Batch: Job dk.netarkivet.harvester.indexserver.GetMetadataArchiveBatchJob, with arguments: URLMatcher = metadata://[^/]*/crawl/index/cdx.*, mimeMatcher = application/x-cdx finished at Mon Dec 14 11:04:50 CET 2015
dic 14, 2015 11:04:50 AM dk.netarkivet.archive.bitarchive.Bitarchive batch
INFORMACIÓN: Finished batch job on bitarchive application with id '192.168.81.37_BitApp_2': 'dk.netarkivet.harvester.indexserver.GetMetadataArchiveBatchJob', on filename-pattern: '5-metadata-[0-9]+.(w)?arc' + with result: 1 failures in processing 1 files at 192.168.81.37_BitApp_2
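The top frames of the trace show `Pattern.matcher` receiving a null input from `processRecord`, which is what raises the `NullPointerException`. The sketch below reproduces that failure mode and shows a null guard. It assumes that the value being matched (for example a record's URL or MIME type) is null for some record; which field is actually null at `GetMetadataArchiveBatchJob.java:95` is not known from the log. The two patterns are copied from the job arguments in the log; the class and method names are hypothetical and are not the NetarchiveSuite source.

```java
import java.util.regex.Pattern;

// Hypothetical illustration, not the NetarchiveSuite implementation.
final class NullSafeMatchSketch {

    // Patterns as reported in the batch job arguments in the log.
    static final Pattern URL_MATCHER =
            Pattern.compile("metadata://[^/]*/crawl/index/cdx.*");
    static final Pattern MIME_MATCHER =
            Pattern.compile("application/x-cdx");

    // Unguarded check: Pattern.matcher(null) throws NullPointerException.
    static boolean unguardedMatches(Pattern pattern, String value) {
        return pattern.matcher(value).matches();
    }

    // Guarded check: a missing value is treated as "no match".
    static boolean guardedMatches(Pattern pattern, String value) {
        return value != null && pattern.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(guardedMatches(MIME_MATCHER, "application/x-cdx"));
        System.out.println(guardedMatches(MIME_MATCHER, null));
        try {
            unguardedMatches(URL_MATCHER, null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException from Pattern.matcher(null)");
        }
    }
}
```

A guard like this only avoids the exception; it would not explain why a record in the metadata file has a missing field in the first place.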

Do you know if there is a way to solve it?

Thank you very much and happy Christmas,


Soledad Navarro
Área de Gestión del Depósito de las Publicaciones en Línea
Biblioteca Nacional de España
Paseo de Recoletos, 20-22. Madrid 28001
Tlf: (0034)91 516 81 18 - Ext. 218
Fax: (0034) 915168102


-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.gif
Type: image/gif
Size: 5994 bytes
Desc: image001.gif
URL: <http://ml.sbforge.org/pipermail/netarchivesuite-users/attachments/20151221/88ba1fae/attachment.gif>

