[Netarchivesuite-users] Issue - harvest running infinitely

Søren Vejrup Carlsen svc at kb.dk
Mon Apr 30 13:14:15 CEST 2018


Hi Koit.
Is it only one specific website that causes the problem, or is this a general problem?

Anyway, you can always log on to the Heritrix 3 instance and terminate the job manually
with the following credentials (user: admin, password: adminPassword),
unless you have changed these values.
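
If you would rather do this from a script than from the web UI, the following is a minimal sketch. It assumes the Heritrix 3 REST interface is reachable over HTTPS with digest authentication and a self-signed certificate. The host, port and job name are hypothetical placeholders, so use the values from your own installation.

```python
import ssl
import urllib.parse
import urllib.request


def build_terminate_request(host, port, job_name):
    """Build (but do not send) a POST asking Heritrix 3 to terminate a job."""
    # Host, port and job name are placeholders for your own installation.
    url = "https://{}:{}/engine/job/{}".format(
        host, port, urllib.parse.quote(job_name)
    )
    body = urllib.parse.urlencode({"action": "terminate"}).encode("ascii")
    return urllib.request.Request(url, data=body, method="POST")


def build_opener(url, user, password):
    """Create an opener that answers HTTP Digest challenges.

    Certificate verification is switched off on the assumption that the
    Heritrix 3 instance serves a self-signed certificate.
    """
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, url, user, password)
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    return urllib.request.build_opener(
        urllib.request.HTTPSHandler(context=context),
        urllib.request.HTTPDigestAuthHandler(passwords),
    )


# Example use (performs a real network call, so it is left commented out):
# request = build_terminate_request("localhost", 8443, "my_job")
# opener = build_opener(request.full_url, "admin", "adminPassword")
# with opener.open(request, timeout=30) as response:
#     print(response.status)
```

The request is built separately from the opener so that you can check the URL and form body before anything is sent to the crawler.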


Søren Vejrup Carlsen
IT consultant

IT-Udvikling.København
ITUI

+4591324841
svc at kb.dk



Det Kgl. Bibliotek
Royal Danish Library

Søren Kierkegaards Plads 1
DK-1221 København K
+45 3347 4747

CVR 2898 8842
EAN 5798 000 795297

From: NetarchiveSuite-users [mailto:netarchivesuite-users-bounces at ml.sbforge.org] On Behalf Of Koit Summatavet
Sent: Monday, April 30, 2018 11:50 AM
To: netarchivesuite-users at ml.sbforge.org
Subject: [Netarchivesuite-users] Issue - harvest running infinitely

Hi,

I have started using NAS to harvest Estonian websites and I have encountered a problem:

When a harvest hits neither the document limit nor the size limit, it runs indefinitely, and all the threads sit in the TIMED_WAITING state, where they wait from hours to days. The longer the harvest runs, the longer the waits become, and URLs are processed very slowly, only after a long delay.

How can I stop this from happening, and what changes should I make in the harvest template?

I am using NAS version 5.3.1. Does the same happen in version 5.4?

With regards,
Koit