[Netarchivesuite-users] Issue - harvest running infinitely
csr at kb.dk
Tue May 1 10:29:37 CEST 2018
You can also use the H3 Remote Access section in NetarchiveSuite to
monitor and terminate the harvest. One possible way to diagnose
what is happening:
i) pause Heritrix
ii) list the Frontier to see which URLs Heritrix is working on
iii) delete any problematic URLs from the Frontier
iv) unpause Heritrix and let it finish normally.
Sometimes you may need to terminate Heritrix instead.
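
If you would rather script the pause, unpause and terminate steps than
click through the GUI, something like the following may work. It is only
a minimal sketch that posts an action to the job resource of the
Heritrix 3 REST API. The base URL, port and job name are assumptions you
will need to replace with the values of your own H3 instance, and the
helper names build_job_action and send are made up for illustration.
Listing and editing the Frontier (steps ii and iii) is not covered here.

```python
import ssl
import urllib.parse
import urllib.request

# Job actions from the Heritrix 3 REST API that matter for this workflow.
JOB_ACTIONS = {"pause", "unpause", "terminate"}


def build_job_action(base_url, job_name, action):
    """Return a POST request asking Heritrix 3 to apply `action` to a job."""
    if action not in JOB_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    job = urllib.parse.quote(job_name, safe="")
    url = f"{base_url.rstrip('/')}/engine/job/{job}"
    body = urllib.parse.urlencode({"action": action}).encode("ascii")
    return urllib.request.Request(
        url, data=body, method="POST", headers={"Accept": "application/xml"}
    )


def send(request, user, password):
    """Send the request using HTTP digest auth and return the status code."""
    passwords = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passwords.add_password(None, request.full_url, user, password)
    # Assumption: the instance uses a self-signed certificate, so
    # certificate verification is switched off for this connection.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    opener = urllib.request.build_opener(
        urllib.request.HTTPSHandler(context=context),
        urllib.request.HTTPDigestAuthHandler(passwords),
    )
    with opener.open(request, timeout=30) as response:
        return response.status


# Example (hypothetical host and job name, default credentials):
#   req = build_job_action("https://localhost:8443", "myjob", "pause")
#   send(req, "admin", "adminPassword")
```

Building the request is kept separate from sending it so that you can
inspect the URL and form body before anything reaches a running crawl.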
On 04/30/2018 01:14 PM, Søren Vejrup Carlsen wrote:
> Hi Koit.
> Is it only a specific website that causes the problem, or is this a general problem?
> Anyway, you can always log on to the Heritrix 3 instance and terminate the job manually
> with the following credentials (User: admin, Password: adminPassword),
> unless you have changed these values.
> Søren Vejrup Carlsen
> IT consultant
> svc at kb.dk
> Det Kgl. Bibliotek
> Royal Danish Library
> Søren Kierkegaards Plads 1
> DK-1221 København K
> +45 3347 4747
> CVR 2898 8842
> EAN 5798 000 795297
> From: NetarchiveSuite-users [mailto:netarchivesuite-users-bounces at ml.sbforge.org] On Behalf Of Koit Summatavet
> Sent: Monday, April 30, 2018 11:50 AM
> To: netarchivesuite-users at ml.sbforge.org
> Subject: [Netarchivesuite-users] Issue - harvest running infinitely
> I have started using NAS to harvest Estonian websites and I have encountered a problem:
> When the harvest hits neither the document limit nor the size limit, it runs indefinitely and all the threads are in the TIMED_WAITING state, where they wait from hours to days. The longer the harvest runs, the longer the waits become, and URLs are processed very slowly and only after a long delay.
> How can I stop this from happening, and what changes should I make in the harvest template?
> I am using NAS version 5.3.1. Does the same happen in version 5.4?
> With regards,
Colin Rosenthal PhD
Senior IT Consultant
Royal Danish Library (Aarhus)