[Netarchivesuite-users] CRAWL ENDING - Finished - Ended by operator
Kaare Fiedler Christiansen
kfc at statsbiblioteket.dk
Fri Jun 6 09:17:56 CEST 2008
On Fri, 2008-06-06 at 08:57 +0200, Kaare Fiedler Christiansen wrote:
> On Fri, 2008-06-06 at 08:34 +0200, aponb at gmx.at wrote:
> > A configuration that is started every four hours sometimes produces the
> > following message in the crawler log:
> > "CRAWL ENDING - Finished - Ended by operator"
> > instead of only
> > "CRAWL ENDING - Finished"
> >
> > and in fact in these jobs, there are some pages missing, which should
> > have been crawled.
> >
> > Do you know what's the reason for that behavior?
>
> "Ended by operator" is logged when the system itself requests that the
> crawl be stopped.
>
> This is done when a harvester has been inactive for a long period,
> although there are still URLs in the queue. The amount of time before
> the harvesters are stopped is defined by the two settings:
>
> settings.harvester.harvesting.heritrix.inactivityTimeout
> settings.harvester.harvesting.heritrix.noresponseTimeout
I should make it absolutely clear that this is of course the amount of
time *with no activity* before the harvest is stopped. We don't stop a
harvester that is still active.
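
To illustrate the rule, here is a rough sketch in Python of how such a check could look. It is not the actual NetarchiveSuite code. The names TimeoutSettings and should_stop_crawl are made up for this example, the 1800-second values are only illustrative, and the split into "no crawler activity" versus "no response from web servers" is an assumption based on the names of the two settings.

```python
from dataclasses import dataclass


@dataclass
class TimeoutSettings:
    # Hypothetical stand-ins for the two settings named above.
    # Seconds are assumed as the unit.
    inactivity_timeout: float
    noresponse_timeout: float


def should_stop_crawl(now, last_activity, last_response, settings):
    """Return True only if the harvester has been idle longer than a timeout."""
    idle_for = now - last_activity    # time since the crawler last did any work
    silent_for = now - last_response  # time since data last arrived from a server
    return (idle_for > settings.inactivity_timeout
            or silent_for > settings.noresponse_timeout)


settings = TimeoutSettings(inactivity_timeout=1800, noresponse_timeout=1800)

# Active harvester: last activity was 10 s ago, so it is left alone.
print(should_stop_crawl(now=100_000, last_activity=99_990,
                        last_response=99_995, settings=settings))

# Idle harvester: nothing for 2000 s, so the crawl is ended,
# even if there are still URLs in the queue.
print(should_stop_crawl(now=100_000, last_activity=98_000,
                        last_response=98_000, settings=settings))
```

The point of the sketch is that only the time since the last sign of life is compared against the timeouts. The total running time of the harvest plays no part in the decision.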
Best,
Kåre