[Netarchivesuite-users] Oldjobs directory growing too big

Søren Vejrup Carlsen svc at kb.dk
Wed Apr 29 13:19:47 CEST 2009

Hi Nicolas.

Have you by any chance reset (emptied) the JMS queues used by your NAS installation?

If a Heritrix harvest fails, our code should send a message back to the scheduler reporting that the job has failed.

But if you have emptied the JMS queues used by your NAS installation, this message may have been lost.




From: netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk [mailto:netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk] On behalf of nicolas.giraud at bnf.fr
Sent: 29 April 2009 11:56
To: netarchivesuite-users at lists.gforge.statsbiblioteket.dk
Subject: Re: [Netarchivesuite-users] Oldjobs directory growing too big


Hi Bjarne,

Thanks for the info. By "The recovering of jobs not reported as FINISHED is still an all manual process here" do you mean that there is a manual procedure to restart a job that is stored in the oldjobs directory?

I have a job that appears as "started" in the Harvest status section, but I had 2 errors during the crawl: loss of the JMX connection to Heritrix (I don't yet understand what causes this), so the job got moved to oldjobs. Then the disk filled up. I moved the oldjob directories to NFS mounts to solve the disk space problem. But now, after restarting NAS, the job still shows as "started" but it does not restart, and no Heritrix instance is launched. I'm a bit lost there.


