[Netarchivesuite-users] Jobs resubmitted when restarting GUIApplication: bug?

Kaare Fiedler Christiansen kfc at statsbiblioteket.dk
Mon May 25 13:29:48 CEST 2009


On Mon, 2009-05-25 at 12:44 +0200, nicolas.giraud at bnf.fr wrote:
> 
> Hi,
> 
> I have observed twice the following behavior:
> 
> 1) I have a number of jobs in the "submitted" status, and two jobs in
> the "started" status (for instance, job IDs 1 and 2 are "started", IDs
> 3 to 10 are "submitted")
> 2) I stop and then start again the GUIApplication on the admin machine
> 
> The following results are then observed:
> 
> 1) All the jobs that were in the "started" and "submitted" statuses are
> now "resubmitted", and the resubmission jobs are in the "submitted"
> status (e.g. jobs 11-21 have been created, job 1 being resubmitted as
> job 11, and so on)
> 2) The initial jobs (1 to 10) are running on the crawlers, but the
> harvest status page now shows no "started" job
> 
> This happens consistently when restarting the GUIApplication,
> needlessly multiplying the jobs. This behavior seems incorrect to
> me, can you please confirm that this is a bug?

If it isn't described in the documentation it is certainly a bug in
there :-)

Currently we only support restarting the GUI application if the whole
system is restarted, and the JMS broker queues are flushed. The
behaviour you see is what happens if you do NOT flush the JMS queues.

The reason for this is that we currently have no established way of
getting the scheduler to find out what the current state is, if it is
restarted. It doesn't know if jobs are still in the JMS queue or if they
were lost. It doesn't know if jobs on the harvesters are still
running, or if they were lost.

Thus we made it the rule that you have to flush the JMS queues when you
restart the GUIApplication. This should probably be emphasized in the
documentation. After restart, the GUI will resubmit all jobs that were
in status SUBMITTED - but it shouldn't resubmit the ones that are in
status STARTED, since the harvesters send back information on those
whether the jobs end successfully or not.
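
To illustrate the rule above, here is a minimal sketch in Java. It is
not taken from the NetarchiveSuite source: the class RestartSketch, the
enum JobStatus and the method resubmitAfterRestart are made-up names,
and the job table is a plain in-memory map. It also assumes the JMS
queues were flushed before the restart, so no old SUBMITTED job can
still reach a harvester.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from the NetarchiveSuite code base.
class RestartSketch {
    enum JobStatus { SUBMITTED, STARTED, RESUBMITTED }

    /**
     * Applies the restart rule to an in-memory job table:
     * every SUBMITTED job is marked RESUBMITTED and a fresh SUBMITTED
     * copy is created under a new ID; STARTED jobs are left alone,
     * because the harvesters will report back on them.
     * Returns the IDs of the new copies in creation order.
     */
    static List<Long> resubmitAfterRestart(Map<Long, JobStatus> jobs) {
        // New IDs continue after the highest existing ID (an assumption).
        long nextId = 1;
        for (long id : jobs.keySet()) {
            nextId = Math.max(nextId, id + 1);
        }

        // Collect first, so the map is not modified while iterating over it.
        List<Long> toResubmit = new ArrayList<>();
        for (Map.Entry<Long, JobStatus> entry : jobs.entrySet()) {
            if (entry.getValue() == JobStatus.SUBMITTED) {
                toResubmit.add(entry.getKey());
            }
        }

        List<Long> created = new ArrayList<>();
        for (long oldId : toResubmit) {
            jobs.put(oldId, JobStatus.RESUBMITTED);
            jobs.put(nextId, JobStatus.SUBMITTED);
            created.add(nextId);
            nextId++;
        }
        return created;
    }

    public static void main(String[] args) {
        // The situation from the report: jobs 1-2 started, jobs 3-10 submitted.
        Map<Long, JobStatus> jobs = new LinkedHashMap<>();
        jobs.put(1L, JobStatus.STARTED);
        jobs.put(2L, JobStatus.STARTED);
        for (long id = 3; id <= 10; id++) {
            jobs.put(id, JobStatus.SUBMITTED);
        }
        System.out.println("New job IDs: " + resubmitAfterRestart(jobs));
        System.out.println("Job table:   " + jobs);
    }
}
```

With the job table from the report, this leaves jobs 1 and 2 in STARTED,
marks 3 to 10 as RESUBMITTED and creates eight new SUBMITTED jobs. If
the queues are NOT flushed, the old messages for 3 to 10 are still
delivered as well, which is how the jobs end up duplicated.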

I think the logic around this could probably be improved, but it is
functional if you clear the broker-queues on restart.

Best,
  Kåre
-- 
Kaare Fiedler Christiansen <kfc at statsbiblioteket.dk>



