[Netarchivesuite-users] Preparing for and handling planned network outage during broad crawl
sara.aubry at bnf.fr
Thu Aug 29 10:42:18 CEST 2024
Hello Peter,
A planned network outage is a lot easier to handle than an unexpected one!
At BnF, we pause all running jobs until we have confirmation that
everything is back to normal. We have already done this during broad crawls,
so for more than 60 crawlers.
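One way to get that confirmation is to poll the services the harvesters depend on before unpausing anything. The following is only a minimal sketch: the function names and the endpoint list are hypothetical, and a successful TCP connection shows that a port is open, not that the application behind it is healthy.

```python
import socket
import time


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_until_all_reachable(endpoints, interval=30.0, max_checks=None):
    """Poll until every (host, port) pair accepts a TCP connection.

    Returns True as soon as all endpoints answer in the same round.
    Returns False after max_checks unsuccessful rounds; with
    max_checks=None the function keeps polling indefinitely.
    """
    checks = 0
    while True:
        if all(is_reachable(host, port) for host, port in endpoints):
            return True
        checks += 1
        if max_checks is not None and checks >= max_checks:
            return False
        time.sleep(interval)
```

The endpoint list would be site-specific, for example the JMS broker host and the machines running the harvesters; none of these are given in this thread.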
The challenge is to make sure that no new jobs start, so we
temporarily deactivate the selective harvests.
If we have jobs at a post-processing stage that we cannot pause, we stop
the HarvestController by putting a shutdown.txt file in the job
directory.
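Placing the marker file can be scripted when many job directories are involved. This is a sketch under assumptions: the message does not say where the job directory is or whether the file needs any content, so the helper below (a hypothetical name) simply creates an empty shutdown.txt in a directory you pass in.

```python
from pathlib import Path


def request_shutdown(job_dir, marker_name="shutdown.txt"):
    """Create an empty marker file in job_dir and return its path.

    Assumption: an empty file is enough; the thread does not say
    whether the file's content matters.
    """
    directory = Path(job_dir)
    if not directory.is_dir():
        raise NotADirectoryError(f"Not a job directory: {directory}")
    marker = directory / marker_name
    marker.touch(exist_ok=True)  # no error if the marker already exists
    return marker
```

Checking that the directory exists first avoids silently creating the marker somewhere the HarvestController never looks.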
Best,
Sara
From: "Tue Hejlskov Larsen" <tlr at kb.dk>
To: "netarchivesuite-users at ml.sbforge.org"
<netarchivesuite-users at ml.sbforge.org>
Date: 29/08/2024 08:42
Subject: Re: [Netarchivesuite-users] Preparing for and handling planned
network outage during broad crawl
Sent by: "NetarchiveSuite-users"
<netarchivesuite-users-bounces at ml.sbforge.org>
Hello Peter,
It depends on the maximum message queue settings in your Open JMS broker
properties and on how long the network is down. Our installation breaks
down if it loses the network connection for more than 10-30 minutes,
depending on what is running.
If the message queues hit the maximum, there is normally no other way than
to restart the whole platform, and all running harvester jobs need to be
restarted.
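Before the outage it may help to see which limits the broker is actually configured with. The sketch below lists every setting whose key contains "max" from a Java-style properties file. The file location and the property names are deliberately not assumed here, and the parser is simplified (it handles only key=value lines, with no line continuations or ":" separators).

```python
def read_properties(path):
    """Parse simple key=value lines from a Java-style properties file."""
    settings = {}
    with open(path, encoding="latin-1") as handle:
        for raw in handle:
            line = raw.strip()
            # Skip blank lines and the two comment styles used in .properties
            if not line or line.startswith(("#", "!")):
                continue
            key, sep, value = line.partition("=")
            if sep:
                settings[key.strip()] = value.strip()
    return settings


def limit_settings(settings, needle="max"):
    """Return only the settings whose key contains needle (case-insensitive)."""
    return {k: v for k, v in settings.items() if needle in k.lower()}
```

Running limit_settings(read_properties(path)) on your broker's properties file gives a quick overview of the configured maxima to compare against the expected length of the outage.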
You can try restarting the GUI; it will try to empty the JMS queues. And
sometimes, once the network is OK again and if you are patient and wait
2-3 hours, the broker will try to resolve the message queues by itself. It
succeeded in doing so last time.
Best regards
Tue
From: NetarchiveSuite-users <netarchivesuite-users-bounces at ml.sbforge.org>
On Behalf Of Peter Svanberg
Sent: Wednesday, August 28, 2024 6:01 PM
To: netarchivesuite-users at ml.sbforge.org
Subject: [Netarchivesuite-users] Preparing for and handling planned
network outage during broad crawl
Hello!
There will be network service work at our site next weekend (7-8 Sept.). No
figures on outage but they will change hardware so probably many minutes,
maybe hours. Our current broad crawl pass is perhaps not finished then.
How do you minimize the consequences in NAS?
Pause all running jobs before and unpause them after? That would minimize
the effects of the ongoing crawls.
But what will happen with the other processes and connections? Will the
processes have to be restarted? And also the unpaused jobs, when they are
ready, to make them reconnect? (I'm guessing wildly/groping blindly …)
Anyone have experience?
Peter Svanberg
Technical officer
Acquisitions and Metadata Department
Film, Games, Sheet Music and Web Unit
National Library of Sweden
PO Box 5039, SE-102 41 Stockholm
Visits: Karlavägen 96, Stockholm
+46 10-709 32 78
Peter.Svanberg at kb.se
www.kb.se