[Netarchivesuite-users] Problem with a spider

sara.aubry at bnf.fr sara.aubry at bnf.fr
Mon Sep 20 10:22:11 CEST 2021


Hi Miguel,

The open-file limit configured on your system is probably too low.
See https://linuxcommand.org/lc3_man_pages/ulimith.html

At BnF, we set it to unlimited for the harvest controllers (and have never 
had any issue).
But you could at least double the value you currently have set.

Sara 
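
A minimal sketch of the "at least double it" suggestion above, written in Python purely for illustration. It reads the open-file limit (RLIMIT_NOFILE) of the current process and doubles the soft limit without exceeding the hard limit. The function names are hypothetical and not part of NetarchiveSuite or Heritrix. Note the assumption: a change made this way only affects the calling process and its children, so for a harvester the limit would normally be raised in the shell or service that launches it (for example with `ulimit -n`).

```python
import resource


def doubled_soft_limit(soft, hard):
    """Return twice the soft limit, capped at the hard limit.

    Hypothetical helper: an unlimited soft limit is returned unchanged.
    """
    if soft == resource.RLIM_INFINITY:
        return soft  # already unlimited, nothing to raise
    target = soft * 2
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed the hard limit
    return target


def raise_open_file_limit():
    """Double this process's soft RLIMIT_NOFILE, up to the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    new_soft = doubled_soft_limit(soft, hard)
    if new_soft != soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    return soft, new_soft, hard


if __name__ == "__main__":
    old, new, hard = raise_open_file_limit()
    print(f"soft limit: {old} -> {new} (hard limit: {hard})")
```

If the hard limit itself is too low, it has to be raised by an administrator before the soft limit can go any higher.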





From:    "Soleto Ruiz de Clavijo, Miguel" <miguel.soleto at externos.bne.es>
To:      "'netarchivesuite-users at ml.sbforge.org'" <netarchivesuite-users at ml.sbforge.org>
Cc:      "García Arratia, Juan Carlos" <juancarlos.garcia at bne.es>, "Monzón, Fernando" <f.monzon at bne.es>
Date:    20/09/2021 09:42
Subject: [Netarchivesuite-users] Problem with a spider
Sent by: "NetarchiveSuite-users" <netarchivesuite-users-bounces at ml.sbforge.org>



Hi!
I’m Miguel Soleto, from the National Library of Spain. We had a problem 
this weekend with a spider. Although the spider was working, we couldn’t 
see anything in the interface. Here is what I found in the log 
(heritrix3_err.log):
 
2021-09-20 07:02:37.245 INFO thread-21 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() wrote report: /netarchive/BNE/harvester_high/71215_1632093369236/heritrix3/./jobs/71215_1632093369236/20210919231620/reports/processors-report.txt
2021-09-20 07:02:37.245 GRAVE thread-21 org.archive.crawler.reporting.StatisticsTracker.writeReportFile() Unable to write /netarchive/BNE/harvester_high/71215_1632093369236/heritrix3/./jobs/71215_1632093369236/20210919231620/reports/frontier-summary-report.txt at the end of crawl.
java.io.FileNotFoundException: /netarchive/BNE/harvester_high/71215_1632093369236/heritrix3/./jobs/71215_1632093369236/20210919231620/reports/frontier-summary-report.txt (Demasiados ficheros abiertos)
               at java.io.FileOutputStream.open0(Native Method)
               at java.io.FileOutputStream.open(FileOutputStream.java:270)
               at java.io.FileOutputStream.<init>(FileOutputStream.java:213)
               at java.io.FileOutputStream.<init>(FileOutputStream.java:162)
               at java.io.FileWriter.<init>(FileWriter.java:90)
               at org.archive.crawler.reporting.StatisticsTracker.writeReportFile(StatisticsTracker.java:897)
               at org.archive.crawler.reporting.StatisticsTracker.dumpReports(StatisticsTracker.java:926)
               at org.archive.crawler.reporting.StatisticsTracker.stop(StatisticsTracker.java:342)
               at org.springframework.context.support.DefaultLifecycleProcessor.doStop(DefaultLifecycleProcessor.java:236)
               at org.springframework.context.support.DefaultLifecycleProcessor.doStop(DefaultLifecycleProcessor.java:213)
               at org.springframework.context.support.DefaultLifecycleProcessor.doStop(DefaultLifecycleProcessor.java:213)
               at org.springframework.context.support.DefaultLifecycleProcessor.doStop(DefaultLifecycleProcessor.java:213)
 
The key message is “Demasiados ficheros abiertos”, which means “Too many 
open files”. Has anyone had a problem like this? Is there a way to avoid it?
 
Thank you all!
 
Best Regards.
_______________________________________________
NetarchiveSuite-users mailing list
NetarchiveSuite-users at ml.sbforge.org
https://ml.sbforge.org/mailman/listinfo/netarchivesuite-users

