<span style=" font-size:10pt;font-family:sans-serif">Hello Miguel,</span><br><br><span style=" font-size:10pt;font-family:sans-serif">It looks like
you tried to add every single URL that returned a 404 as a separate crawler trap.
That is far too many entries. Use regular expressions instead: a single pattern can cover a whole group of URLs.</span><br><br><span style=" font-size:10pt;font-family:sans-serif">Best regards,</span><br><span style=" font-size:10pt;font-family:sans-serif"> Bert</span><br><span style=" font-size:10pt;font-family:sans-serif">-- <br>Production engineer for Internet archiving<br>Information Systems Department<br>Bibliothèque nationale de France<br>Quai François-Mauriac<br>75706 Paris Cedex 13<br>Tel.: 01 53 79 45 58</span><br><br><span style=" font-size:9pt;color:#5f5f5f;font-family:sans-serif">From
: </span><span style=" font-size:9pt;font-family:sans-serif">"Soleto
Ruiz de Clavijo, Miguel" &lt;miguel.soleto@externos.bne.es&gt;</span><br><span style=" font-size:9pt;color:#5f5f5f;font-family:sans-serif">To
: </span><span style=" font-size:9pt;font-family:sans-serif">"netarchivesuite-users@ml.sbforge.org"
&lt;netarchivesuite-users@ml.sbforge.org&gt;</span><br><span style=" font-size:9pt;color:#5f5f5f;font-family:sans-serif">Date
: </span><span style=" font-size:9pt;font-family:sans-serif">12/08/2025
09:18</span><br><span style=" font-size:9pt;color:#5f5f5f;font-family:sans-serif">Subject
: </span><span style=" font-size:9pt;font-family:sans-serif">[Netarchivesuite-users]
About the crawler traps</span><br><span style=" font-size:9pt;color:#5f5f5f;font-family:sans-serif">Sent
by : </span><span style=" font-size:9pt;font-family:sans-serif">"NetarchiveSuite-users"
&lt;netarchivesuite-users-bounces@ml.sbforge.org&gt;</span><br><hr noshade><br><br><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri">Dear
all,</span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri"> </span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri">I
have a question about crawler traps. We have identified thousands of URLs that return 404 codes
in our crawls and want to add them as traps in the harvest. However, there
are over 23,000 of them, and when I try to save them, I get a 502 error.</span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri">Is
there any way to add all these traps?</span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri"> </span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri">Thank
you very much in advance for your help.</span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri"> </span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri">Best
regards,</span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri"> </span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri">Miguel.</span></p><p style="margin-top:0px;margin-Bottom:0px"><span style=" font-size:12pt;font-family:Calibri"> </span></p><br><hr><span style=" font-size:8pt">The information in this e-mail and any
attachments is confidential and is intended for the addressee only.
If you have received this e-mail in error, you are notified that any review,
alteration, printing, copying, disclosure, distribution or use of its contents
is unauthorized. Carrying out any of the above actions is expressly prohibited
by the rules governing these matters. If you are not the
intended recipient, please notify the sender by replying to this e-mail, and
delete the message and any attachments. The National Library of Spain reserves
the right to take appropriate legal action if
the above is infringed.</span><span style=" font-size:12pt">[attachment "attrxbop.txt" removed by Bert WENDLAND/ETS/BnF]
</span>