<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 14 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:Batang;
panose-1:2 3 6 0 0 1 1 1 1 1;}
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
{font-family:Tahoma;
panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
{font-family:"Calibri Light";
panose-1:2 15 3 2 2 2 4 3 2 4;}
@font-face
{font-family:"\@Batang";
panose-1:2 3 6 0 0 1 1 1 1 1;}
@font-face
{font-family:Aharoni;
panose-1:2 1 8 3 2 1 4 3 2 3;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p
{mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman","serif";}
tt
{mso-style-priority:99;
font-family:"Courier New";}
span.EstiloCorreo21
{mso-style-type:personal-reply;
font-family:"Calibri","sans-serif";
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-family:"Calibri","sans-serif";
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:70.85pt 3.0cm 70.85pt 3.0cm;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="ES" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Thank you, Sara! We will check.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D">Best,<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri Light","sans-serif";color:gray">Alicia Pastrana García<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri Light","sans-serif";color:gray">Área de Gestión del Depósito de las Publicaciones en Línea<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri Light","sans-serif";color:gray">División de Procesos y Servicios Digitales<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri Light","sans-serif";color:gray">Tfno.: 91 516 89 92<o:p></o:p></span></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri Light","sans-serif";color:gray">Biblioteca Nacional de España<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> sara.aubry@bnf.fr [mailto:sara.aubry@bnf.fr]
<br>
<b>Sent:</b> Wednesday, 3 July 2019 10:32<br>
<b>To:</b> Pastrana García, Alicia<br>
<b>CC:</b> bert.wendland@bnf.fr; clara.wiatrowski@bnf.fr; netarchivesuite-curator@ml.sbforge.org<br>
<b>Subject:</b> Re: [Netarchivesuite-curator] BnF NAS update for July<o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal"><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">Hello Alicia,</span><br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">You should check which JobGenerator is configured in your NAS settings.</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">At BnF, we use the FixedDomainConfigurationCountJobGenerator, which lets us set one fixed number of domains per job for snapshot harvests and a different fixed number for focused crawls.</span><br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">Best,</span><br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">Sara</span><br>
<br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&lt;scheduler&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&lt;limitSubmittedJobsInQueue&gt;true&lt;/limitSubmittedJobsInQueue&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&lt;submittedJobsInQueueLimit&gt;1&lt;/submittedJobsInQueueLimit&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&lt;jobtimeouttime&gt;31536000&lt;/jobtimeouttime&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&lt;jobgenerationperiod&gt;60&lt;/jobgenerationperiod&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&lt;jobGen&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&lt;class&gt;<b>dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator</b>&lt;/class&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&lt;objectLimitIsSetByQuotaEnforcer&gt;false&lt;/objectLimitIsSetByQuotaEnforcer&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&lt;domainConfigSubsetSize&gt;5000&lt;/domainConfigSubsetSize&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&lt;config&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<b>&lt;fixedDomainCountSnapshot&gt;5000&lt;/fixedDomainCountSnapshot&gt;</b></span><br>
<b><span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;fixedDomainCountFocused&gt;500&lt;/fixedDomainCountFocused&gt;</span></b><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;excludeDomainsWithZeroBudget&gt;true&lt;/excludeDomainsWithZeroBudget&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;postponeUnregisteredChannel&gt;false&lt;/postponeUnregisteredChannel&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&nbsp;&nbsp;&lt;/config&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&nbsp;&nbsp;&lt;/jobGen&gt;</span><br>
<span style="font-size:10.0pt;font-family:"Arial","sans-serif"">&lt;/scheduler&gt;</span><br>
<br>
<br>
<br>
<br>
<br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">From: </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">"Pastrana García, Alicia" &lt;<a href="mailto:alicia.pastrana@bne.es">alicia.pastrana@bne.es</a>&gt;</span><br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">To: </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">"<a href="mailto:geraldine.camile@bnf.fr">geraldine.camile@bnf.fr</a>" &lt;<a href="mailto:geraldine.camile@bnf.fr">geraldine.camile@bnf.fr</a>&gt;,
"<a href="mailto:netarchivesuite-curator@ml.sbforge.org">netarchivesuite-curator@ml.sbforge.org</a>" &lt;<a href="mailto:netarchivesuite-curator@ml.sbforge.org">netarchivesuite-curator@ml.sbforge.org</a>&gt;</span><br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">Cc: </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">"<a href="mailto:bert.wendland@bnf.fr">bert.wendland@bnf.fr</a>" &lt;<a href="mailto:bert.wendland@bnf.fr">bert.wendland@bnf.fr</a>&gt;,
"<a href="mailto:clara.wiatrowski@bnf.fr">clara.wiatrowski@bnf.fr</a>" &lt;<a href="mailto:clara.wiatrowski@bnf.fr">clara.wiatrowski@bnf.fr</a>&gt;, "<a href="mailto:DDL_DLN@bnf.fr">DDL_DLN@bnf.fr</a>" &lt;<a href="mailto:DDL_DLN@bnf.fr">DDL_DLN@bnf.fr</a>&gt;, "<a href="mailto:leslie.bellony-ext@bnf.fr">leslie.bellony-ext@bnf.fr</a>"
&lt;<a href="mailto:leslie.bellony-ext@bnf.fr">leslie.bellony-ext@bnf.fr</a>&gt;</span><br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">Date: </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">02/07/2019 13:33</span><br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">Subject: </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">Re: [Netarchivesuite-curator] BnF NAS update for July</span><br>
<span style="font-size:7.5pt;font-family:"Arial","sans-serif";color:#5F5F5F">Sent by: </span><span style="font-size:7.5pt;font-family:"Arial","sans-serif"">"Netarchivesuite-curator" &lt;<a href="mailto:netarchivesuite-curator-bounces@ml.sbforge.org">netarchivesuite-curator-bounces@ml.sbforge.org</a>&gt;</span><o:p></o:p></p>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="2" width="100%" noshade="" style="color:#A0A0A0" align="center">
</div>
<p class="MsoNormal"><br>
<br>
<br>
<span style="font-family:"Calibri","sans-serif";color:#004080">Hello all,</span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080">Here is our update:</span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080">We are working on our three event crawls, which are still running: the European Parliament elections, the local elections and the Spanish Government elections. We have had great collaboration from the different
regions on the local elections, and we have nominated over 3,700 sites.</span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080">We still don’t know when we are going to launch our broad crawl, but it will probably be in September.</span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080">This is the problem I told you about. I hope the description is clear:</span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080">There is a huge harvest in one of our collections and we can’t crawl all the seeds in it. The problem is how the harvest is divided into jobs. For example, with version 5.3 we loaded
15,000 URLs into a harvest and it generated 29 jobs of 620 URLs each and one job with the rest. Since we updated to 5.4, it generates jobs of 2096 URLs, which causes a local disk problem on the spiders because their disks are small. We use the same template as in 5.3, but we don’t
know why the division is different. Do you know what could cause this? Is there a parameter in NAS (or in the templates) that we can modify to reduce the number of URLs in each job?</span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080">If you have any questions about this, please do not hesitate to ask me.</span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080">Thank you!</span><br>
<span style="font-family:"Calibri","sans-serif";color:#004080"> </span><br>
<span style="font-family:"Arial","sans-serif";color:gray">Alicia Pastrana García</span><br>
<span style="font-family:"Arial","sans-serif";color:gray">Área de Gestión del Depósito de las Publicaciones en Línea</span><br>
<span style="font-family:"Arial","sans-serif";color:gray">División de Procesos y Servicios Digitales</span><br>
<span style="font-family:"Arial","sans-serif";color:gray">Tfno.: 91 516 89 92</span><br>
<span style="font-family:"Arial","sans-serif";color:gray">Biblioteca Nacional de España</span><br>
<br>
<b><span style="font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-family:"Tahoma","sans-serif""> Netarchivesuite-curator [</span><a href="mailto:netarchivesuite-curator-bounces@ml.sbforge.org"><span style="font-family:"Tahoma","sans-serif"">mailto:netarchivesuite-curator-bounces@ml.sbforge.org</span></a><span style="font-family:"Tahoma","sans-serif"">]
<b>On behalf of </b><a href="mailto:geraldine.camile@bnf.fr">geraldine.camile@bnf.fr</a><b><br>
Sent:</b> Tuesday, 2 July 2019 11:50<b><br>
To:</b> <a href="mailto:netarchivesuite-curator@ml.sbforge.org">netarchivesuite-curator@ml.sbforge.org</a><b><br>
CC:</b> <a href="mailto:bert.wendland@bnf.fr">bert.wendland@bnf.fr</a>; <a href="mailto:leslie.bellony-ext@bnf.fr">
leslie.bellony-ext@bnf.fr</a>; <a href="mailto:DDL_DLN@bnf.fr">DDL_DLN@bnf.fr</a>;
<a href="mailto:clara.wiatrowski@bnf.fr">clara.wiatrowski@bnf.fr</a><b><br>
Subject:</b> [Netarchivesuite-curator] BnF NAS update for July</span><br>
<br>
<span style="font-family:"Arial","sans-serif"">Hello all,</span><br>
<span style="font-family:"Arial","sans-serif""><br>
In March, we launched a selective project crawl for the European elections, which will end in the coming days. Fifteen curators contributed to the nomination of over 1480 sites, among which social networks (mostly Twitter, but also Facebook and YouTube
channels) represent the largest share (around 60%). In the end, 18 weekly, 5 monthly and over 120 daily crawls were run. We contributed 85 sites to the collaborative crawl on the European election results launched by Ricardo Basilio.</span><br>
<span style="font-family:"Arial","sans-serif""><br>
We also contributed 85 sites to the collaborative crawl on artificial intelligence.</span><br>
<span style="font-family:"Arial","sans-serif""><br>
Best regards,<br>
The BnF digital legal deposit team</span><o:p></o:p></p>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="2" width="100%" align="center">
</div>
<p><span style="font-family:"Arial","sans-serif"">Exhibitions </span><a href="https://www.bnf.fr/fr/agenda/manuscrits-de-lextreme"><b><i><span style="font-family:"Arial","sans-serif"">Manuscrits de l’extrême
</span></i></b></a><span style="font-family:"Arial","sans-serif"">– until 7 July 2019 | François-Mitterrand<br>
and </span><a href="https://www.bnf.fr/fr/agenda/le-monde-en-spheres"><b><i><span style="font-family:"Arial","sans-serif"">Le Monde en sphères</span></i></b></a><span style="font-family:"Arial","sans-serif""> – until 21 July 2019 | François-Mitterrand</span><o:p></o:p></p>
<p><b><span style="font-family:"Arial","sans-serif";color:green">Before printing, think of the environment.</span></b><o:p></o:p></p>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="2" width="100%" align="center">
</div>
<p class="MsoNormal"><span style="font-size:7.5pt">The information in this e-mail and any attachments is confidential and it is intended for the addressee only. If you have received this e-mail in error, you are notified that any revision, amendment, print,
copy, disclosure, distribution or use of the contents is unauthorized. Carrying out any of the above actions, is expressly banned by rules governing this matter. Hence we request that if you are not the intended recipient, please notify the sender answering
this e-mail, and delete the message and any attachments. The National Library of Spain reserves itself the right to take the appropriate legal actions in the event of the above mentioned matter is being infringed.</span>
<o:p></o:p></p>
<div class="MsoNormal" align="center" style="text-align:center">
<hr size="2" width="100%" align="center">
</div>
<p class="MsoNormal"><tt><span style="font-size:10.0pt">_______________________________________________</span></tt><span style="font-size:10.0pt;font-family:"Courier New""><br>
<tt>Netarchivesuite-curator mailing list</tt><br>
<tt><a href="mailto:Netarchivesuite-curator@ml.sbforge.org">Netarchivesuite-curator@ml.sbforge.org</a></tt><br>
</span><a href="https://ml.sbforge.org/mailman/listinfo/netarchivesuite-curator"><tt><span style="font-size:10.0pt">https://ml.sbforge.org/mailman/listinfo/netarchivesuite-curator</span></tt></a><o:p></o:p></p>
</div>
</body>
</html>