<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<!--[if !mso]><style>v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style><![endif]--><style><!--
/* Font Definitions */
@font-face
{font-family:"Cambria Math";
panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
{font-family:Calibri;
panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
{margin:0cm;
margin-bottom:.0001pt;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
a:link, span.MsoHyperlink
{mso-style-priority:99;
color:blue;
text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
{mso-style-priority:99;
color:purple;
text-decoration:underline;}
p
{mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
p.msonormal0, li.msonormal0, div.msonormal0
{mso-style-name:msonormal;
mso-style-priority:99;
mso-margin-top-alt:auto;
margin-right:0cm;
mso-margin-bottom-alt:auto;
margin-left:0cm;
font-size:12.0pt;
font-family:"Times New Roman",serif;}
span.EmailStyle19
{mso-style-type:personal-reply;
font-family:"Calibri",sans-serif;
color:#1F497D;}
.MsoChpDefault
{mso-style-type:export-only;
font-size:10.0pt;
font-family:"Calibri",sans-serif;
mso-fareast-language:EN-US;}
@page WordSection1
{size:612.0pt 792.0pt;
margin:3.0cm 2.0cm 3.0cm 2.0cm;}
div.WordSection1
{page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="DA" link="blue" vlink="purple">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">Hi Alexandre<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">Thanks for the update – really impressive broad crawl.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">We are experiencing a slow broad crawl this time – and we are investigating further.
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">We are still waiting to start using SolrWayback with our archive (Solr index), hopefully this month.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">In the meantime, everything is available as open source here:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><a href="https://github.com/netarchivesuite/solrwayback/releases/tag/4.0.5">https://github.com/netarchivesuite/solrwayback/releases/tag/4.0.5</a>
(prerelease)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><a href="https://github.com/netarchivesuite/solrwayback">https://github.com/netarchivesuite/solrwayback</a><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">It might be interesting for you to see the COVID-19 collection through SolrWayback once you have indexed everything. I’ll
be happy to show you some of the new features.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">Best,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US">Anders<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D;mso-fareast-language:EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif">From:</span></b><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri",sans-serif"> Netarchivesuite-curator &lt;netarchivesuite-curator-bounces@ml.sbforge.org&gt;
<b>On Behalf Of </b>alexandre.chautemps@bnf.fr<br>
<b>Sent:</b> Tuesday, December 8, 2020 6:18 PM<br>
<b>To:</b> netarchivesuite-curator@ml.sbforge.org<br>
<b>Cc:</b> bert.wendland@bnf.fr; DDL_DLN@bnf.fr; leslie.bellony-ext@bnf.fr<br>
<b>Subject:</b> [Netarchivesuite-curator] BnF NAS update for December<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial",sans-serif">Dear all,</span><span lang="EN-US"><br>
<br>
</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial",sans-serif">Our annual broad crawl ended on 7 November. It lasted 32 days, ran 1037 jobs, and crawled 2.455 billion URLs, for a total size of 117.59 TB (compressed).</span><span lang="EN-US"><br>
<br>
</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial",sans-serif">The French newspaper Liberation contacted our team to inform us that their blog platform (</span><a href="https://www.liberation.fr/blogs,26"><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial",sans-serif">https://www.liberation.fr/blogs,26</span></a><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial",sans-serif">)
would be closed in the course of December. The platform hosts more than 300 blogs. We launched an emergency crawl last week to crawl these blogs and preserve them.</span><span lang="EN-US"><br>
<br>
</span><span lang="EN-US" style="font-size:10.0pt;font-family:"Arial",sans-serif">We are working on the full-text indexing (with Solr) of our COVID-19 crawl, performed between February and July 2020 and covering the first wave of the pandemic. The size
of this collection is about 15 TB (compressed). The new collection will go into production in December and will be available to readers through the Archives de l'internet Labs interface.</span><span lang="EN-US"><br>
<br>
</span><span style="font-size:10.0pt;font-family:"Arial",sans-serif">Best regards,</span><br>
<br>
<span style="font-size:10.0pt;font-family:"Arial",sans-serif">The BnF digital legal deposit team</span><span style="font-family:"Arial",sans-serif"><o:p></o:p></span></p>
<div class="MsoNormal" align="center" style="text-align:center"><span style="font-family:"Arial",sans-serif">
<hr size="3" width="100%" align="center">
</span></div>
<p><strong><span style="font-family:"Arial",sans-serif">Partial opening of the research rooms</span></strong><span style="font-family:"Arial",sans-serif"><br>
The general public library (Haut de jardin) and the exhibitions remain closed.<br>
The research rooms are open to readers holding a research pass, by reservation only and exclusively for consulting reserved works (</span><a href="https://www.bnf.fr/fr/reouverture-des-salles-de-lecture-de-la-bibliotheque-de-recherche"><span style="font-family:"Arial",sans-serif">see
details here</span></a><span style="font-family:"Arial",sans-serif">), from 24 November, Tuesday to Friday, 10 a.m. to 5 p.m.<o:p></o:p></span></p>
</div>
</body>
</html>