[Netarchivesuite-curator] BnF NAS update for February
peter.stirling at bnf.fr
Fri Feb 5 11:36:54 CET 2016
Hello all,
Here is an update on part of our regular activity, the ongoing crawls. The
seeds are chosen by librarians in the different departments of the BnF,
and also by partner libraries in Strasbourg and Montpellier. They cover
websites we consider essential in the main fields of knowledge. Each
department or library draws up its collection policy for these crawls as
part of its overall collection policy and within the legal deposit
framework for web archiving.
In 2015, these crawls comprised 14,000 seed URLs, each harvested at a
set frequency (weekly, monthly, twice a year or annually) and depth
(domain, host, path, page+2). In total, we collected 756 million URLs
in 2015, representing 38 TB.
We will maintain these ongoing crawls in 2016, alongside several project
crawls (World War I, social movements, international publications,
solidarity, official publications, Olympic Games).
Best regards,
The BnF digital legal deposit team