[Netarchivesuite-curator] BnF NAS update for October

peter.stirling at bnf.fr peter.stirling at bnf.fr
Wed Oct 15 17:21:20 CEST 2014


Since February 2013 we have been transferring our web archive into SPAR, the 
BnF repository for long-term preservation. It was decided that the 
priority for the ingest would be the current production: as a result all 
the crawls from 2013 have now been ingested, as have those for 2014 so 
far. Some retrospective collections have been copied too: almost half 
of the 2012 crawls, including some of the video crawls. In total, 57 % of our 
crawls performed by NAS are now in SPAR (156 TB) (around 31 % of our total 

Development work is under way to enable SPAR to ingest WARC files (up to now 
all our crawls have been in ARC). However, this will not be in place before 
the beginning of 2015. The ingest of ongoing crawls has had to be 
stopped because our production switched to WARC as of Monday; we will give 
you more details on that in our next update. Only retrospective 
collections will be ingested during this interim period.

At the same time, we have ingested other legal deposit materials which are 
not harvested using Heritrix and NAS. Large-format posters have to be 
deposited at the BnF, but a negotiation between the library and the 
producer allowed them to be deposited in digital form (high-quality PDF) 
rather than on paper, because of their size. These posters are now 
preserved in SPAR too.

Best regards,
The BnF digital legal deposit team
