[Netarchivesuite-curator] BnF NAS update for May

peter.stirling at bnf.fr
Tue May 14 13:26:43 CEST 2013

Hello all,

This month we thought we'd give a quick summary of the different selective 
crawls we are doing this year. 

We distinguish between "ongoing crawls", in which librarians in the 
different departments in the BnF select seeds based on the collection 
policy of their department, and "project crawls", which are 
collaborations between two or more departments, sometimes with external 
partners, based around a particular theme or an event. 

For ongoing crawls there is a choice of four depths, four frequencies and 
three budgets. The use of budgets (small, medium or large) allows us to 
plan and monitor the crawls more efficiently; in terms of harvest 
definitions in NAS, for the twice-yearly and annual crawls we create 
harvest definitions for each budget, while weekly and monthly crawls are 
only given a "small" budget.  Project crawls can have a different range of 
technical settings for specific reasons.

The harvest definitions for "ongoing crawls" are as follows:
- weekly - launched every Monday at noon
- monthly - launched the first of each month
- twice-yearly (small, medium and large budgets) - the first crawl took 
place in February/March, and the second will be launched in August
- annual (small, medium and large budgets) - the yearly crawl has been 
launched this week.
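
As an illustration only, the rule behind this list (one harvest definition 
per budget for twice-yearly and annual crawls, a single "small" budget for 
weekly and monthly crawls) can be sketched in Python. The names used here 
are hypothetical and are not NAS identifiers or settings.

```python
# Hypothetical sketch: these names are illustrative, not NAS identifiers.
FREQUENCIES = ["weekly", "monthly", "twice-yearly", "annual"]
BUDGETS = ["small", "medium", "large"]


def harvest_definitions():
    """Return one (frequency, budget) pair per harvest definition."""
    definitions = []
    for frequency in FREQUENCIES:
        if frequency in ("weekly", "monthly"):
            # Weekly and monthly crawls are only given a "small" budget.
            budgets = ["small"]
        else:
            # Twice-yearly and annual crawls get one definition per budget.
            budgets = BUDGETS
        for budget in budgets:
            definitions.append((frequency, budget))
    return definitions


for frequency, budget in harvest_definitions():
    print(f"{frequency} ({budget} budget)")
```

Under these assumptions the sketch yields eight harvest definitions: one 
each for weekly and monthly, and three each for twice-yearly and annual.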

The list of "project crawls" for 2013 is as follows:
- news sites - around 100 sites crawled every day, at a depth of homepage 
plus 1 click.
- subscription news sites - we are progressively adding titles to our 
crawl to collect subscription editions of news sites (5 at the moment).
- online journals - a crawl of personal and literary blogs, twice a year. 
The first crawl was completed in March and the second will be held in August.
- videos - once a year, currently limited to Dailymotion. We have just 
finished this crawl and will give more details in next month's update.
- solidarity and social movements - two project crawls on social issues in 
France, to be launched in May and June.
- blogs - once a year, to improve the collection of blog platforms that 
are poorly covered in the broad crawl. The crawl will be launched in June.
- auction houses - annual crawl of auction catalogues, to be launched in 
- travel journals - a crawl of online travel journals, also in June.
- official publications - annual crawl of government websites and 
publications; takes place in July.
- US official publications - crawl of US government publications under 
the IDEA agreement to replace exchanges of paper documents with electronic 
versions; also takes place in July.
- Jean-Philippe Rameau - a crawl to mark next year's 250th anniversary of 
the death of this French composer; the crawl is planned for September.

If you would like any more information on our programme of selective 
crawls please let us know.

One last point - please note that Annick has changed her name, from 
LORTHIOS to LE FOLLIC. You can now contact her with this email address: 
annick.lefollic at bnf.fr

Best regards,
The BnF Digital Legal Deposit team