[Netarchivesuite-curator] BnF NAS update for May
peter.stirling at bnf.fr
Tue May 14 13:26:43 CEST 2013
This month we thought we'd give a quick summary of the different selective
crawls we are doing this year.
We distinguish between "ongoing crawls", in which librarians in the
different departments in the BnF select seeds based on the collection
policy of their department, and "project crawls", which are
collaborations between two or more departments, sometimes with external
partners, based around a particular theme or an event.
For ongoing crawls there is a choice of four depths, four frequencies and
three budgets. The use of budgets (small, medium or large) allows us to
plan and monitor the crawls more efficiently; in terms of harvest
definitions in NAS, for the twice-yearly and annual crawls we create
harvest definitions for each budget, while weekly and monthly crawls are
only given a "small" budget. Project crawls can have a different range of
technical settings for specific reasons.
The harvest definitions for "ongoing crawls" are as follows:
- weekly - launched every Monday at noon
- monthly - launched on the first of each month
- twice-yearly (small, medium and large budgets) - the first crawl took
place in February/March, and the second will be launched in August
- annual (small, medium and large budgets) - the yearly crawl has been
launched this week.
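As a rough illustration of how the frequencies and budgets above combine
into harvest definitions, here is a minimal Python sketch. It is not NAS
code: the `BUDGETS_BY_FREQUENCY` mapping, the `harvest_definition_names`
function and the `frequency_budget` naming scheme are all hypothetical,
chosen only to mirror the rule that weekly and monthly crawls get a
"small" budget while twice-yearly and annual crawls get one definition per
budget.

```python
# Hypothetical sketch: enumerate one harvest definition per
# (frequency, budget) pair described in this message.
# None of these names come from NetarchiveSuite itself.

BUDGETS = ["small", "medium", "large"]

# Weekly and monthly crawls only get a "small" budget;
# twice-yearly and annual crawls get one definition per budget.
BUDGETS_BY_FREQUENCY = {
    "weekly": ["small"],
    "monthly": ["small"],
    "twice-yearly": BUDGETS,
    "annual": BUDGETS,
}


def harvest_definition_names(budgets_by_frequency):
    """Return one illustrative name per (frequency, budget) pair."""
    return [
        f"{frequency}_{budget}"
        for frequency, budgets in budgets_by_frequency.items()
        for budget in budgets
    ]


names = harvest_definition_names(BUDGETS_BY_FREQUENCY)
for name in names:
    print(name)
```

Under these assumptions the sketch yields eight definitions: one each for
weekly and monthly, and three each for twice-yearly and annual.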
The list of "project crawls" for 2013 is as follows:
- news sites - around 100 sites crawled every day, at a depth of homepage
plus 1 click.
- subscription news sites - we are progressively adding titles to our
crawl to collect subscription editions of news sites (5 at the moment).
- online journals - personal and literary blogs, crawled twice a year. The
first crawl was completed in March and the second will be held in August.
- videos - once a year, currently limited to Dailymotion. We have just
finished this crawl and will give more details in next month's update.
- solidarity and social movements - two project crawls on social issues in
France, to be launched in May and June.
- blogs - once a year, to improve the collection of blog platforms that
are poorly covered in the broad crawl. The crawl will be launched in June.
- auction houses - annual crawl of auction catalogues, to be launched in
- travel journals - a crawl of online travel journals, also in June.
- official publications - annual crawl of government websites and
publications; takes place in July.
- US official publications - crawl of US government publications under
the IDEA agreement to replace exchanges of paper documents with electronic
versions; also takes place in July.
- Jean-Philippe Rameau - a crawl for next year's 250th anniversary of the
death of this French composer; the crawl is planned for September.
If you would like any more information on our programme of selective
crawls please let us know.
One last point - please note that Annick has changed her name, from
LORTHIOS to LE FOLLIC. You can now contact her with this email address:
annick.lefollic at bnf.fr
The BnF Digital Legal Deposit team