[Netarchivesuite-curator] Absolute last minute Netarchive update for April ; o)

Sabine Schostag sas at statsbiblioteket.dk
Wed May 1 09:51:15 CEST 2013

Hi all.

Since the end of March, our main focus has been an event crawl of one of the biggest lockouts in Danish history: all Danish school teachers had been locked out since 1 April. This event had a rather big impact on Danish society. As school children could not go to school, their parents had to take care of them. Some were able to take their children to their workplace; others took holiday. Our government has just passed a law that ended the lockout.

One of the big issues in the event crawl was harvesting YouTube videos, which is still a rather manual process. Our documentation of the procedure and the tool used is not finished; we will put it on the wiki when it is ready.

Furthermore, we finished our first broad crawl for 2013 a couple of weeks ago. We harvested about 30 TB. Next week we are going to start our next broad crawl, this time with WARC.


DIRECT 8946 2148
STATSBIBLIOTEKET
CVR/SE 1010 0682 – EAN 579800079108

From: netarchivesuite-curator-bounces at ml.sbforge.org [mailto:netarchivesuite-curator-bounces at ml.sbforge.org] On Behalf Of peter.stirling at bnf.fr
Sent: Friday, April 12, 2013 5:23 PM
To: netarchivesuite-curator at ml.sbforge.org
Subject: [Netarchivesuite-curator] BnF NAS update for April

Hello all,

During March we finished our first semestrial crawl of the year. This represents the second-largest part of our focused crawls after the annual crawl, which will take place in May.

We have also been working to improve our crawl of videos on Dailymotion by stopping Heritrix from collecting multiple copies of the same videos. We will let you know the results once the crawl is complete.

Best regards,
The BnF digital legal deposit team

Exhibition: Guy Debord, un art de la guerre<http://www.bnf.fr/fr/evenements_et_culture/anx_expositions/f.debord.html> - 27 March to 13 July 2013 - BnF - François-Mitterrand / Grande Galerie
