[Netarchivesuite-curator] Status update Netarchivet

Sabine Schostag sas at statsbiblioteket.dk
Wed May 30 14:31:58 CEST 2012


Dear all,

Here is a short update from KB/SB (Michaela, you were faster than me :) ):


·         Step 1 of broad crawl number 2/2012 finished on 14 May; step 2 started 8-9 days late because we ran short of disk space. About 38,000 domains are blocking our harvesters, because our aggressive harvesting causes annoyance or even breakdowns on their servers.

·         Prior to the Heritrix 3 workshop organized by archive.org, the curators at KB/SB will prepare a survey of our current use of Heritrix 1 and of our wishes for future functionality.

·         We are still focusing on ways to harvest video, and we hope to learn from BNF’s experiences with DailyMotion and re-use their script(s).

Best,
Sabine

SABINE SCHOSTAG
LIBRARIAN - WEB CURATOR
DIRECT 8946 2148

STATSBIBLIOTEKET

VICTOR ALBECKS VEJ 1
8000 AARHUS C

CVR/SE 1010 0682 – EAN 5798000791084


From: netarchivesuite-curator-bounces at ml.sbforge.org On Behalf Of Mayr Michaela
Sent: Wednesday, May 30, 2012 2:03 PM
To: netarchivesuite-curator at ml.sbforge.org
Subject: [Netarchivesuite-curator] Status update ONB

Dear all,

 *   We finished our second domain crawl and will now begin to crawl governmental and academic websites. We used NAS version 3.16.1 and will now change to 3.18.3.
 *   We will analyse the domain crawl and compare with the previous crawl.
 *   Resources for the webarchiving project have been reduced. For the duration of 1.5 years Andreas will commit 80% and Michaela 50% to webarchiving.
Best
Michaela

Mag. Michaela Mayr
Web@rchiv Österreich
Abteilung für Langzeitarchivierung
Österreichische Nationalbibliothek
Josefsplatz 1
1015 Wien
Tel:  (+43 1) 53 410-476
Fax: (+43 1) 53 410-610
FN221029v
FBG Handelsgericht Wien
michaela.mayr at onb.ac.at
http://www.onb.ac.at/about/webarchivierung.htm
