[Netarchivesuite-curator] Status update Netarchivet
Sabine Schostag
sas at statsbiblioteket.dk
Wed May 30 14:31:58 CEST 2012
Dear all,
Here is a short update from KB/SB (so, Michaela, you were faster than I was :) ):
· Step 1 of broad crawl number 2/2012 finished on 14 May; step 2 started after a delay of 8-9 days because we ran short of disk space. About 38,000 domains are blocking our harvesters because our aggressive harvesting causes annoyance or even breakdowns on their servers.
· Prior to the Heritrix 3 workshop organized by archive.org, the curators at KB/SB will prepare a survey of our current use of Heritrix 1 and of our wishes for future functionality.
· We are still focusing on ways of harvesting video and hope that we can learn from BNF’s experience with DailyMotion and re-use their script(s).
Best,
Sabine
SABINE SCHOSTAG
LIBRARIAN - WEB CURATOR
DIRECT 8946 2148
STATSBIBLIOTEKET
VICTOR ALBECKS VEJ 1
8000 AARHUS C
CVR/SE 1010 0682 – EAN 5798000791084
From: netarchivesuite-curator-bounces at ml.sbforge.org [mailto:netarchivesuite-curator-bounces at ml.sbforge.org] On Behalf Of Mayr Michaela
Sent: Wednesday, May 30, 2012 2:03 PM
To: netarchivesuite-curator at ml.sbforge.org
Subject: [Netarchivesuite-curator] Status update ONB
Dear all,
* We finished our second domain crawl and will now begin to crawl governmental and academic websites. We used NAS version 3.16.1 and will now change to 3.18.3.
* We will analyse the domain crawl and compare it with the previous crawl.
* Resources for the webarchiving project have been reduced. For a period of 1.5 years, Andreas will commit 80% and Michaela 50% of their time to webarchiving.
Best
Michaela
Mag. Michaela Mayr
Web@rchiv Österreich
Abteilung für Langzeitarchivierung
Österreichische Nationalbibliothek
Josefsplatz 1
1015 Wien
Tel: (+43 1) 53 410-476
Fax: (+43 1) 53 410-610
FN221029v
FBG Handelsgericht Wien
michaela.mayr at onb.ac.at
http://www.onb.ac.at/about/webarchivierung.htm