[Netarchivesuite-curator] February update from Netarchive

Sabine Schostag sas at statsbiblioteket.dk
Mon Feb 24 17:48:36 CET 2014

Dear all,

Here is a brief update on our activities at Netarchive:

We have almost finished our first broad crawl for 2014. We are busy solving problems that our Heritrix crawls cause for a few website owners.
Often the problem is that Heritrix aggressively invents URLs based on JavaScript on the page, which it does not understand.

We participated in the IIPC event harvest of the Winter Olympics in Sochi and, at the same time, ran our own event harvest of the Games.

We are also busy with an event harvest of the Eurovision Song Contest, which will take place in Denmark this year. For this event harvest we are assisted by two researchers who are conducting a research project on the contest. This is the first time researchers have been directly involved in an event harvest from the very beginning.


Sabine Schostag
Web curator
STATE AND UNIVERSITY LIBRARY
Victor Albecks Vej 1
VAT NO. 1010 0682
DIRECT +45 8946 2148
