[Netarchivesuite-curator] Update from Netarchive

Sabine Schostag sas at statsbiblioteket.dk
Mon Jun 13 13:13:23 CEST 2016


Dear all.

Here is the monthly update from KB/SB:


Broad crawl
We started the second broad crawl of 2016, with a limit of 100 MB per crawled domain.

Event crawls
We stopped the refugee crisis crawl. We did a smaller event crawl for the “Eurovision Song Contest”, where we focused on the Danish participants’ presence on Twitter and on thematic news sections. We are preparing for a crawl of the Olympics in Rio.

Selective crawls
We started the implementation of our revised collection strategy. We have almost finished setting up the new selective crawls of national news sites.

One of the first social media platforms, arto.com, closed on 1 June. We had problems with our last complete crawl before the closure. Using a specially developed module in which the FetchDNS method is changed, we hope to be able to get all content directly from their server.

Potential collaboration project
The Parliamentary Library gives in-house access to historical (archived) versions of the political parties’ websites. They are not quite satisfied with their solution. Netarchive and the Parliamentary Library are looking at potential future cooperation on this subject.

Internal
Niels Bønding is now project lead for curation.


Best,
Sabine
