[Netarchivesuite-curator] Update ONB

Mayr Michaela michaela.mayr at onb.ac.at
Wed Apr 25 14:36:22 CEST 2012

Hi all,

*	Our domain crawl is almost finished. Only a few rescheduled jobs still have to complete.
*	We are now running a small Hadoop cluster, located on our crawler machines. It has 8 worker nodes, which can use 8 TB of HDFS storage. We use Apache Pig (http://pig.apache.org) to sort our CDX index and to generate statistical reports.
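
To illustrate the kind of processing meant in the second point, here is a minimal single-machine sketch in Python. It is not our Pig script. It assumes the common space-separated CDX field order (urlkey, timestamp, original URL, MIME type, status code, ...), which is not stated above. The function names and the sample records are hypothetical and made up for illustration only.

```python
from collections import Counter

# Assumed CDX field positions (common "N b a m s ..." layout); adjust to the real header.
URLKEY, TIMESTAMP, ORIGINAL, MIMETYPE, STATUS = 0, 1, 2, 3, 4


def sort_cdx(lines):
    """Drop blank lines and the CDX header, then sort by (urlkey, timestamp)."""
    rows = [
        ln.rstrip("\n")
        for ln in lines
        if ln.strip() and not ln.lstrip().startswith("CDX ")
    ]
    return sorted(rows, key=lambda r: tuple(r.split()[:2]))


def count_field(rows, index):
    """Count how often each value occurs in the given field (a simple report)."""
    counts = Counter()
    for row in rows:
        fields = row.split()
        if len(fields) > index:  # skip truncated records
            counts[fields[index]] += 1
    return counts


# Made-up sample records, for illustration only.
sample = [
    " CDX N b a m s k r M S V g",
    "at,example)/ 20120401120000 http://example.at/ text/html 200 AAAA - - 100 0 a.arc.gz",
    "at,example)/logo.png 20120401120500 http://example.at/logo.png image/png 404 CCCC - - 50 300 a.arc.gz",
    "at,example)/ 20120301120000 http://example.at/ text/html 200 BBBB - - 100 200 a.arc.gz",
]

sorted_rows = sort_cdx(sample)
mime_report = count_field(sorted_rows, MIMETYPE)
status_report = count_field(sorted_rows, STATUS)
```

On a cluster the same two steps (order by key and timestamp, then group and count) would be expressed in Pig and run over files in HDFS; the sketch only shows the logic.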



Michaela Mayr
Web@rchive Austria
Department Digital Preservation 
Austrian National Library
Josefsplatz 1
A-1015 Vienna
fon: (+43 1) 53 410-476
fax: (+43 1) 53 410-610
michaela.mayr at onb.ac.at
http://www.onb.ac.at/ev/about/webarchive.htm
