[Netarchivesuite-curator] Update ONB
michaela.mayr at onb.ac.at
Wed Apr 25 14:36:22 CEST 2012
* Our domain crawl is almost finished. Only a few rescheduled jobs still have to finish.
* We are now using a small Hadoop cluster, which runs on our crawler machines. It has 8 worker nodes, which can use 8 TB of HDFS storage. We use Apache Pig (http://pig.apache.org) to sort our CDX index and to generate statistical reports.
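
For readers unfamiliar with this kind of job, here is a minimal sketch of sorting a CDX index and deriving a simple count report. It is written in plain Python rather than Pig and is not the script we run on the cluster. The field layout (the common space-separated "N b a m s k r M S V g" CDX format) and the function names sort_cdx and count_by_field are assumptions made for illustration only.

```python
from collections import Counter

# Assumed field positions (not stated in the post): the common
# space-separated CDX layout "N b a m s k r M S V g".
URLKEY, TIMESTAMP, MIMETYPE, STATUS = 0, 1, 3, 4


def _records(lines):
    """Split CDX lines into fields, skipping blank lines and the ' CDX ...' header."""
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip() or line.startswith(" CDX"):
            continue
        yield line.split(" ")


def sort_cdx(lines):
    """Return the records sorted by URL key, then by capture timestamp."""
    records = sorted(_records(lines), key=lambda f: (f[URLKEY], f[TIMESTAMP]))
    return [" ".join(fields) for fields in records]


def count_by_field(lines, field):
    """Count records per value of one field, e.g. STATUS or MIMETYPE."""
    return Counter(fields[field] for fields in _records(lines))
```

On a Hadoop cluster the same two steps would roughly correspond to an ORDER BY and a GROUP BY with COUNT in Pig Latin; the in-memory version above only suits small index files.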
Web@rchive Austria
Department Digital Preservation
Austrian National Library
fon: (+43 1) 53 410-476
fax: (+43 1) 53 410-610
michaela.mayr at onb.ac.at