[Netarchivesuite-devel] troubleshooting deduplication
sara.aubry at bnf.fr
Mon Oct 3 09:21:29 CEST 2011
Hi all, and many thanks for your answers.
We have upgraded our IndexServer following your configurations and built a
virtual machine with:
- 4 Intel Xeon 2.4 GHz CPUs
- 32 GB RAM
- 3 Fiber Channel raw devices of 1 TB each, merged into a 3 TB ext3 partition
And we finally managed to build our index (over a 23 TB archive) in 4
days.
So that's better than 26 :-)
Another question: has job generation been modified in the 3.16 release?
We activated a snapshot harvest on Friday, but only 297 jobs were created,
all with the "New" status
(it stopped at the letter c). We found no errors in the
HarvestJobManagerApplication log.
Best,
Sara
Message from: <aponb at gmx.at>
28/09/2011 14:44
Sent by:
<netarchivesuite-devel-bounces at ml.sbforge.org>
Please reply to <netarchivesuite-devel at ml.sbforge.org>
To
<netarchivesuite-devel at ml.sbforge.org>
Cc
Subject
Re: [Netarchivesuite-devel] troubleshooting deduplication
> 1) Could you tell us what is the configuration of your index server
> (CPU, RAM, local disk space vs. nfs partition) and how long did your
> deduplication process last for how much data?
Our Index Server is running on a 2x Intel(R) Core(TM)2 Duo CPU, E4500
@ 2.20GHz, 4 GB RAM machine, 500GB disk space.
We just finished the 1st stage of our current domain crawl and we are
going to start the 2nd stage soon. I will look at the duration of
the deduplication process and let you know afterwards.
> 2) Is it possible (have you ever tested) to generate a deduplication
> index in a test environment and use it in your production environment?
No, we never tried that.
Regards
a.
_______________________________________________
Netarchivesuite-devel mailing list
Netarchivesuite-devel at ml.sbforge.org
http://ml.sbforge.org/mailman/listinfo/netarchivesuite-devel