[Netarchivesuite-devel] troubleshooting deduplication

sara.aubry at bnf.fr sara.aubry at bnf.fr
Mon Oct 3 09:21:29 CEST 2011


Hi all, and many thanks for your answers.
We have upgraded our IndexServer following your configurations and built a
virtual machine with:
- 4 Intel Xeon 2.4 GHz CPUs
- 32 GB RAM
- 3 Fibre Channel raw devices of 1 TB each, merged into a 3 TB ext3 partition
And we finally managed to build our index (over a 23 TB archive) in 4 days.
So that's better than 26 :-)

Another question: has job generation been modified in the 3.16 release?
We activated a snapshot harvest on Friday, but only 297 jobs were created,
all with the "New" status (it stopped at the letter c). We found no errors
in the HarvestJobManagerApplication log.

Best,

Sara

From: <aponb at gmx.at>
Date: 28/09/2011 14:44
Sent by: <netarchivesuite-devel-bounces at ml.sbforge.org>
Please reply to: <netarchivesuite-devel at ml.sbforge.org>
To: <netarchivesuite-devel at ml.sbforge.org>
Cc:
Subject: Re: [Netarchivesuite-devel] troubleshooting deduplication

> 1) Could you tell us what is the configuration of your index server
> (CPU, RAM, local disk space vs. nfs partition) and how long did your
> deduplication process last for how much data?
Our Index Server is running on a machine with 2x Intel(R) Core(TM)2 Duo CPU
E4500 @ 2.20GHz, 4 GB RAM, and 500 GB of disk space.
We just finished the 1st stage of our current domain crawl and we are
going to start the 2nd stage soon. I will look at the duration of the
deduplication process and let you know afterwards.
> 2) Is it possible (have you ever tested) to generate a deduplication
> index in a test environment and use it in your production environment?
No, we never tried that.

Regards
a.
_______________________________________________
Netarchivesuite-devel mailing list
Netarchivesuite-devel at ml.sbforge.org
http://ml.sbforge.org/mailman/listinfo/netarchivesuite-devel


