[Netarchivesuite-devel] troubleshooting deduplication
sara.aubry at bnf.fr
Fri Sep 16 14:02:17 CEST 2011
Hello everyone,
As I mentioned during our last teleconference, we are testing
NetarchiveSuite 3.16.1 and a new architecture to launch our annual broad
crawl.
We activated the harvest on August 23 (almost 3 weeks ago!) and the
deduplication index is still not ready!
1) Could you tell us the configuration of your index server (CPU,
RAM, local disk space vs. NFS partition), how long your deduplication
index generation took, and for how much data?
2) Is it possible (and have you ever tested this) to generate a deduplication
index in a test environment and then use it in your production environment?
We hope to be able to end our deduplication process and use the index...
3) When a job starts, how does the index server know that an index has
already been created?
Many thanks for your answers.
Sara