[Netarchivesuite-users] Example configuration for a single-site setup

Søren Vejrup Carlsen svc at kb.dk
Wed Mar 4 14:19:05 CET 2009


Hi Nicolas.

Does your deployment environment consist entirely of Unix machines?

---------------------------------------------------------------------------
Søren Vejrup Carlsen, NetarchiveSuite developer

Department of Digital Preservation, Royal Library, Copenhagen, Denmark 
tel: (+45) 33 47 48 41
email: svc at kb.dk
----------------------------------------------------------------------------
Non omnia possumus omnes
--- Macrobius, Saturnalia, VI, 1, 35 -------

From: netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk [mailto:netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk] On behalf of nicolas.giraud at bnf.fr
Sent: 3 March 2009 17:36
To: netarchivesuite-users at lists.gforge.statsbiblioteket.dk
Subject: [Netarchivesuite-users] Example configuration for a single-site setup

 


Hi,

We are in the process of deploying NetarchiveSuite to drive our harvest definitions and crawls at the French National Library. So far I have set up a one-machine, sandbox-type installation of the suite, based on the simple_harvest environment. I have read the Installation Manual several times, but I find it contains too much theoretical information and too few examples, and I am unsure where to start. I would like to see some example configuration files for a single-site setup scenario; that would make things much clearer.

Let me quickly explain what we intend to do. Currently our ARC files are located on data nodes, Petaboxes that were delivered to us by the Internet Archive. We are now moving towards running our crawls autonomously. So basically our setup would have:

- multiple machines to store ARC files (without redundancy)
- multiple machines to host Heritrix crawlers and perform indexing
- one machine handling the definition of harvests

First I would like to set up a development environment with 3 machines (either 3 physical machines or using virtualization):

- one harvest definition machine
- one crawler machine
- one storage machine

I would use MySQL for the database.

Is it possible to have sample configuration files, or some kind of tutorial, for such a deployment environment? I think I can work something out starting from the simple harvest setup, but I'd appreciate some guidance to proceed faster.

Thanks in advance,

Nicolas Giraud


Consider the environment before printing this mail.


