[Netarchivesuite-users] Experiencing difficulties with QuickStart system

nicolas.giraud at bnf.fr nicolas.giraud at bnf.fr
Tue Sep 9 18:30:22 CEST 2008


Hi,

I am currently trying to deploy NetArchiveSuite 3.6.0 at the French 
National Library for evaluation purposes. I am toying with the quickstart 
system, but I have a couple of problems.

First, I can't get the crawl jobs to work because I could not get the 
proxy settings taken into account. My environment is installed on a Debian 
Etch system, in /netarchivesuite/nas-3.6.0. I used the "Edit Harvest 
Templates" UI to create a new template based on host_10levels_orderxml. 
The only things I changed are the following lines:

<string name="http-proxy-host"/> changed to
<string name="http-proxy-host">fw_in.bnf.fr</string>
<string name="http-proxy-port">8080</string> changed to
<string name="http-proxy-port">8080</string>

I then created a new template by uploading the modified file, and created 
a configuration using this template for the domains I wish to harvest. 
However, this does not seem to be taken into account by the Heritrix 
crawler, so the jobs terminate with the "Domain Completed" status.

Second issue: I can't get access to the Heritrix admin console. I have 
tried editing /netarchivesuite/nas-3.6.0/conf/settings.xml and changing 
settings.harvester.harvesting.heritrix.guiPort, but whatever value I 
tried, http://localhost:<guiPort> returned a 404 error. I have noticed, 
however, a Heritrix process on port 8092...
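
For reference, a rough sketch of where that setting would sit in 
settings.xml, assuming the dotted name 
settings.harvester.harvesting.heritrix.guiPort maps directly onto nested 
elements. The value 8090 is only an illustrative placeholder, not the 
value from my installation, and the surrounding elements are omitted:

```xml
<settings>
  <harvester>
    <harvesting>
      <heritrix>
        <!-- Port for the Heritrix admin console; 8090 is a placeholder -->
        <guiPort>8090</guiPort>
      </heritrix>
    </harvesting>
  </harvester>
</settings>
```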

Last issue: is there a way to shut down the system without losing all 
defined domains and harvests?

Best regards,

Nicolas Giraud