[Netarchivesuite-users] more upload errors

Bjarne Andersen netarkivet at statsbiblioteket.dk
Wed Jul 30 08:58:25 CEST 2008


The problem was the viewerproxy application.

Each application running on the same server needs its own port number in the setting -Dsettings.common.http.port=
    - There are two reasons for this:
      a) The port number is part of the name of the application's JMS queue
      b) The port number is the port that the server applications (GUIApplication and ViewerProxyApplication) LISTEN on

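Your problem was that both your harvester and your viewerproxy had the same -Dsettings.common.http.port=8081, so both ended up as consumers on the same JMS queue (DEV_COMMON_THIS_HACO_127_0_1_1_8081 in your imqcmd listing).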
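
As a rough way to catch this kind of clash, the shell sketch below lists every value of -Dsettings.common.http.port= that occurs more than once in one or more startup scripts. The function name duplicate_http_ports is made up for illustration and is not part of NetarchiveSuite. A repeated value is only a hint: it matters when the applications sharing it run on the same server.

```shell
# Hypothetical helper (not part of NetarchiveSuite): print each value of
# -Dsettings.common.http.port= that appears more than once in the given files.
duplicate_http_ports() {
    # -o prints only the matching part, -h suppresses file names,
    # -e lets the pattern start with a dash.
    grep -oh -e '-Dsettings\.common\.http\.port=[0-9][0-9]*' "$@" \
        | cut -d= -f2 \
        | sort \
        | uniq -d
}

# Example call (path taken from the init script quoted in this thread):
#   duplicate_http_ports /home/user/workspace/netarchive/harvester.sh
```

An empty result means no port value is repeated in the files that were checked.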

A bug (in my opinion) is that 3 different applications use "COMMON_THIS_HACO_" in the name of their JMS queue:
  - Harvester (seems OK - HACO means HarvestController)
  - Viewerproxy (NOT OK - could be something like VIEW)
  - Indexserver (NOT OK - could be something like INDX)
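
To illustrate why the shared prefix makes a duplicate port harmful, here is a small shell sketch that rebuilds a queue name from an environment prefix, an IP address and the http port. The format is only inferred from names such as DEV_COMMON_THIS_HACO_127_0_1_1_8081 in the imqcmd listing quoted below; the function this_haco_queue is hypothetical and is not how NetarchiveSuite itself is implemented.

```shell
# Hypothetical reconstruction of the queue name (inferred from the imqcmd
# listing in this thread, not taken from the NetarchiveSuite source).
# $1 = environment prefix, $2 = IP address, $3 = http port
this_haco_queue() {
    printf '%s_COMMON_THIS_HACO_%s_%s\n' "$1" "$(printf '%s' "$2" | tr . _)" "$3"
}

# Harvester and viewerproxy on the same host with the same port:
this_haco_queue DEV 127.0.1.1 8081   # harvester
this_haco_queue DEV 127.0.1.1 8081   # viewerproxy, identical name
```

Both calls produce the same name, so under this assumption the two applications would listen on one queue; giving either of them a different port yields a different name.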

best
Bjarne Andersen

Martin Bella wrote:
> I thought it was not necessary to post what the harvester.sh script looks like. The
> first reason is that it was created by copy and paste and contains the same
> lines for starting the HarvestControllerApplication as the netarchive.sh
> script. Probably I could do it the same way as you. I will play with this
> later. The second reason is that when I make a change, I almost always start with a
> clean install. And it still fails.
> 
> If you are interested, I think in our organisation we could create a special
> limited account for you on one of our testing machines so that you could have a
> closer look at this problem.
> 
> Best,
> Martin Bella
> University Library in Bratislava
> 
> 
>>It might be a problem with the HarvestControllerApplication and the
>>SideKick.
>>
>>The HarvestControllerApplication is implemented in such a way that it
>>shuts itself down after it has finished a job. The SideKick then
>>restarts that specific harvester. So each HarvestControllerServer has
>>a corresponding SideKick.
>>
>>The SideKick must know of a script that restarts the harvester. You have to
>>make sure that the way you start the harvester in your init script
>>is the same way the SideKick starts it after each job, preferably by
>>calling the same script that starts the HarvestControllerServer.
>>
>>So you might want to put the startup of the HarvestControllerServer in
>>its own little script and call that from your init script (remember
>>your globally set variables).
>>
>>In our startup scripts (automatically generated with the DeployApplication),
>>one specific harvester has the following start script:
>>#!/bin/bash
>>export CLASSPATH=/home/prod/PROD/lib/dk.netarkivet.harvester.jar:/home/prod/PROD/lib/dk.netarkivet.archive.jar:/home/prod/PROD/lib/dk.netarkivet.viewerproxy.jar:/home/prod/PROD/lib/dk.netarkivet.monitor.jar:$CLASSPATH;
>>cd /home/prod/PROD
>>java -Xmx1536m -Dsettings.harvester.harvesting.heritrix.guiPort=8095 \
>>  -Dsettings.harvester.harvesting.heritrix.jmxPort=8195 \
>>  -Ddk.netarkivet.settings.file=/home/prod/PROD/conf/settings_harvester_8081.xml \
>>  -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger \
>>  -Djava.util.logging.config.file=/home/prod/PROD/conf/log_harvestcontrollerapplication.prop \
>>  -Dsettings.common.jmx.port=8110 -Dsettings.common.jmx.rmiPort=8210 \
>>  -Dsettings.common.jmx.passwordFile=/home/prod/PROD/conf/jmxremote.password \
>>  -Djava.security.manager \
>>  -Djava.security.policy=/home/prod/PROD/conf/security.policy \
>>  dk.netarkivet.harvester.harvesting.HarvestControllerApplication < /dev/null \
>>  > start_harvester_8081.sh.log 2>&1 &
>>
>>And the matching SideKick has the following startup script:
>>java -Xmx1536m -Ddk.netarkivet.settings.file=/home/prod/PROD/conf/settings_harvester_8081.xml \
>>  -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger \
>>  -Djava.util.logging.config.file=/home/prod/PROD/conf/log_sidekick.prop \
>>  -Dsettings.common.jmx.port=8111 -Dsettings.common.jmx.rmiPort=8211 \
>>  -Dsettings.common.jmx.passwordFile=/home/prod/PROD/conf/jmxremote.password \
>>  -Djava.security.manager \
>>  -Djava.security.policy=/home/prod/PROD/conf/security.policy \
>>  dk.netarkivet.harvester.sidekick.SideKick \
>>  dk.netarkivet.harvester.sidekick.HarvestControllerServerMonitorHook \
>>  ./conf/start_harvester_8081.sh < /dev/null > start_sidekick_8081.sh.log 2>&1 &
>>
>>You can see that the SideKick takes an argument which is exactly the
>>startup script of the HarvestControllerServer.
>>
>>I can see that your init script references /home/user/workspace/netarchive/harvester.sh - put the startup of the harvester in that file.
>>
>>Even with your current setup, I would expect that if things fail, they
>>would not fail until the second job is run, because of the
>>potentially failed restart of the HarvestControllerServer.
>>
>>So try a clean install with a clean JMS broker, and look at the imqcmd output
>>before running any jobs - there should be only one consumer on the
>>harvester queue.
>>
>>best
>>Bjarne Andersen
>>
>>Martin Bella wrote:
>>
>>>I do not think I have any java processes from old installations. At the time of
>>>uploading the ps command showed only one instance of JMS Broker, BitarchiveApplication, GUIApplication, ArcRepositoryApplication, BitarchiveMonitorApplication, HarvestControllerApplication, SideKick, IndexServerApplication and ViewerProxyApplication.
>>>
>>>Btw, here is what the init script on my testing machine looks like:
>>>
>>>#!/bin/bash
>>>
>>>export NetarchiveDir=/home/user/workspace/netarchive
>>>export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.harvester.jar
>>>export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.archive.jar
>>>export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.viewerproxy.jar
>>>export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.monitor.jar
>>>export JAVA_OPTS=-Xmx2048m
>>>export LOG_SETTINGS="-Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger -Djava.util.logging.config.file=$NetarchiveDir/conf/log.prop"
>>>
>>>cd $NetarchiveDir
>>>
>>># Bitarchive machines
>>>
>>>export JMX_SETTINGS="-Dsettings.common.jmx.port=8150 -Dsettings.common.jmx.rmiPort=8250"
>>>export APP_OPTIONS="-Dsettings.archive.bitarchive.thisLocation=sos -Dsettings.archive.bitarchive.thisCredentials=examplecredentials"
>>>export APP=dk.netarkivet.archive.bitarchive.BitarchiveApplication
>>>/opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>
>>># Admin machine
>>>
>>>export JMX_SETTINGS="-Dsettings.common.jmx.port=8100 -Dsettings.common.jmx.rmiPort=8200"
>>>export APP=dk.netarkivet.common.webinterface.GUIApplication
>>>export SETTING="-Dsettings.common.remoteFile.port=5440"
>>>/opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>
>>>export JMX_SETTINGS="-Dsettings.common.jmx.port=8120 -Dsettings.common.jmx.rmiPort=8220"
>>>export APP=dk.netarkivet.archive.arcrepository.ArcRepositoryApplication
>>>export SETTING="-Dsettings.common.remoteFile.port=5441"
>>>/opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>
>>>export JMX_SETTINGS="-Dsettings.common.jmx.port=8110 -Dsettings.common.jmx.rmiPort=8210"
>>>export APP_OPTIONS="-Dsettings.common.archive.bitarchive.thisLocation=sos -Dsettings.common.http.port=8081"
>>>export SETTING="-Dsettings.common.remoteFile.port=5443"
>>>export APP=dk.netarkivet.archive.bitarchive.BitarchiveMonitorApplication
>>>/opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>
>>># Harvester machines
>>>
>>>export JMX_SETTINGS="-Dsettings.common.jmx.port=8130 -Dsettings.common.jmx.rmiPort=8230"
>>>export APP_OPTIONS="-Dsettings.harvester.harvesting.queuePriority=HIGHPRIORITY -Dsettings.common.http.port=8081"
>>>export SETTING="-Dsettings.common.remoteFile.port=5444"
>>>export APP=dk.netarkivet.harvester.harvesting.HarvestControllerApplication
>>>/opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>
>>>export JMX_SETTINGS="-Dsettings.common.jmx.port=8140 -Dsettings.common.jmx.rmiPort=8240"
>>>export APP_OPTIONS="-Dsettings.common.http.port=8081"
>>>export APP=dk.netarkivet.harvester.sidekick.SideKick
>>>export APP_ARGS1=dk.netarkivet.harvester.sidekick.HarvestControllerServerMonitorHook
>>>export APP_ARGS2=/home/user/workspace/netarchive/harvester.sh
>>>export SETTING="-Dsettings.common.remoteFile.port=5445"
>>>/opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP $APP_ARGS1 $APP_ARGS2 &
>>>
>>># Access servers
>>>
>>>export JMX_SETTINGS="-Dsettings.common.jmx.port=8160 -Dsettings.common.jmx.rmiPort=8260"
>>>export APP=dk.netarkivet.archive.indexserver.IndexServerApplication
>>>export SETTING="-Dsettings.common.remoteFile.port=5446"
>>>/opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP &
>>>
>>>export JMX_SETTINGS="-Dsettings.common.jmx.port=8170 -Dsettings.common.jmx.rmiPort=8270"
>>>export APP_OPTIONS="-Dsettings.common.http.port=8081 -Dsettings.viewerproxy.baseDir=viewerproxy_8081 -Dsettings.archive.bitarchive.thisLocation=sos"
>>>export APP=dk.netarkivet.viewerproxy.ViewerProxyApplication
>>>export SETTING="-Dsettings.common.remoteFile.port=5447"
>>>/opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>
>>>Best,
>>>Martin Bella
>>>University Library in Bratislava
>>>
>>>
>>>
>>>>Thanks. This seems to show the problem:
>>>>
>>>>The JMS queue DEV_COMMON_THIS_HACO_127_0_1_1_8081 has 2 registered
>>>>consumers, so another harvester instance is also running and eating the
>>>>"Store OK" messages sent from your ARCRepository.
>>>>
>>>>You should check that you do not have running processes from old
>>>>installations - e.g. use "ps fax | grep java"
>>>>
>>>>The other errors look like some startup / shutdown problems with
>>>>NetarchiveSuite and the JMS broker. Make sure that the JMS broker is
>>>>running (and cleaned up) before starting any applications. And make sure
>>>>that the JMS broker is still running when you stop applications,
>>>>since they do a clean disconnect from the JMS broker.
>>>>
>>>>best
>>>>Bjarne Andersen
>>>>
>>>>
>>>>Martin Bella wrote:
>>>>
>>>>
>>>>>Hi Bjarne,
>>>>>
>>>>>here is the output of the "mq/bin/imqcmd list dst -u admin -passfile
>>>>>$PASSFILE" command:
>>>>>
>>>>>----------------------------------------------------------------------------------------------------------
>>>>>Name                                    Type   State    Producers  Consumers  Msgs
>>>>>                                                                              Total Count  UnAck  Avg Size
>>>>>----------------------------------------------------------------------------------------------------------
>>>>>DEV_COMMON_ANY_HIGHPRIORITY_HACO        Queue  RUNNING  1          0          0            0      0.0
>>>>>DEV_COMMON_INDEX_CLIENT_127_0_1_1_8081  Queue  RUNNING  1          1          0            0      0.0
>>>>>DEV_COMMON_INDEX_SERVER                 Queue  RUNNING  1          1          0            0      0.0
>>>>>DEV_COMMON_MONITOR                      Queue  RUNNING  8          1          0            0      0.0
>>>>>DEV_COMMON_THE_ARCREPOS                 Queue  RUNNING  3          1          0            0      0.0
>>>>>DEV_COMMON_THE_SCHED                    Queue  RUNNING  1          1          0            0      0.0
>>>>>DEV_COMMON_THIS_HACO_127_0_1_1_8076     Queue  RUNNING  0          1          0            0      0.0
>>>>>DEV_COMMON_THIS_HACO_127_0_1_1_8081     Queue  RUNNING  1          2          0            0      0.0
>>>>>DEV_sos_ALL_BA                          Topic  RUNNING  1          1          0            0      0.0
>>>>>DEV_sos_ANY_BA                          Queue  RUNNING  1          1          0            0      0.0
>>>>>DEV_sos_THE_BAMON                       Queue  RUNNING  2          1          0            0      0.0
>>>>>mq.sys.dmq                              Queue  RUNNING  0          0          0            0      0.0
>>>>>
>>>>>At present my installation of NetarchiveSuite uses only one harvester
>>>>>instance. It did the same thing again: it uploaded both the arc file and the
>>>>>metadata.arc file, but reported a failure while uploading the arc file.
>>>>>
>>>>>A completely fresh installation of NetarchiveSuite also produced another error
>>>>>message:
>>>>>
>>>>>Cleaned up dk.netarkivet.common.distribute.HTTPRemoteFileRegistry
>>>>>Cleaning up dk.netarkivet.common.distribute.monitorregistry.JMSMonitorRegistryClient
>>>>>Cleaned up dk.netarkivet.common.distribute.monitorregistry.JMSMonitorRegistryClient
>>>>>Cleaning up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>>>Error while cleaning up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>>>java.lang.NullPointerException
>>>>>	at dk.netarkivet.common.distribute.JMSConnection.removeListener(JMSConnection.java:640)
>>>>>	at dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient.close(JMSArcRepositoryClient.java:124)
>>>>>	at dk.netarkivet.harvester.harvesting.HarvestController.cleanup(HarvestController.java:114)
>>>>>	at dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer.cleanup(HarvestControllerServer.java:254)
>>>>>	at dk.netarkivet.common.utils.CleanupHook.run(CleanupHook.java:70)
>>>>>Cleaned up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>>>Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin restart
>>>>>WARNING: Failed to restart: java.io.IOException: Failed to get a RMI stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: ubuntu804desktop.localdomain; nested exception is: 
>>>>>	java.net.ConnectException: Connection refused]
>>>>>Jul 28, 2008 2:10:49 PM RMIConnector RMIClientCommunicatorAdmin-doStop
>>>>>WARNING: Failed to call the method close():java.rmi.ConnectException: Connection refused to host: 127.0.1.1; nested exception is: 
>>>>>	java.net.ConnectException: Connection refused
>>>>>Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin Checker-run
>>>>>WARNING: Failed to check connection: java.net.ConnectException: Connection refused
>>>>>Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin Checker-run
>>>>>WARNING: stopping
>>>>>
>>>>>Best,
>>>>>Martin Bella
>>>>>University Library in Bratislava
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>>To me it looks like the ARCRepository does the right thing - it uploads the
>>>>>>files AND sends "Store OK" back to the harvester. It seems that
>>>>>>the harvester does not get that "Store OK" JMS message - so it times out and
>>>>>>tries the same file again (3 times).
>>>>>>
>>>>>>While uploading, could you check how many applications are connected to the
>>>>>>JMS broker on each JMS queue? This is done with:
>>>>>>/opt/sun/mq/bin/imqcmd list dst -u admin -passfile $PASSFILE
>>>>>>
>>>>>>where the file pointed to by $PASSFILE should contain one line:
>>>>>>imq.imqcmd.password=WHAT_EVER_YOUR_PASSWORD_IS_SET_TO
>>>>>>
>>>>>>The default password should be "admin", meaning that your $PASSFILE will
>>>>>>have the following line:
>>>>>>imq.imqcmd.password=admin
>>>>>>
>>>>>>The output from the imqcmd command should show whether the harvester and the
>>>>>>ARCRepository are connected to the JMS broker in the right way.
>>>>>>
>>>>>>best
>>>>>>-- 
>>>>>>Bjarne Andersen
>>>>>>Daily Manager - netarchive.dk
>>>>>>
>>>>>>State & University Library
>>>>>>Universitetsparken
>>>>>>DK-8000 Aarhus C
>>>>>>T: +45 89462165 - C: +45 25662353
>>>>>>CVR/SE 10100682 - EAN 5798000791084
>>>>>>http://netarchive.dk
>>>>>>
>>>>>>Martin Bella wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>>Hi Eld,
>>>>>>>
>>>>>>>sorry for creating another thread. I did not get the answer by email, but
>>>>>>>I can see it in the "Netarchive-users archives".
>>>>>>>
>>>>>>>Concerning the wrong checksum, you were right; it works now. Thanks.
>>>>>>>Concerning the second problem, I always start JMS as described in the
>>>>>>>Installation manual and I use the latest version of JMS, but the problem
>>>>>>>persists. Is there anything else (logs, ...) I can send you to help solve
>>>>>>>this?
>>>>>>>
>>>>>>>Best,
>>>>>>>Martin Bella
>>>>>>>University Library in Bratislava
>>>>>>>_______________________________________________
>>>>>>>NetarchiveSuite-users mailing list
>>>>>>>NetarchiveSuite-users at lists.gforge.statsbiblioteket.dk
>>>>>>>https://lists.gforge.statsbiblioteket.dk/mailman/listinfo/netarchivesuite-users
>>>>
>>>>-- 
>>>>Bjarne Andersen
>>>>Driftsleder - netarkivet.dk
>>>>
>>>>Statsbiblioteket
>>>>Universitetsparken
>>>>8000 Århus C
>>>>Tlf. 89462165 - Mobil 25662353
>>>>CVR/SE 10100682 - EAN 5798000791084
>>>>http://netarkivet.dk

-- 
Bjarne Andersen
Driftsleder - netarkivet.dk

Statsbiblioteket
Universitetsparken
8000 Århus C
Tlf. 89462165 - Mobil 25662353
CVR/SE 10100682 - EAN 5798000791084
http://netarkivet.dk