[Netarchivesuite-users] more upload errors

Martin Bella 98989 at mail.muni.cz
Wed Jul 30 12:20:57 CEST 2008


Thank you for your help. Now it seems to work.

Best,
Martin Bella
University Library in Bratislava

2008/7/30 Bjarne Andersen <netarkivet at statsbiblioteket.dk>

> The problem was the viewerproxy application.
>
> Each application on one server needs a unique port number in the setting
> -Dsettings.common.http.port=
>   There are several reasons for this:
>     a) The port number is used in the naming of the JMS queue
>     b) The port number is the port on which the server applications
>        (GUIApplication and ViewerProxyApplication) LISTEN
>
> Your problem was that both your harvester and your viewerproxy had the same
> -Dsettings.common.http.port=8081.
>
> A bug (in my opinion) is that 3 different applications use
> "COMMON_THIS_HACO_" in the naming of their JMS queue:
>  - Harvester (seems OK - HACO means HarvestController)
>  - Viewerproxy (NOT OK - could be something like VIEW)
>  - Indexserver (NOT OK - could be something like INDX)
>
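The unique-port rule above can be checked mechanically before anything is started. This is a minimal sketch, assuming all applications are launched from one shell script and that each port is written literally on the command line, as in the scripts in this thread. The function name duplicate_http_ports is made up for illustration and is not part of NetarchiveSuite.

```shell
#!/bin/bash
# Hypothetical helper (not part of NetarchiveSuite): reads start-script text
# on stdin and prints every -Dsettings.common.http.port value that is used
# more than once.
duplicate_http_ports() {
    grep -o -e '-Dsettings\.common\.http\.port=[0-9][0-9]*' \
        | cut -d= -f2 \
        | sort \
        | uniq -d
}

# Example call (the path is the one mentioned in this thread):
#   duplicate_http_ports < /home/user/workspace/netarchive/harvester.sh
```

A port printed by this check is only a candidate for a conflict. In the deployment example quoted further down, the SideKick is started with the settings file of its harvester, so a value shared by that pair may be intended; any other duplicate is worth a closer look.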
>
> best
> Bjarne Andersen
>
> Martin Bella wrote:
>
>> I thought it was not necessary to post what the harvester.sh script looks
>> like. The first reason is that it was created by copy and paste and contains
>> the same lines for starting the HarvestControllerApplication as the
>> netarchive.sh script. I could probably do it the same way as you do. I will
>> play with this later. The second reason is that when I make a change, I
>> almost always start with a clean install. And it fails.
>>
>> If you are interested, I think our organisation could create a special
>> limited account for you on one of our testing machines so that you could
>> have a closer look at this problem.
>>
>> Best,
>> Martin Bella
>> University Library in Bratislava
>>
>>
>>> It might be a problem with the HarvestControllerApplication and the
>>> SideKick.
>>>
>>> The HarvestControllerApplication is implemented in such a way that it
>>> shuts itself down after it has finished a job. The SideKick then restarts
>>> that specific harvester. So each HarvestControllerServer has a
>>> corresponding SideKick.
>>>
>>> The SideKick must know of a script to restart the harvester. You have to
>>> make sure that the way you start the harvester in your init script is the
>>> same way the SideKick starts it after each job, preferably by calling the
>>> same script that starts the HarvestControllerServer.
>>>
>>> So you might want to put the startup of the HarvestControllerServer in
>>> its own little script and call that from your init script (remember your
>>> globally set variables).
>>>
>>> In our startup scripts (automatically generated with the DeployApplication)
>>> one specific harvester has the following start script:
>>> #!/bin/bash
>>> export CLASSPATH=/home/prod/PROD/lib/dk.netarkivet.harvester.jar:/home/prod/PROD/lib/dk.netarkivet.archive.jar:/home/prod/PROD/lib/dk.netarkivet.viewerproxy.jar:/home/prod/PROD/lib/dk.netarkivet.monitor.jar:$CLASSPATH;
>>> cd /home/prod/PROD
>>> java -Xmx1536m -Dsettings.harvester.harvesting.heritrix.guiPort=8095 \
>>>   -Dsettings.harvester.harvesting.heritrix.jmxPort=8195 \
>>>   -Ddk.netarkivet.settings.file=/home/prod/PROD/conf/settings_harvester_8081.xml \
>>>   -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger \
>>>   -Djava.util.logging.config.file=/home/prod/PROD/conf/log_harvestcontrollerapplication.prop \
>>>   -Dsettings.common.jmx.port=8110 -Dsettings.common.jmx.rmiPort=8210 \
>>>   -Dsettings.common.jmx.passwordFile=/home/prod/PROD/conf/jmxremote.password \
>>>   -Djava.security.manager \
>>>   -Djava.security.policy=/home/prod/PROD/conf/security.policy \
>>>   dk.netarkivet.harvester.harvesting.HarvestControllerApplication \
>>>   < /dev/null > start_harvester_8081.sh.log 2>&1 &
>>>
>>> And the matching SideKick has the following startup script:
>>> java -Xmx1536m \
>>>   -Ddk.netarkivet.settings.file=/home/prod/PROD/conf/settings_harvester_8081.xml \
>>>   -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger \
>>>   -Djava.util.logging.config.file=/home/prod/PROD/conf/log_sidekick.prop \
>>>   -Dsettings.common.jmx.port=8111 -Dsettings.common.jmx.rmiPort=8211 \
>>>   -Dsettings.common.jmx.passwordFile=/home/prod/PROD/conf/jmxremote.password \
>>>   -Djava.security.manager \
>>>   -Djava.security.policy=/home/prod/PROD/conf/security.policy \
>>>   dk.netarkivet.harvester.sidekick.SideKick \
>>>   dk.netarkivet.harvester.sidekick.HarvestControllerServerMonitorHook \
>>>   ./conf/start_harvester_8081.sh < /dev/null > start_sidekick_8081.sh.log 2>&1 &
>>>
>>> You can see that the SideKick takes an argument which is exactly the
>>> startup script of the HarvestControllerServer.
>>>
>>> I can see that your init script references
>>> /home/user/workspace/netarchive/harvester.sh - put the startup of the
>>> harvester in that file.
>>>
>>> Even given your current setup, I would expect that if things fail, they
>>> would not fail until the second job is run, because of the potentially
>>> failed restart of the HarvestControllerServer.
>>>
>>> So try with a clean install and a clean JMS broker, and look at the imqcmd
>>> output before running any jobs. There should be only one consumer on the
>>> harvester queue.
>>>
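The "only one consumer" check can be scripted. This is a rough sketch, assuming each destination in the imqcmd listing is on a single line and that the columns come in the order Name, Type, State, Producers, Consumers, as in the listing quoted later in this thread. Lines that were wrapped in transit would have to be joined first. The helper name consumers_for_queue is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: prints the consumer count of one destination.
#   $1    - destination name, e.g. DEV_COMMON_THIS_HACO_127_0_1_1_8081
#   stdin - output of "imqcmd list dst", one destination per line
consumers_for_queue() {
    # Field 1 is the destination name, field 5 the consumer count
    # (assumed column order, see the lead-in above).
    awk -v q="$1" '$1 == q { print $5 }'
}

# Example call, using the command given elsewhere in this thread:
#   /opt/sun/mq/bin/imqcmd list dst -u admin -passfile "$PASSFILE" \
#       | consumers_for_queue DEV_COMMON_THIS_HACO_127_0_1_1_8081
```

Any value other than 1 for the harvester queue before the first job is run suggests that a second process is listening on the same queue.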
>>> best
>>> Bjarne Andersen
>>>
>>> Martin Bella wrote:
>>>
>>>> I do not think I have any java processes from old installations. At the
>>>> time of uploading, the ps command showed only one instance each of the
>>>> JMS Broker, BitarchiveApplication, GUIApplication,
>>>> ArcRepositoryApplication, BitarchiveMonitorApplication,
>>>> HarvestControllerApplication, SideKick, IndexServerApplication and
>>>> ViewerProxyApplication.
>>>>
>>>> Btw. here is what the init script on my testing machine looks like:
>>>>
>>>> #!/bin/bash
>>>>
>>>> export NetarchiveDir=/home/user/workspace/netarchive
>>>> export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.harvester.jar
>>>> export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.archive.jar
>>>> export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.viewerproxy.jar
>>>> export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.monitor.jar
>>>> export JAVA_OPTS=-Xmx2048m
>>>> export LOG_SETTINGS="-Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger -Djava.util.logging.config.file=$NetarchiveDir/conf/log.prop"
>>>>
>>>> cd $NetarchiveDir
>>>>
>>>> # Bitarchive machines
>>>>
>>>> export JMX_SETTINGS="-Dsettings.common.jmx.port=8150 -Dsettings.common.jmx.rmiPort=8250"
>>>> export APP_OPTIONS="-Dsettings.archive.bitarchive.thisLocation=sos -Dsettings.archive.bitarchive.thisCredentials=examplecredentials"
>>>> export APP=dk.netarkivet.archive.bitarchive.BitarchiveApplication
>>>> /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>>
>>>> # Admin machine
>>>>
>>>> export JMX_SETTINGS="-Dsettings.common.jmx.port=8100 -Dsettings.common.jmx.rmiPort=8200"
>>>> export APP=dk.netarkivet.common.webinterface.GUIApplication
>>>> export SETTING="-Dsettings.common.remoteFile.port=5440"
>>>> /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>>
>>>> export JMX_SETTINGS="-Dsettings.common.jmx.port=8120 -Dsettings.common.jmx.rmiPort=8220"
>>>> export APP=dk.netarkivet.archive.arcrepository.ArcRepositoryApplication
>>>> export SETTING="-Dsettings.common.remoteFile.port=5441"
>>>> /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>>
>>>> export JMX_SETTINGS="-Dsettings.common.jmx.port=8110 -Dsettings.common.jmx.rmiPort=8210"
>>>> export APP_OPTIONS="-Dsettings.common.archive.bitarchive.thisLocation=sos -Dsettings.common.http.port=8081"
>>>> export SETTING="-Dsettings.common.remoteFile.port=5443"
>>>> export APP=dk.netarkivet.archive.bitarchive.BitarchiveMonitorApplication
>>>> /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>>
>>>> # Harvester machines
>>>>
>>>> export JMX_SETTINGS="-Dsettings.common.jmx.port=8130 -Dsettings.common.jmx.rmiPort=8230"
>>>> export APP_OPTIONS="-Dsettings.harvester.harvesting.queuePriority=HIGHPRIORITY -Dsettings.common.http.port=8081"
>>>> export SETTING="-Dsettings.common.remoteFile.port=5444"
>>>> export APP=dk.netarkivet.harvester.harvesting.HarvestControllerApplication
>>>> /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>>
>>>> export JMX_SETTINGS="-Dsettings.common.jmx.port=8140 -Dsettings.common.jmx.rmiPort=8240"
>>>> export APP_OPTIONS="-Dsettings.common.http.port=8081"
>>>> export APP=dk.netarkivet.harvester.sidekick.SideKick
>>>> export APP_ARGS1=dk.netarkivet.harvester.sidekick.HarvestControllerServerMonitorHook
>>>> export APP_ARGS2=/home/user/workspace/netarchive/harvester.sh
>>>> export SETTING="-Dsettings.common.remoteFile.port=5445"
>>>> /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP $APP_ARGS1 $APP_ARGS2 &
>>>>
>>>> # Access servers
>>>>
>>>> export JMX_SETTINGS="-Dsettings.common.jmx.port=8160 -Dsettings.common.jmx.rmiPort=8260"
>>>> export APP=dk.netarkivet.archive.indexserver.IndexServerApplication
>>>> export SETTING="-Dsettings.common.remoteFile.port=5446"
>>>> /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP &
>>>>
>>>> export JMX_SETTINGS="-Dsettings.common.jmx.port=8170 -Dsettings.common.jmx.rmiPort=8270"
>>>> export APP_OPTIONS="-Dsettings.common.http.port=8081 -Dsettings.viewerproxy.baseDir=viewerproxy_8081 -Dsettings.archive.bitarchive.thisLocation=sos"
>>>> export APP=dk.netarkivet.viewerproxy.ViewerProxyApplication
>>>> export SETTING="-Dsettings.common.remoteFile.port=5447"
>>>> /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
>>>>
>>>> Best,
>>>> Martin Bella
>>>> University Library in Bratislava
>>>>
>>>>
>>>>
>>>>> Thanks. This seems to show the problem:
>>>>>
>>>>> The JMS queue DEV_COMMON_THIS_HACO_127_0_1_1_8081 has 2 registered
>>>>> consumers, so another harvester instance is also running and eating the
>>>>> "Store OK" messages sent from your ARCRepository.
>>>>>
>>>>> You should check that you do not have running processes from old
>>>>> installations - e.g. use "ps fax | grep java"
>>>>>
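To make leftover processes easier to spot than with a plain grep, the listing can be reduced to one line per application class. This is a sketch only, assuming the main class appears in the process arguments as in the start scripts in this thread. The function name count_netarkivet_apps is invented for illustration.

```shell
#!/bin/bash
# Hypothetical helper: reads a process listing on stdin and prints
# "<count> <main class>" for every dk.netarkivet application class and
# for the SideKick, sorted by class name.
count_netarkivet_apps() {
    grep -oE 'dk\.netarkivet\.[A-Za-z.]*(Application|SideKick)' \
        | sort \
        | uniq -c \
        | awk '{ print $1, $2 }'
}

# Example call (ww asks ps not to truncate long command lines):
#   ps axww | count_netarkivet_apps
```

A count above 1 for HarvestControllerApplication on a machine that should run a single harvester would point to a process left over from an earlier start.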
>>>>> The other errors look like some startup / shutdown problems with
>>>>> NetarchiveSuite and the JMS broker. Make sure that the JMS broker is
>>>>> running (and cleaned up) before starting any applications. And make sure
>>>>> that the JMS broker is still running when you stop applications, since
>>>>> they will do a clean disconnect from the JMS broker.
>>>>>
>>>>> best
>>>>> Bjarne Andersen
>>>>>
>>>>>
>>>>> Martin Bella wrote:
>>>>>
>>>>>
>>>>>  Hi Bjarne,
>>>>>>
>>>>>> here is the output of the "mq/bin/imqcmd list dst -u admin -passfile
>>>>>> $PASSFILE" command:
>>>>>>
>>>>>>
>>>>>> ----------------------------------------------------------------------------------------------------------
>>>>>> Name                                    Type   State    Producers  Consumers  Msgs
>>>>>>                                                                               Total Count  UnAck  Avg Size
>>>>>> ----------------------------------------------------------------------------------------------------------
>>>>>> DEV_COMMON_ANY_HIGHPRIORITY_HACO        Queue  RUNNING  1          0          0            0      0.0
>>>>>> DEV_COMMON_INDEX_CLIENT_127_0_1_1_8081  Queue  RUNNING  1          1          0            0      0.0
>>>>>> DEV_COMMON_INDEX_SERVER                 Queue  RUNNING  1          1          0            0      0.0
>>>>>> DEV_COMMON_MONITOR                      Queue  RUNNING  8          1          0            0      0.0
>>>>>> DEV_COMMON_THE_ARCREPOS                 Queue  RUNNING  3          1          0            0      0.0
>>>>>> DEV_COMMON_THE_SCHED                    Queue  RUNNING  1          1          0            0      0.0
>>>>>> DEV_COMMON_THIS_HACO_127_0_1_1_8076     Queue  RUNNING  0          1          0            0      0.0
>>>>>> DEV_COMMON_THIS_HACO_127_0_1_1_8081     Queue  RUNNING  1          2          0            0      0.0
>>>>>> DEV_sos_ALL_BA                          Topic  RUNNING  1          1          0            0      0.0
>>>>>> DEV_sos_ANY_BA                          Queue  RUNNING  1          1          0            0      0.0
>>>>>> DEV_sos_THE_BAMON                       Queue  RUNNING  2          1          0            0      0.0
>>>>>> mq.sys.dmq                              Queue  RUNNING  0          0          0            0      0.0
>>>>>>
>>>>>> At present my installation of NetarchiveSuite uses only one harvester
>>>>>> instance. It did the same thing again: it uploaded both the arc and the
>>>>>> metadata.arc file, but reported a failure while uploading the arc file.
>>>>>>
>>>>>> A completely fresh installation of NetarchiveSuite also produced another
>>>>>> error message:
>>>>>>
>>>>>> Cleaned up dk.netarkivet.common.distribute.HTTPRemoteFileRegistry
>>>>>> Cleaning up dk.netarkivet.common.distribute.monitorregistry.JMSMonitorRegistryClient
>>>>>> Cleaned up dk.netarkivet.common.distribute.monitorregistry.JMSMonitorRegistryClient
>>>>>> Cleaning up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>>>> Error while cleaning up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>>>> java.lang.NullPointerException
>>>>>>        at dk.netarkivet.common.distribute.JMSConnection.removeListener(JMSConnection.java:640)
>>>>>>        at dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient.close(JMSArcRepositoryClient.java:124)
>>>>>>        at dk.netarkivet.harvester.harvesting.HarvestController.cleanup(HarvestController.java:114)
>>>>>>        at dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer.cleanup(HarvestControllerServer.java:254)
>>>>>>        at dk.netarkivet.common.utils.CleanupHook.run(CleanupHook.java:70)
>>>>>> Cleaned up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>>>> Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin restart
>>>>>> WARNING: Failed to restart: java.io.IOException: Failed to get a RMI stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: ubuntu804desktop.localdomain; nested exception is:
>>>>>>  java.net.ConnectException: Connection refused]
>>>>>> Jul 28, 2008 2:10:49 PM RMIConnector RMIClientCommunicatorAdmin-doStop
>>>>>> WARNING: Failed to call the method close():java.rmi.ConnectException: Connection refused to host: 127.0.1.1; nested exception is:
>>>>>>  java.net.ConnectException: Connection refused
>>>>>> Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin Checker-run
>>>>>> WARNING: Failed to check connection: java.net.ConnectException: Connection refused
>>>>>> Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin Checker-run
>>>>>> WARNING: stopping
>>>>>>
>>>>>> Best,
>>>>>> Martin Bella
>>>>>> University Library in Bratislava
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> To me it looks like the ARCRepository does the right thing: it uploads
>>>>>>> the files AND sends "Store OK" back to the harvester. It seems that the
>>>>>>> harvester does not get that "Store OK" JMS message, so it times out and
>>>>>>> tries the same file again (3 times).
>>>>>>>
>>>>>>> While uploading, could you check how many applications are connected to
>>>>>>> each JMS queue on the JMS broker? This is done with:
>>>>>>> /opt/sun/mq/bin/imqcmd list dst -u admin -passfile $PASSFILE
>>>>>>>
>>>>>>> where the file pointed at by $PASSFILE should contain one line:
>>>>>>> imq.imqcmd.password=WHAT_EVER_YOUR_PASSWORD_IS_SET_TO
>>>>>>>
>>>>>>> The default password should be admin, meaning that your $PASSFILE will
>>>>>>> have the following line:
>>>>>>> imq.imqcmd.password=admin
>>>>>>>
>>>>>>> The output from the imqcmd command should show whether the harvester and
>>>>>>> the ARCRepository are connected to the JMS broker in the right way.
>>>>>>>
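Creating the password file described above takes one line. This is a small sketch, assuming a bash shell. The function name write_passfile and the restrictive file permissions are additions for illustration and are not something the thread prescribes.

```shell
#!/bin/bash
# Hypothetical helper: writes a one-line imqcmd password file.
#   $1 - path of the file to create (the $PASSFILE of this thread)
#   $2 - broker admin password
write_passfile() {
    local passfile=$1 password=$2
    # The subshell keeps the umask change local; the file becomes
    # readable by its owner only (an extra precaution, see lead-in).
    ( umask 077; printf 'imq.imqcmd.password=%s\n' "$password" > "$passfile" )
}

# Example call with the default password mentioned in this thread:
#   PASSFILE=$HOME/imq.passfile
#   write_passfile "$PASSFILE" admin
#   /opt/sun/mq/bin/imqcmd list dst -u admin -passfile "$PASSFILE"
```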
>>>>>>> best
>>>>>>> --
>>>>>>> Bjarne Andersen
>>>>>>> Daily Manager - netarchive.dk
>>>>>>>
>>>>>>> State & University Library
>>>>>>> Universitetsparken
>>>>>>> DK-8000 Aarhus C
>>>>>>> T: +45 89462165 - C: +45 25662353
>>>>>>> CVR/SE 10100682 - EAN 5798000791084
>>>>>>> http://netarchive.dk
>>>>>>>
>>>>>>> Martin Bella wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Hi Eld,
>>>>>>>>
>>>>>>>> sorry for creating another thread. I did not get the answer by email,
>>>>>>>> but I can see it in the "Netarchive-users archives".
>>>>>>>>
>>>>>>>> Concerning the wrong checksum, you were right, now it works. Thanks.
>>>>>>>> Concerning the second problem, I always start JMS as described in the
>>>>>>>> Installation manual and I use the latest version of JMS, but the
>>>>>>>> problem persists. Is there anything else (logs, ...) I can send you to
>>>>>>>> help solve this?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Martin Bella
>>>>>>>> University Library in Bratislava
>>>>>>>> _______________________________________________
>>>>>>>> NetarchiveSuite-users mailing list
>>>>>>>> NetarchiveSuite-users at lists.gforge.statsbiblioteket.dk
>>>>>>>>
>>>>>>>> https://lists.gforge.statsbiblioteket.dk/mailman/listinfo/netarchivesuite-users
>>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> Bjarne Andersen
>>>>> Driftsleder - netarkivet.dk
>>>>>
>>>>> Statsbiblioteket
>>>>> Universitetsparken
>>>>> 8000 Århus C
>>>>> Tlf. 89462165 - Mobil 25662353
>>>>> CVR/SE 10100682 - EAN 5798000791084
>>>>> http://netarkivet.dk
>>>>>
>>>>
>>>
>>>
>>
>
>