[Netarchivesuite-users] more upload errors
Martin Bella
98989 at mail.muni.cz
Mon Jul 28 22:38:40 CEST 2008
I did not think it was necessary to post what the harvester.sh script looks
like. The first reason is that it was created by copy and paste and contains the
same lines for starting the HarvestControllerApplication as the netarchive.sh
script. I could probably do it the same way as you; I will play with this
later. The second reason is that whenever I make a change, I almost always start
with a clean install. And it still fails.
If you are interested, I think in our organisation we could create a special
limited account for you on one of our testing machines so that you could have a
closer look at this problem.
Best,
Martin Bella
University Library in Bratislava
> It might be a problem with the HarvestControllerApplication and the
> SideKick.
>
> The HarvestControllerApplication is implemented in such a way that it
> terminates itself after it has finished a job. The SideKick then
> restarts that specific harvester. So each HarvestControllerServer has
> a corresponding SideKick.
>
> The SideKick must know of a script to restart the harvester. You have to
> make sure that the way you start the harvester in your init script
> is the same way the SideKick starts it after each job, preferably by
> calling the same script that starts the HarvestControllerServer.
>
> So you might want to put the startup of the HarvestControllerServer in
> its own little script and call that from your init script (remember
> your globally set variables).
>
> In our startup-scripts (automatically generated with the DeployApplication)
> one specific harvester has the following start-script:
> #!/bin/bash
> export CLASSPATH=/home/prod/PROD/lib/dk.netarkivet.harvester.jar:/home/prod/PROD/lib/dk.netarkivet.archive.jar:/home/prod/PROD/lib/dk.netarkivet.viewerproxy.jar:/home/prod/PROD/lib/dk.netarkivet.monitor.jar:$CLASSPATH;
> cd /home/prod/PROD
> java -Xmx1536m -Dsettings.harvester.harvesting.heritrix.guiPort=8095 \
>   -Dsettings.harvester.harvesting.heritrix.jmxPort=8195 \
>   -Ddk.netarkivet.settings.file=/home/prod/PROD/conf/settings_harvester_8081.xml \
>   -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger \
>   -Djava.util.logging.config.file=/home/prod/PROD/conf/log_harvestcontrollerapplication.prop \
>   -Dsettings.common.jmx.port=8110 -Dsettings.common.jmx.rmiPort=8210 \
>   -Dsettings.common.jmx.passwordFile=/home/prod/PROD/conf/jmxremote.password \
>   -Djava.security.manager \
>   -Djava.security.policy=/home/prod/PROD/conf/security.policy \
>   dk.netarkivet.harvester.harvesting.HarvestControllerApplication \
>   < /dev/null > start_harvester_8081.sh.log 2>&1 &
>
> And the matching SideKick has the following startup-script:
> java -Xmx1536m -Ddk.netarkivet.settings.file=/home/prod/PROD/conf/settings_harvester_8081.xml \
>   -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger \
>   -Djava.util.logging.config.file=/home/prod/PROD/conf/log_sidekick.prop \
>   -Dsettings.common.jmx.port=8111 -Dsettings.common.jmx.rmiPort=8211 \
>   -Dsettings.common.jmx.passwordFile=/home/prod/PROD/conf/jmxremote.password \
>   -Djava.security.manager \
>   -Djava.security.policy=/home/prod/PROD/conf/security.policy \
>   dk.netarkivet.harvester.sidekick.SideKick \
>   dk.netarkivet.harvester.sidekick.HarvestControllerServerMonitorHook \
>   ./conf/start_harvester_8081.sh < /dev/null > start_sidekick_8081.sh.log 2>&1 &
>
> You can see that the SideKick has an argument which is exactly the
> startup script of the HarvestControllerServer.
>
> I can see that your init-script references /home/user/workspace/netarchive/harvester.sh - put the startup of the Harvester in that file.
>
> Even with your current setup I would expect things to fail only when the
> second job is run, because of the potentially failed restart of the
> HarvestControllerServer.
>
> So try a clean install with a clean JMS broker and look at the imqcmd output
> before running any jobs. There should be only one consumer on the
> harvester queue.
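The "only one consumer" check can also be scripted rather than read off by eye. What follows is a minimal sketch, assuming that imqcmd prints each destination on a single line in the order name, type, state, producers, consumers (as in the listing quoted further down). The function name queues_with_many_consumers is made up for illustration and is not part of NetarchiveSuite or Message Queue.

```shell
#!/bin/bash
# Hypothetical helper: read "imqcmd list dst" output on stdin and print every
# queue that has more than one registered consumer.
# Assumed line layout: NAME Queue RUNNING <producers> <consumers> ...
queues_with_many_consumers() {
    awk '$2 == "Queue" && $5 ~ /^[0-9]+$/ && $5 > 1 { print $1 " consumers=" $5 }'
}

# Usage (PASSFILE as described elsewhere in this thread):
#   /opt/sun/mq/bin/imqcmd list dst -u admin -passfile "$PASSFILE" \
#       | queues_with_many_consumers
```

No output means that no queue has more than one consumer. Topics are skipped on purpose, since several subscribers on a topic are normal.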
>
> best
> Bjarne Andersen
>
> Martin Bella wrote:
> > I do not think I have any java processes from old installations. At the time of
> > uploading, the ps command showed only one instance each of JMS Broker, BitarchiveApplication, GUIApplication, ArcRepositoryApplication, BitarchiveMonitorApplication, HarvestControllerApplication, SideKick, IndexServerApplication and ViewerProxyApplication.
> >
> > By the way, here is what the init script on my testing machine looks like:
> >
> > #!/bin/bash
> >
> > export NetarchiveDir=/home/user/workspace/netarchive
> > export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.harvester.jar
> > export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.archive.jar
> > export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.viewerproxy.jar
> > export CLASSPATH=$CLASSPATH:$NetarchiveDir/lib/dk.netarkivet.monitor.jar
> > export JAVA_OPTS=-Xmx2048m
> > export LOG_SETTINGS="-Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger -Djava.util.logging.config.file=$NetarchiveDir/conf/log.prop"
> >
> > cd $NetarchiveDir
> >
> > # Bitarchive machines
> >
> > export JMX_SETTINGS="-Dsettings.common.jmx.port=8150 -Dsettings.common.jmx.rmiPort=8250"
> > export APP_OPTIONS="-Dsettings.archive.bitarchive.thisLocation=sos -Dsettings.archive.bitarchive.thisCredentials=examplecredentials"
> > export APP=dk.netarkivet.archive.bitarchive.BitarchiveApplication
> > /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
> >
> > # Admin machine
> >
> > export JMX_SETTINGS="-Dsettings.common.jmx.port=8100 -Dsettings.common.jmx.rmiPort=8200"
> > export APP=dk.netarkivet.common.webinterface.GUIApplication
> > export SETTING="-Dsettings.common.remoteFile.port=5440"
> > /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
> >
> > export JMX_SETTINGS="-Dsettings.common.jmx.port=8120 -Dsettings.common.jmx.rmiPort=8220"
> > export APP=dk.netarkivet.archive.arcrepository.ArcRepositoryApplication
> > export SETTING="-Dsettings.common.remoteFile.port=5441"
> > /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
> >
> > export JMX_SETTINGS="-Dsettings.common.jmx.port=8110 -Dsettings.common.jmx.rmiPort=8210"
> > export APP_OPTIONS="-Dsettings.common.archive.bitarchive.thisLocation=sos -Dsettings.common.http.port=8081"
> > export SETTING="-Dsettings.common.remoteFile.port=5443"
> > export APP=dk.netarkivet.archive.bitarchive.BitarchiveMonitorApplication
> > /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
> >
> > # Harvester machines
> >
> > export JMX_SETTINGS="-Dsettings.common.jmx.port=8130 -Dsettings.common.jmx.rmiPort=8230"
> > export APP_OPTIONS="-Dsettings.harvester.harvesting.queuePriority=HIGHPRIORITY -Dsettings.common.http.port=8081"
> > export SETTING="-Dsettings.common.remoteFile.port=5444"
> > export APP=dk.netarkivet.harvester.harvesting.HarvestControllerApplication
> > /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
> >
> > export JMX_SETTINGS="-Dsettings.common.jmx.port=8140 -Dsettings.common.jmx.rmiPort=8240"
> > export APP_OPTIONS="-Dsettings.common.http.port=8081"
> > export APP=dk.netarkivet.harvester.sidekick.SideKick
> > export APP_ARGS1=dk.netarkivet.harvester.sidekick.HarvestControllerServerMonitorHook
> > export APP_ARGS2=/home/user/workspace/netarchive/harvester.sh
> > export SETTING="-Dsettings.common.remoteFile.port=5445"
> > /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP $APP_ARGS1 $APP_ARGS2 &
> >
> > # Access servers
> >
> > export JMX_SETTINGS="-Dsettings.common.jmx.port=8160 -Dsettings.common.jmx.rmiPort=8260"
> > export APP=dk.netarkivet.archive.indexserver.IndexServerApplication
> > export SETTING="-Dsettings.common.remoteFile.port=5446"
> > /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP &
> >
> > export JMX_SETTINGS="-Dsettings.common.jmx.port=8170 -Dsettings.common.jmx.rmiPort=8270"
> > export APP_OPTIONS="-Dsettings.common.http.port=8081 -Dsettings.viewerproxy.baseDir=viewerproxy_8081 -Dsettings.archive.bitarchive.thisLocation=sos"
> > export APP=dk.netarkivet.viewerproxy.ViewerProxyApplication
> > export SETTING="-Dsettings.common.remoteFile.port=5447"
> > /opt/jdk1.5.0_16/bin/java $JAVA_OPTS $SETTING $LOG_SETTINGS $JMX_SETTINGS $APP_OPTIONS $APP &
> >
> > Best,
> > Martin Bella
> > University Library in Bratislava
> >
> >
> >>Thanks. This seems to show the problem:
> >>
> >>The JMS queue DEV_COMMON_THIS_HACO_127_0_1_1_8081 has 2 registered
> >>consumers, so another harvester instance is also running and eating the
> >>"Store OK" messages sent from your ARCRepository.
> >>
> >>You should check that you do not have running processes from old
> >>installations, e.g. with "ps fax | grep java".
> >>
> >>The other errors look like some startup/shutdown problems with
> >>NetarchiveSuite and the JMS broker. Make sure that the JMS broker is
> >>running (and cleaned up) before starting any applications. And make sure
> >>that the JMS broker is still running when you stop the applications,
> >>since they will do a clean disconnect from the JMS broker.
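The search for leftover processes can be narrowed to one application class instead of scanning the whole ps listing. What follows is a minimal sketch; the function name count_app_instances is hypothetical. The ps snapshot is captured first, so that the helper's own grep cannot appear in it.

```shell
#!/bin/bash
# Hypothetical helper: count the lines of a process listing (read from stdin)
# that mention the given application class name.
count_app_instances() {
    # grep -c prints the number of matching lines; -F matches the name literally.
    # "|| true" keeps a count of 0 from being reported as a failure.
    grep -cF -- "$1" || true
}

# Usage: a result above 1 suggests a process left over from an old installation.
#   snapshot=$(ps -eo args)
#   printf '%s\n' "$snapshot" | count_app_instances HarvestControllerApplication
```

The SideKick command line mentions HarvestControllerServerMonitorHook but not HarvestControllerApplication, so it is not counted as a harvester.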
> >>
> >>best
> >>Bjarne Andersen
> >>
> >>
> >>Martin Bella wrote:
> >>
> >>>Hi Bjarne,
> >>>
> >>>here is the output of the "mq/bin/imqcmd list dst -u admin -passfile
> >>>$PASSFILE" command:
> >>>
> >>>----------------------------------------------------------------------------------------------------
> >>>Name                                    Type   State    Producers  Consumers  Msgs
> >>>                                                                              Total Count  UnAck  Avg Size
> >>>----------------------------------------------------------------------------------------------------
> >>>DEV_COMMON_ANY_HIGHPRIORITY_HACO        Queue  RUNNING  1          0          0            0      0.0
> >>>DEV_COMMON_INDEX_CLIENT_127_0_1_1_8081  Queue  RUNNING  1          1          0            0      0.0
> >>>DEV_COMMON_INDEX_SERVER                 Queue  RUNNING  1          1          0            0      0.0
> >>>DEV_COMMON_MONITOR                      Queue  RUNNING  8          1          0            0      0.0
> >>>DEV_COMMON_THE_ARCREPOS                 Queue  RUNNING  3          1          0            0      0.0
> >>>DEV_COMMON_THE_SCHED                    Queue  RUNNING  1          1          0            0      0.0
> >>>DEV_COMMON_THIS_HACO_127_0_1_1_8076     Queue  RUNNING  0          1          0            0      0.0
> >>>DEV_COMMON_THIS_HACO_127_0_1_1_8081     Queue  RUNNING  1          2          0            0      0.0
> >>>DEV_sos_ALL_BA                          Topic  RUNNING  1          1          0            0      0.0
> >>>DEV_sos_ANY_BA                          Queue  RUNNING  1          1          0            0      0.0
> >>>DEV_sos_THE_BAMON                       Queue  RUNNING  2          1          0            0      0.0
> >>>mq.sys.dmq                              Queue  RUNNING  0          0          0            0      0.0
> >>>
> >>>At present my installation of NetarchiveSuite uses only one harvester
> >>>instance. It did the same thing again: it uploaded both the arc and the
> >>>metadata.arc file, but reported a failure while uploading the arc file.
> >>>
> >>>A completely fresh installation of NetarchiveSuite also produced another
> >>>error message:
> >>>
> >>>Cleaned up dk.netarkivet.common.distribute.HTTPRemoteFileRegistry
> >>>Cleaning up dk.netarkivet.common.distribute.monitorregistry.JMSMonitorRegistryClient
> >>>Cleaned up dk.netarkivet.common.distribute.monitorregistry.JMSMonitorRegistryClient
> >>>Cleaning up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
> >>>Error while cleaning up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
> >>>java.lang.NullPointerException
> >>> at dk.netarkivet.common.distribute.JMSConnection.removeListener(JMSConnection.java:640)
> >>> at dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient.close(JMSArcRepositoryClient.java:124)
> >>> at dk.netarkivet.harvester.harvesting.HarvestController.cleanup(HarvestController.java:114)
> >>> at dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer.cleanup(HarvestControllerServer.java:254)
> >>> at dk.netarkivet.common.utils.CleanupHook.run(CleanupHook.java:70)
> >>>Cleaned up dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
> >>>Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin restart
> >>>WARNING: Failed to restart: java.io.IOException: Failed to get a RMI stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: ubuntu804desktop.localdomain; nested exception is:
> >>> java.net.ConnectException: Connection refused]
> >>>Jul 28, 2008 2:10:49 PM RMIConnector RMIClientCommunicatorAdmin-doStop
> >>>WARNING: Failed to call the method close():java.rmi.ConnectException: Connection refused to host: 127.0.1.1; nested exception is:
> >>> java.net.ConnectException: Connection refused
> >>>Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin Checker-run
> >>>WARNING: Failed to check connection: java.net.ConnectException: Connection refused
> >>>Jul 28, 2008 2:10:49 PM ClientCommunicatorAdmin Checker-run
> >>>WARNING: stopping
> >>>
> >>>Best,
> >>>Martin Bella
> >>>University Library in Bratislava
> >>>
> >>>
> >>>
> >>>>To me it looks like the ARCRepository does the right thing: it uploads the
> >>>>files AND sends "Store OK" back to the harvester. It seems that
> >>>>the harvester does not get that "Store OK" JMS message, so it times out and
> >>>>tries the same file again (3 times).
> >>>>
> >>>>While uploading, could you check how many applications are connected to the
> >>>>JMS broker on each JMS queue? This is done with:
> >>>>/opt/sun/mq/bin/imqcmd list dst -u admin -passfile $PASSFILE
> >>>>
> >>>>where the file pointed at by $PASSFILE should contain one line:
> >>>>imq.imqcmd.password=WHAT_EVER_YOUR_PASSWORD_IS_SET_TO
> >>>>
> >>>>The default password should be admin, meaning that your $PASSFILE will
> >>>>contain the following line:
> >>>>imq.imqcmd.password=admin
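Creating that file can be sketched as below. The function name write_imq_passfile and the example path in the usage comment are assumptions for illustration only. The one-line file format and the imqcmd invocation are the ones given in this message.

```shell
#!/bin/bash
# Hypothetical helper: write an imqcmd password file containing the single
# line described above, then restrict it to the owning user.
write_imq_passfile() {
    local file="$1" password="$2"
    printf 'imq.imqcmd.password=%s\n' "$password" > "$file" && chmod 600 "$file"
}

# Usage (the path is an assumption; any location readable by the user works):
#   PASSFILE="$HOME/imq.passfile"
#   write_imq_passfile "$PASSFILE" admin
#   /opt/sun/mq/bin/imqcmd list dst -u admin -passfile "$PASSFILE"
```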
> >>>>
> >>>>The output of the imqcmd command should show whether the harvester and the
> >>>>ARCRepository are connected to the JMS broker in the right way.
> >>>>
> >>>>best
> >>>>--
> >>>>Bjarne Andersen
> >>>>Daily Manager - netarchive.dk
> >>>>
> >>>>State & University Library
> >>>>Universitetsparken
> >>>>DK-8000 Aarhus C
> >>>>T: +45 89462165 - C: +45 25662353
> >>>>CVR/SE 10100682 - EAN 5798000791084
> >>>>http://netarchive.dk
> >>>>
> >>>>Martin Bella wrote:
> >>>>
> >>>>
> >>>>>Hi Eld,
> >>>>>
> >>>>>sorry for creating another thread. I did not get the answer by email, but
> >>>>>I can see it in the "Netarchive-users archives".
> >>>>>
> >>>>>Concerning the wrong checksum, you were right; now it works. Thanks.
> >>>>>Concerning the second problem, I always start JMS as described in the
> >>>>>Installation manual and I use the latest version of JMS, but the problem
> >>>>>persists. Is there anything else (logs, ...) I can send you to help solve
> >>>>>this?
> >>>>>
> >>>>>Best,
> >>>>>Martin Bella
> >>>>>University Library in Bratislava
> >>>>>_______________________________________________
> >>>>>NetarchiveSuite-users mailing list
> >>>>>NetarchiveSuite-users at lists.gforge.statsbiblioteket.dk
> >>>>>https://lists.gforge.statsbiblioteket.dk/mailman/listinfo/netarchivesuite-users
> >>>>
> >>
> >>--
> >>Bjarne Andersen
> >>Daily Manager - netarkivet.dk
> >>
> >>Statsbiblioteket
> >>Universitetsparken
> >>8000 Århus C
> >>Tel. 89462165 - Mobile 25662353
> >>CVR/SE 10100682 - EAN 5798000791084
> >>http://netarkivet.dk
> >
>