[Netarchivesuite-users] Problems With the HarvestController With 4.4.1

Mikis Seth Sørensen mss at statsbiblioteket.dk
Wed Dec 3 11:07:01 CET 2014


Hi Charles

This looks like a known issue with inconsistent harvest channel
configurations. The harvest channels are configured in two places:
* On the harvest server, through the harvest channel GUI as described here:
https://sbforge.org/display/NASDOC44/Harvest+Channels
* In the configuration for the individual harvest controllers in the deployed
settings-XXX.xml, see
https://sbforge.org/display/NASDOC44/The+Deploy+Configuration+File#TheDeployConfigurationFile-Howtoaddaharvestermoreonthesamemachineandsetalltoselectiveharvesting(FOCUSEDchannel)

The harvest controller's channel setting is used during the controller's
startup to register the controller with a specific harvester channel pool
defined on the server. If the server has no pool defined with a name
matching the one sent in the registration message from the
HarvestController, the registration fails with an error message like the
one you see.
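
As a rough illustration of the check described above, here is a minimal
sketch in Java. It is not the NetarchiveSuite implementation: the class
name, the method name and the hardcoded channel set are hypothetical
stand-ins for the channels defined on the server.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class ChannelRegistrationSketch {
    // Hypothetical stand-in for the channel names defined on the harvest server.
    static final Set<String> SERVER_CHANNELS =
            new HashSet<>(Arrays.asList("SNAPSHOT", "FOCUSED"));

    // A registration is accepted only if the requested name matches a defined channel.
    static boolean isValidChannel(String requested) {
        return SERVER_CHANNELS.contains(requested);
    }

    public static void main(String[] args) {
        String[] requestedByControllers = {"HIGHPRIORITY", "LOWPRIORITY", "FOCUSED"};
        for (String channel : requestedByControllers) {
            String outcome = isValidChannel(channel)
                    ? "registered"
                    : "invalid, controller will stop";
            System.out.println(channel + " -> " + outcome);
        }
    }
}
```

With the names from your logs, only a controller asking for FOCUSED would
be accepted in this sketch; the two legacy names are rejected.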

This problem is typically seen as part of the switch from the obsolete
hardcoded harvester 'priorities', 'HIGH_PRIORITY' and 'LOW_PRIORITY', to the
dynamically named harvest channels 'SNAPSHOT' and 'FOCUSED'. This is exactly
what you see in the logs: the harvest server doesn't recognize
LOWPRIORITY/HIGHPRIORITY and sends a response stating that the channel is
invalid. Meanwhile, the HarvestJobManagerApplication on the server reports
that no controllers are registering on the 'FOCUSED' channel.

To fix the problem, I would suggest updating the settings for the
HarvestController so that LOWPRIORITY/HIGHPRIORITY is changed to
SNAPSHOT/FOCUSED. You can check the actual server naming of the
channels through the https://sbforge.org/display/NASDOC44/Harvest+Channels
GUI.
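
The renaming can be sketched as a small lookup in Java. This is only an
illustration: the class and method names are hypothetical, and the pairing
(LOWPRIORITY to SNAPSHOT, HIGHPRIORITY to FOCUSED) assumes the order in
which the names are listed above, so please verify it against the channels
actually defined in the GUI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LegacyChannelNames {
    // Assumed pairing of legacy priority names to harvest channel names.
    static final Map<String, String> LEGACY_TO_CHANNEL = new LinkedHashMap<>();
    static {
        LEGACY_TO_CHANNEL.put("LOWPRIORITY", "SNAPSHOT");
        LEGACY_TO_CHANNEL.put("HIGHPRIORITY", "FOCUSED");
    }

    // Returns the suggested channel name, or the input unchanged if it is not a legacy name.
    static String suggestChannel(String configured) {
        return LEGACY_TO_CHANNEL.getOrDefault(configured, configured);
    }

    public static void main(String[] args) {
        for (String configured : new String[] {"HIGHPRIORITY", "LOWPRIORITY", "FOCUSED"}) {
            System.out.println(configured + " -> " + suggestChannel(configured));
        }
    }
}
```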

Hope this helps
Mikis   

On 12/2/14, 4:03 PM, "Søren Vejrup Carlsen" <svc at kb.dk> wrote:

>Hi Charles.
>No, it is not a known problem as far as I know. By "currently" I
>meant that your mail has prompted us to look into the problem.
>
>Thanks for the loglines.
>I have now looked into the Java code behind this:
> dk.netarkivet.harvester.scheduler.HarvesterStatusReceiver visit
>INFO: Sent a message to notify that harvest channel 'HIGHPRIORITY' is
>invalid.
>
>The channel is only invalid if the requested channel is not in the
>harvestchannel table.
>Any channels used by the harvesters must be added to the harvestchannel
>table.
>
>
>Best Regards 
>Søren
>________________________________________
>From: NetarchiveSuite-users [netarchivesuite-users-bounces at ml.sbforge.org]
>on behalf of Charles Tassell [ctassell at gmail.com]
>Sent: 2 December 2014 15:41
>To: netarchivesuite-users at ml.sbforge.org
>Subject: Re: [Netarchivesuite-users] Problems With the HarvestController
>With 4.4.1
>
>Hi Søren,
>
>   Thanks, so this is a known problem?  Would downgrading to 4.4.0 help
>or would I have to go back to 4.3 or earlier?
>
>   Here is my HarvestJobManagerApplication0.log.0 file after a restart
>if that helps.  It says that the HIGH/LOWPRIORITY and FOCUSED channels
>have not been created, so yeah it seems to be an issue with talking to
>OpenMQ...  I checked the OpenMQ logs and there are no real errors...
>The NAS components connect as the guest user fine.
>
>netarchive at webarchive:~/ROBLIB/conf/log$ cat
>HarvestJobManagerApplication0.log.0
>2-Dec-2014 11:15:03 AM dk.netarkivet.common.utils.Settings getAll
>FINE: Searching for a setting for key:
>settings.common.replicas.replica.replicaId
>2-Dec-2014 11:15:03 AM dk.netarkivet.common.utils.Settings getAll
>FINE: Value found in loaded data: A
>2-Dec-2014 11:15:02 AM
>dk.netarkivet.common.distribute.JMSConnectionSunMQ <init>
>INFO: Creating instance of
>dk.netarkivet.common.distribute.JMSConnectionSunMQ
>2-Dec-2014 11:15:08 AM
>dk.netarkivet.common.distribute.JMSConnectionSunMQ getConnectionFactory
>INFO: Establishing SunMQ JMS Connection to 'localhost:7676'
>2-Dec-2014 11:15:12 AM
>dk.netarkivet.harvester.datamodel.HarvestDBConnection initDataSource
>INFO: Connection pool initialized with the following values:
>- minPoolSize=5
>- maxPoolSize=10
>- acquireIncrement=5
>- maxStatements=0
>- maxStatementsPerConnection=0
>- idleConnTestPeriod=0
>- idleConnTestQuery='null'
>- idleConnTestOnCheckin=false
>2-Dec-2014 11:15:14 AM dk.netarkivet.harvester.scheduler.JobDispatcher
><init>
>INFO: Creating JobDispatcher
>2-Dec-2014 11:15:14 AM dk.netarkivet.common.utils.ApplicationUtils
>logAndPrint
>INFO: Starting dk.netarkivet.harvester.scheduler.HarvestJobManager
>Version: 4.4.1 status RELEASE
>2-Dec-2014 11:15:14 AM dk.netarkivet.common.utils.ApplicationUtils
>startApp
>INFO: Using settings files
>'/opt/netarchive/ROBLIB/conf/settings_HarvestJobManagerApplication.xml'
>2-Dec-2014 11:15:14 AM
>dk.netarkivet.monitor.distribute.JMSMonitorRegistryClient register
>INFO: Registering this client for monitoring every 1 minutes, using
>hostname 'webarchive.upei.ca' and JMX/RMI ports 8118/8218
>2-Dec-2014 11:15:14 AM
>dk.netarkivet.common.management.MBeanConnectorCreator exposeJMXMBeanServer
>INFO: Registered mbean server in registry on port 8118 communicating on
>port 8218 using password file 'conf/jmxremote.password'.
>Service URL is
>service:jmx:rmi://webarchive.upei.ca:8218/jndi/rmi://webarchive.upei.ca:8118/jmxrmi
>2-Dec-2014 11:15:14 AM dk.netarkivet.common.lifecycle.LifeCycleComponent
>start
>FINE: Starting 
>dk.netarkivet.harvester.scheduler.HarvestJobManager at 25dd9891
>2-Dec-2014 11:15:14 AM dk.netarkivet.common.utils.ApplicationUtils
>logAndPrint
>INFO: dk.netarkivet.harvester.scheduler.HarvestJobManager Running
>2-Dec-2014 11:15:14 AM dk.netarkivet.harvester.scheduler.JobSupervisor
>rescheduleLeftOverJobs
>INFO: 0 jobs has been resubmitted.
>2-Dec-2014 11:15:15 AM
>dk.netarkivet.harvester.scheduler.HarvesterStatusReceiver visit
>INFO: Sent a message to notify that harvest channel 'LOWPRIORITY' is
>invalid.
>2-Dec-2014 11:15:15 AM
>dk.netarkivet.harvester.scheduler.HarvesterStatusReceiver visit
>INFO: Sent a message to notify that harvest channel 'HIGHPRIORITY' is
>invalid.
>2-Dec-2014 11:15:15 AM
>dk.netarkivet.harvester.datamodel.HarvestDefinitionDBDAO read
>FINE: Reading harvestdefinition w/ id 1
>2-Dec-2014 11:15:15 AM
>dk.netarkivet.harvester.datamodel.HarvestDefinitionDBDAO read
>FINE: Partialharvest found w/ id 1
>2-Dec-2014 11:15:16 AM dk.netarkivet.harvester.datamodel.ScheduleDBDAO
>read
>FINE: Creating frequency for (timeunit,anytime,numtimeunits,hour,
>minute, dayofweek,dayofmonth) = (2, true,1,null,null,null,null,)
>2-Dec-2014 11:15:16 AM dk.netarkivet.harvester.datamodel.Frequency
>getNewInstance
>FINE: Creating a DAILY frequency.
>2-Dec-2014 11:15:16 AM dk.netarkivet.harvester.datamodel.ScheduleDBDAO
>read
>FINE: Creating frequency for (timeunit,anytime,numtimeunits,hour,
>minute, dayofweek,dayofmonth) = (2, true,1,null,null,null,null,)
>2-Dec-2014 11:15:16 AM dk.netarkivet.harvester.datamodel.Frequency
>getNewInstance
>FINE: Creating a DAILY frequency.
>2-Dec-2014 11:15:16 AM
>dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask
>generateJobs
>INFO: Harvest channel 'FOCUSED' has not yet been registered by any
>harvester, hence harvest definition 'UniversityIsland' (1) cannot be
>processed by the job generator for now.
>
>On 14-12-02 10:09 AM, Søren Vejrup Carlsen wrote:
>> Hi Charles.
>> We are currently investigating the problem.
>> So please be patient.
>>
>> At first sight, the cause is that the harvester's attempt to
>>register itself fails.
>> The reason is that the response message is invalid. Why this is so, I
>>don't know.
>>
>> There could be some information in the HarvestJobManagerApplication
>>logs as to why the registration fails.
>>
>> Best regards
>>
>> Søren V. Carlsen
>>
>> ________________________________________
>> From: NetarchiveSuite-users
>>[netarchivesuite-users-bounces at ml.sbforge.org] on behalf of Charles
>>Tassell [charles at islandadmin.ca]
>> Sent: 2 December 2014 14:08
>> To: netarchivesuite-users at ml.sbforge.org
>> Subject: [Netarchivesuite-users] Problems With the HarvestController With
>>4.4.1
>>
>> Hi Everyone,
>>
>>     Sorry if this is a dupe, I posted it to the list yesterday but I
>> don't think it went through as I wasn't subscribed.
>>
>>     I've recently been trying to get NetarchiveSuite 4.4.1 running but
>> I'm not having much luck, and it seems to be related to the interaction
>> between the controllers and the message broker service. Whenever I start
>> the high/low controller processes I get the following in the startup
>>logs:
>>
>> Starting
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> Version: 4.4.1 status RELEASE
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> Running
>> 1-Dec-2014 10:49:53 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>visit
>> SEVERE: Received message stating that channel 'HIGHPRIORITY' is invalid.
>> Will stop.
>>
>> Starting
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> Version: 4.4.1 status RELEASE
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> Running
>> 1-Dec-2014 10:49:53 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>visit
>> SEVERE: Received message stating that channel 'LOWPRIORITY' is invalid.
>> Will stop.
>>
>>     I saw an earlier post from October where Mikis recommended
>>restarting
>> the broker and making sure the queue was clear.  I did that but the
>> problem persists.  Any ideas as to how to fix it?  When I look at the
>> broker logs I see the system auto-creating queues like
>> HOSTNAME_COMMON_THIS_REPOS_CLIENT_IP_HCS_HIGH but no HIGHPRIORITY or
>> LOWPRIORITY so I'm wondering if I need to create some sort of user
>> accounts on the broker for the system to be able to register the
>>channels.
>>
>>     Oh, and on another note, in the 4.4.1 deployment script the log and
>> conf directories are not created automatically, which causes the install
>> to fail (the start/kill scripts and config settings aren't copied
>> over.)  I manually fixed this on my system during the install process
>> but I thought I should mention it.
>>
>> _______________________________________________
>> NetarchiveSuite-users mailing list
>> NetarchiveSuite-users at ml.sbforge.org
>> http://ml.sbforge.org/mailman/listinfo/netarchivesuite-users



