[Netarchivesuite-users] Problems With the HarvestController With 4.4.1

Mikis Seth Sørensen mss at statsbiblioteket.dk
Thu Dec 4 08:52:48 CET 2014


Great you got it working :-)

Thanks for the feedback. We’ll have a look at the missing conf and log
folders, and update the remaining deploy configurations with the new
harvest channel names.

Cheers
Mikis

On 12/3/14, 6:42 PM, "Charles Tassell" <ctassell at gmail.com> wrote:

>Hi Mikis,
>
>   Thanks, I think that got me sorted out.  I wasn't able to get into
>the harvest channels, so I disabled IPv6 on my machine and then reinstalled
>using deploy_standalone_example.xml rather than
>deploy_standalone_example_with_wayback_apps.xml.  The non-wayback example
>has the correct queue names in it and it seems to be working fine, so my
>issues are probably related to the example file not being updated with
>the new queue names.  I'll cut and paste them in and give that a try
>after my test harvest completes.
>
>   Just a note, while debugging I tried installing 4.4.0 and 4.2.0 and
>they do properly create the QUICKSTART/{log,conf} folders, so those two
>dirs not being created does seem to be an issue with the 4.4.1 install
>script.
>
>
>On 14-12-03 06:07 AM, Mikis Seth Sørensen wrote:
>> Hi Charles
>>
>> This looks like a known issue with inconsistent harvest channel
>> configurations. The harvest channels are configured in two places:
>> * The harvest server, through the harvest channel GUI as described here:
>> https://sbforge.org/display/NASDOC44/Harvest+Channels.
>> * The configuration for the individual harvest controllers in the
>> deployed settings-XXX.xml, see
>> https://sbforge.org/display/NASDOC44/The+Deploy+Configuration+File#TheDeployConfigurationFile-Howtoaddaharvestermoreonthesamemachineandsetalltoselectiveharvesting(FOCUSEDchannel).
>>
>> So the harvest controller channel setting is used during the controller's
>> startup to register the controller with a specific harvester channel pool
>> defined on the server. If the server hasn't got a pool defined with a
>> name corresponding to the one sent in the registration message from the
>> HarvestController, the registration fails with an error message like the
>> one you see.
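
A minimal sketch of the registration check described above. It is a toy model, not NetarchiveSuite code: the class `ChannelRegistrationSketch` and its `register` method are hypothetical, and the in-memory set merely stands in for the channels defined on the harvest server.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical model of the server-side channel check; not NetarchiveSuite code. */
public class ChannelRegistrationSketch {
    /** Channel names defined on the server (stand-in for the server's channel list). */
    private final Set<String> definedChannels = new LinkedHashSet<>();

    public ChannelRegistrationSketch(String... channels) {
        for (String channel : channels) {
            definedChannels.add(channel);
        }
    }

    /** Returns true if a controller announcing this channel name would be accepted. */
    public boolean register(String channelFromController) {
        return definedChannels.contains(channelFromController);
    }

    public static void main(String[] args) {
        // Server knows only the new channel names.
        ChannelRegistrationSketch server =
                new ChannelRegistrationSketch("SNAPSHOT", "FOCUSED");
        // Controllers still configured with the old names are rejected.
        for (String name : new String[] {"HIGHPRIORITY", "LOWPRIORITY", "FOCUSED"}) {
            String outcome = server.register(name) ? "registered" : "invalid";
            System.out.println(name + " -> " + outcome);
        }
    }
}
```

In this model, only a controller whose configured name exactly matches a name known to the server is accepted, which mirrors the "channel is invalid" responses in the logs.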
>>
>> This problem is typically seen as part of the switch from the obsolete
>> hardcoded harvester 'priorities' 'HIGHPRIORITY' and 'LOWPRIORITY' to the
>> dynamically named harvest channels 'SNAPSHOT' and 'FOCUSED'. This is
>> exactly what you see in the logs: the harvest server doesn't recognize
>> LOWPRIORITY/HIGHPRIORITY and sends a response stating the channel is
>> invalid. On the other hand, the HarvestJobManagerApplication on the
>> server states that no controllers are registering on the 'FOCUSED'
>> channel.
>>
>> So to fix the problem I would suggest updating the settings for the
>> HarvestController so that LOWPRIORITY/HIGHPRIORITY is changed to
>> SNAPSHOT/FOCUSED. You can check the actual server naming of the
>> channels through the
>> https://sbforge.org/display/NASDOC44/Harvest+Channels
>> GUI.
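
As a rough aid for the fix above, the following sketch scans the text of a settings file for the legacy names and reports the suggested replacement. It makes several assumptions: the class `LegacyChannelNames` is hypothetical, the pairing LOWPRIORITY to SNAPSHOT and HIGHPRIORITY to FOCUSED follows the order given in the message, and the `<channel>` element in the sample is illustrative rather than the confirmed settings key. The channel names actually defined on the server should still be checked in the GUI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that flags legacy channel names in settings text. */
public class LegacyChannelNames {
    /** Assumed mapping from old priority names to new channel names. */
    private static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("HIGHPRIORITY", "FOCUSED");
        RENAMES.put("LOWPRIORITY", "SNAPSHOT");
    }

    /** Returns each legacy name found in the text, mapped to its suggested replacement. */
    public static Map<String, String> suggestRenames(String settingsText) {
        Map<String, String> found = new LinkedHashMap<>();
        for (Map.Entry<String, String> rename : RENAMES.entrySet()) {
            if (settingsText.contains(rename.getKey())) {
                found.put(rename.getKey(), rename.getValue());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Illustrative fragment only; the element name is an assumption.
        String sample = "<channel>HIGHPRIORITY</channel>";
        suggestRenames(sample).forEach(
                (oldName, newName) -> System.out.println(oldName + " -> " + newName));
    }
}
```

The helper only reports matches and never rewrites the file, so the deployed settings stay under manual control.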
>>
>> Hope this helps
>> Mikis
>>
>> On 12/2/14, 4:03 PM, "Søren Vejrup Carlsen" <svc at kb.dk> wrote:
>>
>>> Hi Charles.
>>> No, it is not a known problem as far as I know. By "currently" I
>>> meant that your mail has prompted us to look into the problem.
>>>
>>> Thanks for the loglines.
>>> I have now looked into the java code behind this:
>>> dk.netarkivet.harvester.scheduler.HarvesterStatusReceiver visit
>>> INFO: Sent a message to notify that harvest channel 'HIGHPRIORITY' is
>>> invalid.
>>>
>>> The channel is only invalid if the requested channel is not in the
>>> harvestchannel table.
>>> Any channels used by the harvesters must be added to the harvestchannel
>>> table.
>>>
>>>
>>> Best Regards
>>> Søren
>>> ________________________________________
>>> From: NetarchiveSuite-users
>>> [netarchivesuite-users-bounces at ml.sbforge.org]
>>> on behalf of Charles Tassell [ctassell at gmail.com]
>>> Sent: 2 December 2014 15:41
>>> To: netarchivesuite-users at ml.sbforge.org
>>> Subject: Re: [Netarchivesuite-users] Problems With the HarvestController
>>> With 4.4.1
>>>
>>> Hi Søren,
>>>
>>>    Thanks, so this is a known problem?  Would downgrading to 4.4.0 help
>>> or would I have to go back to 4.3 or earlier?
>>>
>>>    Here is my HarvestJobManagerApplication0.log.0 file after a restart
>>> if that helps.  It says that the HIGH/LOWPRIORITY and FOCUSED channels
>>> have not been created, so yeah it seems to be an issue with talking to
>>> OpenMQ...  I checked the OpenMQ logs and there are no real errors...
>>> The NAS components connect as the guest user fine.
>>>
>>> netarchive@webarchive:~/ROBLIB/conf/log$ cat HarvestJobManagerApplication0.log.0
>>> 2-Dec-2014 11:15:03 AM dk.netarkivet.common.utils.Settings getAll
>>> FINE: Searching for a setting for key:
>>> settings.common.replicas.replica.replicaId
>>> 2-Dec-2014 11:15:03 AM dk.netarkivet.common.utils.Settings getAll
>>> FINE: Value found in loaded data: A
>>> 2-Dec-2014 11:15:02 AM
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ <init>
>>> INFO: Creating instance of
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>>> 2-Dec-2014 11:15:08 AM
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ getConnectionFactory
>>> INFO: Establishing SunMQ JMS Connection to 'localhost:7676'
>>> 2-Dec-2014 11:15:12 AM
>>> dk.netarkivet.harvester.datamodel.HarvestDBConnection initDataSource
>>> INFO: Connection pool initialized with the following values:
>>> - minPoolSize=5
>>> - maxPoolSize=10
>>> - acquireIncrement=5
>>> - maxStatements=0
>>> - maxStatementsPerConnection=0
>>> - idleConnTestPeriod=0
>>> - idleConnTestQuery='null'
>>> - idleConnTestOnCheckin=false
>>> 2-Dec-2014 11:15:14 AM dk.netarkivet.harvester.scheduler.JobDispatcher
>>> <init>
>>> INFO: Creating JobDispatcher
>>> 2-Dec-2014 11:15:14 AM dk.netarkivet.common.utils.ApplicationUtils
>>> logAndPrint
>>> INFO: Starting dk.netarkivet.harvester.scheduler.HarvestJobManager
>>> Version: 4.4.1 status RELEASE
>>> 2-Dec-2014 11:15:14 AM dk.netarkivet.common.utils.ApplicationUtils
>>> startApp
>>> INFO: Using settings files
>>> '/opt/netarchive/ROBLIB/conf/settings_HarvestJobManagerApplication.xml'
>>> 2-Dec-2014 11:15:14 AM
>>> dk.netarkivet.monitor.distribute.JMSMonitorRegistryClient register
>>> INFO: Registering this client for monitoring every 1 minutes, using
>>> hostname 'webarchive.upei.ca' and JMX/RMI ports 8118/8218
>>> 2-Dec-2014 11:15:14 AM
>>> dk.netarkivet.common.management.MBeanConnectorCreator exposeJMXMBeanServer
>>> INFO: Registered mbean server in registry on port 8118 communicating on
>>> port 8218 using password file 'conf/jmxremote.password'.
>>> Service URL is
>>> service:jmx:rmi://webarchive.upei.ca:8218/jndi/rmi://webarchive.upei.ca:8118/jmxrmi
>>> 2-Dec-2014 11:15:14 AM
>>> dk.netarkivet.common.lifecycle.LifeCycleComponent start
>>> FINE: Starting
>>> dk.netarkivet.harvester.scheduler.HarvestJobManager@25dd9891
>>> 2-Dec-2014 11:15:14 AM dk.netarkivet.common.utils.ApplicationUtils
>>> logAndPrint
>>> INFO: dk.netarkivet.harvester.scheduler.HarvestJobManager Running
>>> 2-Dec-2014 11:15:14 AM dk.netarkivet.harvester.scheduler.JobSupervisor
>>> rescheduleLeftOverJobs
>>> INFO: 0 jobs has been resubmitted.
>>> 2-Dec-2014 11:15:15 AM
>>> dk.netarkivet.harvester.scheduler.HarvesterStatusReceiver visit
>>> INFO: Sent a message to notify that harvest channel 'LOWPRIORITY' is
>>> invalid.
>>> 2-Dec-2014 11:15:15 AM
>>> dk.netarkivet.harvester.scheduler.HarvesterStatusReceiver visit
>>> INFO: Sent a message to notify that harvest channel 'HIGHPRIORITY' is
>>> invalid.
>>> 2-Dec-2014 11:15:15 AM
>>> dk.netarkivet.harvester.datamodel.HarvestDefinitionDBDAO read
>>> FINE: Reading harvestdefinition w/ id 1
>>> 2-Dec-2014 11:15:15 AM
>>> dk.netarkivet.harvester.datamodel.HarvestDefinitionDBDAO read
>>> FINE: Partialharvest found w/ id 1
>>> 2-Dec-2014 11:15:16 AM dk.netarkivet.harvester.datamodel.ScheduleDBDAO
>>> read
>>> FINE: Creating frequency for (timeunit,anytime,numtimeunits,hour,
>>> minute, dayofweek,dayofmonth) = (2, true,1,null,null,null,null,)
>>> 2-Dec-2014 11:15:16 AM dk.netarkivet.harvester.datamodel.Frequency
>>> getNewInstance
>>> FINE: Creating a DAILY frequency.
>>> 2-Dec-2014 11:15:16 AM dk.netarkivet.harvester.datamodel.ScheduleDBDAO
>>> read
>>> FINE: Creating frequency for (timeunit,anytime,numtimeunits,hour,
>>> minute, dayofweek,dayofmonth) = (2, true,1,null,null,null,null,)
>>> 2-Dec-2014 11:15:16 AM dk.netarkivet.harvester.datamodel.Frequency
>>> getNewInstance
>>> FINE: Creating a DAILY frequency.
>>> 2-Dec-2014 11:15:16 AM
>>> dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask
>>> generateJobs
>>> INFO: Harvest channel 'FOCUSED' has not yet been registered by any
>>> harvester, hence harvest definition 'UniversityIsland' (1) cannot be
>>> processed by the job generator for now.
>>>
>>> On 14-12-02 10:09 AM, Søren Vejrup Carlsen wrote:
>>>> Hi Charles.
>>>> We are currently investigating the problem.
>>>> So please be patient.
>>>>
>>>> At first sight, the cause is that the harvester's attempt to
>>>> register itself fails.
>>>> The reason is that the response message is invalid. Why this is so, I
>>>> don't know.
>>>>
>>>> There could be some information in the HarvestJobManagerApplication
>>>> logs as to why the registration fails?
>>>>
>>>> Best regards
>>>>
>>>> Søren V. Carlsen
>>>>
>>>> ________________________________________
>>>> From: NetarchiveSuite-users
>>>> [netarchivesuite-users-bounces at ml.sbforge.org] on behalf of
>>>> Charles Tassell [charles at islandadmin.ca]
>>>> Sent: 2 December 2014 14:08
>>>> To: netarchivesuite-users at ml.sbforge.org
>>>> Subject: [Netarchivesuite-users] Problems With the HarvestController With
>>>> 4.4.1
>>>>
>>>> Hi Everyone,
>>>>
>>>>      Sorry if this is a dupe, I posted it to the list yesterday but I
>>>> don't think it went through as I wasn't subscribed.
>>>>
>>>>      I've recently been trying to get NetarchiveSuite 4.4.1 running, but
>>>> I'm not having much luck, and it seems to be related to the interaction
>>>> between the controllers and the message broker service. Whenever I start
>>>> the high/low controller processes I get the following in the startup
>>>> logs:
>>>>
>>>> Starting
>>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>> Version: 4.4.1 status RELEASE
>>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>> Running
>>>> 1-Dec-2014 10:49:53 AM
>>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>> visit
>>>> SEVERE: Received message stating that channel 'HIGHPRIORITY' is invalid.
>>>> Will stop.
>>>>
>>>> Starting
>>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>> Version: 4.4.1 status RELEASE
>>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>> Running
>>>> 1-Dec-2014 10:49:53 AM
>>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>>> visit
>>>> SEVERE: Received message stating that channel 'LOWPRIORITY' is invalid.
>>>> Will stop.
>>>>
>>>>      I saw an earlier post from October where Mikis recommended
>>>> restarting
>>>> the broker and making sure the queue was clear.  I did that but the
>>>> problem persists.  Any ideas as to how to fix it?  When I look at the
>>>> broker logs I see the system auto-creating queues like
>>>> HOSTNAME_COMMON_THIS_REPOS_CLIENT_IP_HCS_HIGH but no HIGHPRIORITY or
>>>> LOWPRIORITY so I'm wondering if I need to create some sort of user
>>>> accounts on the broker for the system to be able to register the
>>>> channels.
>>>>
>>>>      Oh, and on another note, in the 4.4.1 deployment script the log and
>>>> conf directories are not created automatically, which causes the install
>>>> to fail (the start/kill scripts and config settings aren't copied
>>>> over).  I manually fixed this on my system during the install process
>>>> but I thought I should mention it.
>>>>
>>>> _______________________________________________
>>>> NetarchiveSuite-users mailing list
>>>> NetarchiveSuite-users at ml.sbforge.org
>>>> http://ml.sbforge.org/mailman/listinfo/netarchivesuite-users
