[Netarchivesuite-users] NetarchiveSuite Version: 4.4.0 does not run jobs

Mikis Seth Sørensen mss at statsbiblioteket.dk
Wed Oct 29 15:58:02 CET 2014


Sounds weird. It looks like the channel registered by the
HarvestControllerServer and the HarvestJobManager's harvestchannel table
agree on the existence of a FOCUSED channel.

Does the HarvestJobManager log contain any relevant information about the
FOCUSED channel conflict?
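
If it helps, here is a minimal sketch (Python, not part of NetarchiveSuite)
for pulling such records out of a log. It assumes the two-line
java.util.logging layout seen in your excerpts (a timestamp/class/method
line followed by a "LEVEL: message" line) with English month names. The
function names and the default level filter are made up for illustration.

```python
import re

# Assumed header layout, as in the excerpts in this thread, e.g.
# "Oct 29, 2014 10:43:01 AM dk.netarkivet...HarvestControllerServer visit"
HEADER = re.compile(r"^[A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M ")
# Standard java.util.logging level names at the start of the message line.
LEVEL = re.compile(r"^(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST):")


def split_records(lines):
    """Group raw log lines into records; each record starts at a header line."""
    records, current = [], None
    for raw in lines:
        line = raw.strip()
        if HEADER.match(line):
            current = [line]
            records.append(current)
        elif current is not None and line:
            current.append(line)
    return records


def channel_problems(lines, channel="FOCUSED", levels=("SEVERE", "WARNING")):
    """Return (header, message) pairs for records at the given levels
    whose message mentions the channel name."""
    hits = []
    for record in split_records(lines):
        for i, line in enumerate(record[1:], start=1):
            match = LEVEL.match(line)
            if match:
                # The message may continue over several lines; join them.
                message = " ".join(record[i:])
                if match.group(1) in levels and channel in message:
                    hits.append((record[0], message))
                break
    return hits
```

Calling channel_problems(open(path)) on the
HarvestJobManagerApplication_testcrawler0.log.0 file would list the SEVERE
and WARNING records that mention FOCUSED, if there are any.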

Best
Mikis

On 10/29/14, 11:17 AM, "Meelis Mihhailov" <meelis at nlib.ee> wrote:

>Hi Nicholas
>
>Checked it and there are two values present :)
>
>
>crawler=# SELECT * from harvestchannel;
>  id |   name   | issnapshot | isdefault |            comments
>----+----------+------------+-----------+--------------------------------
>   2 | FOCUSED  | f          | t         | Channel for selective harvests
>   1 | SNAPSHOT | t          | t         | Channel for snapshot harvests
>(2 rows)
>
>
>
>-----------------------------------------------------
>Meelis Mihhailov
>Süsteemiadministraator / Systemadministrator
>Eesti Rahvusraamatukogu / National Library Of Estonia
>
>Telefon: 630 7178 / Phone: +372 630 7178
>E-post: meelis at nlib.ee / E-mail: meelis at nlib.ee
>
>Tõnismägi 2, 15189 Tallinn, ESTONIA
>
>www.eestirahvusraamatukogu.ee
>-----------------------------------------------------
>
>On 29.10.2014 12:03, Nicholas Clarke wrote:
>> It looks like the harvestchannel table in the postgresql database is
>>empty?
>>
>> -Nicholas
>>
>>> -----Original message-----
>>> From: NetarchiveSuite-users [mailto:netarchivesuite-users-
>>> bounces at ml.sbforge.org] On behalf of Meelis Mihhailov
>>> Sent: 29 October 2014 10:42
>>> To: netarchive Suite Users
>>> Subject: [Netarchivesuite-users] NetarchiveSuite Version: 4.4.0 does not
>>> run jobs
>>>
>>> Hi all!
>>>
>>> Installed version 4.4.0 with quick start setup running on PostgreSQL
>>> database. Installed all the needed sql files, did the updates, added
>>> indexes and after running the start all script I can access web
>>> interface and add definitions, configurations etc.
>>>
>>> The problem is that when I activate a job, it won't go past "new" status.
>>>
>>> Checked the logs and in the
>>> HarvestJobManagerApplication_testcrawler0.log.0 file I can see this:
>>>
>>> ----------------------------------------------------------------------
>>> FINE: Creating Job 1 (state = NEW, HD = 1, channel = FOCUSED, snapshot
>>> =
>>> false, forcemaxcount = -1, forcemaxbytes = 1000000000,
>>> forcemaxrunningtime = 0, orderxml = default_orderxml, numconfigs = 1,
>>> created = Wed Oct 29 10:48:13 EET 2014)
>>> Oct 29, 2014 10:48:13 AM dk.netarkivet.harvester.datamodel.Job
>>> getHarvestFilenamePrefix
>>> WARNING: HarvestnamePrefix not yet set for job 1. Set it by using the
>>> naming scheme. This should only happen for old jobs being read
>>> Oct 29, 2014 10:48:13 AM dk.netarkivet.harvester.datamodel.Job
>>> setDefaultHarvestNamePrefix
>>> FINE: Applying the default ArchiveFileNaming class
>>> 'dk.netarkivet.harvester.harvesting.LegacyNamingConvention'.
>>> Oct 29, 2014 10:48:13 AM dk.netarkivet.harvester.datamodel.Job
>>> setDefaultHarvestNamePrefix
>>> FINE: The harvestPrefix of this job is: 1-1
>>> Oct 29, 2014 10:48:14 AM
>>> dk.netarkivet.harvester.scheduler.jobgen.DefaultJobGenerator
>>> processDomainConfigurationSubset
>>> FINE: Created # 1 jobs for harvest # 1
>>> Oct 29, 2014 10:48:14 AM
>>> dk.netarkivet.harvester.scheduler.jobgen.AbstractJobGenerator
>>> generateJobs
>>> INFO: Finished generating 1 jobs for harvestdefinition # 1
>>> Oct 29, 2014 10:48:14 AM
>>> dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask$
>>> 1 run
>>> INFO: Created 1 jobs for harvest definition (MEELIS)
>>> Oct 29, 2014 10:48:14 AM
>>> dk.netarkivet.harvester.datamodel.HarvestDefinitionDBDAO update
>>> FINE: 1 partialharvests records updated
>>> Oct 29, 2014 10:48:14 AM
>>> dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask$
>>> 1 run
>>> FINE: Removed HD #1(MEELIS) from list of harvestdefinitions to be
>>> scheduled. Harvestdefinitions still to be scheduled: []
>>> ---------------------------------------------------------------------
>>>
>>> System state says that the job has been created, but from that moment
>>> ... nothing happens. It just stays there with status "New" and NAS does
>>> nothing.
>>>
>>> There are, however, some interesting statuses in the system state for
>>> some of the harvest applications. For example, application high_11:
>>>
>>> -------------------------------------------------------------------
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        0
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Requested to check the validity of harvest channel 'FOCUSED'
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        1
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> close
>>> INFO: Closed down HarvestControllerServer
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        2
>>> Oct 29, 2014 10:43:01 AM dk.netarkivet.common.distribute.JMSConnection
>>> removeListener
>>> INFO: Removing listener from channel
>>> 'TESTCRAWLER_COMMON_HCHAN_VAL_RESP'
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        3
>>> Oct 29, 2014 10:43:01 AM dk.netarkivet.common.distribute.JMSConnection
>>> removeListener
>>> INFO: Removing listener from channel
>>> 'TESTCRAWLER_COMMON_THIS_REPOS_CLIENT_127_0_1_1_HCS_HIGH_11'
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        4
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> close
>>> INFO: Closing HarvestControllerServer.
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        5
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> visit
>>> SEVERE: Received message stating that channel 'FOCUSED' is invalid.
>>> Will
>>> stop.
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        6
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient
>>> <init>
>>> INFO: JMSArcRepository listens for replies on channel '[Queue
>>> 'TESTCRAWLER_COMMON_THIS_REPOS_CLIENT_
>>> 127_0_1_1_HCS_HIGH_11']'
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        7
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient
>>> <init>
>>> INFO: JMSArcRepositoryClient will retry a store 3 times and timeout on
>>> each try after 3600000 milliseconds, and timeout on each getrequest
>>> after
>>> 300000 milliseconds.
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        8
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Harvesting requires at least 400000000 bytes free.
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        9
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Serverdir: 'harvester_high_11'
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        10
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Bound to harvest channel 'FOCUSED'
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        11
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Starting HarvestControllerServer.
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        12
>>> Oct 29, 2014 10:43:00 AM
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>>> getConnectionFactory
>>> INFO: Establishing SunMQ JMS Connection to 'localhost:7676'
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        13
>>> Oct 29, 2014 10:43:00 AM
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>>> <init>
>>> INFO: Creating instance of
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        14
>>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.ApplicationUtils
>>> logAndPrint
>>> INFO:
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> Running
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        15
>>> Oct 29, 2014 10:43:00 AM
>>> dk.netarkivet.common.management.MBeanConnectorCreator
>>> exposeJMXMBeanServer
>>> INFO: Registered mbean server in registry on port 5111 communicating on
>>> port 5211 using password file 'conf/jmxremote.password'.
>>> Service URL is
>>> service:jmx:rmi://veebiarhiiv.nlib.ee:5211/jndi/rmi://veebiarhiiv.nlib.
>>> ee:5111/jmxrmi
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        16
>>> Oct 29, 2014 10:43:00 AM
>>> dk.netarkivet.monitor.distribute.JMSMonitorRegistryClient
>>> register
>>> INFO: Registering this client for monitoring every 1 minutes, using
>>> hostname
>>> 'veebiarhiiv.nlib.ee' and JMX/RMI ports 5111/5211
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        17
>>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.ApplicationUtils
>>> startApp
>>> INFO: Using settings files
>>> '/arhiiv/testcrawler/TESTCRAWLER/conf/settings_HarvestControllerApplica
>>> tion_high_11.xml'
>>>
>>> veebiarhiiv HarvestControllerServer high_11
>>>     FOCUSED ReplicaA        18
>>> Oct 29, 2014 10:42:56 AM dk.netarkivet.common.utils.ApplicationUtils
>>> logAndPrint
>>> INFO: Starting
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> Version: 4.4.0 status RELEASE
>>> ----------------------------------------------------------------------
>>>
>>> And the logfile for the high_11 application:
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Oct 29, 2014 10:42:57 AM dk.netarkivet.common.utils.Settings getAll
>>> FINE: Searching for a setting for key:
>>> settings.common.replicas.replica.replicaId
>>> Oct 29, 2014 10:42:57 AM dk.netarkivet.common.utils.Settings getAll
>>> FINE: Value found in loaded data: A
>>> Oct 29, 2014 10:42:56 AM dk.netarkivet.common.utils.ApplicationUtils
>>> logAndPrint
>>> INFO: Starting
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> Version: 4.4.0 status RELEASE
>>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.ApplicationUtils
>>> startApp
>>> INFO: Using settings files
>>> '/arhiiv/testcrawler/TESTCRAWLER/conf/settings_HarvestControllerApplica
>>> tion_high_11.xml'
>>> Oct 29, 2014 10:43:00 AM
>>> dk.netarkivet.monitor.distribute.JMSMonitorRegistryClient register
>>> INFO: Registering this client for monitoring every 1 minutes, using
>>> hostname 'veebiarhiiv.nlib.ee' and JMX/RMI ports 5111/5211
>>> Oct 29, 2014 10:43:00 AM
>>> dk.netarkivet.common.management.MBeanConnectorCreator
>>> exposeJMXMBeanServer
>>> INFO: Registered mbean server in registry on port 5111 communicating on
>>> port 5211 using password file 'conf/jmxremote.password'.
>>> Service URL is
>>> service:jmx:rmi://veebiarhiiv.nlib.ee:5211/jndi/rmi://veebiarhiiv.nlib.
>>> ee:5111/jmxrmi
>>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.ApplicationUtils
>>> logAndPrint
>>> INFO:
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> Running
>>> Oct 29, 2014 10:43:00 AM
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ <init>
>>> INFO: Creating instance of
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.Settings getAll
>>> FINE: Searching for a setting for key:
>>> settings.common.topLevelDomains.tld
>>> Oct 29, 2014 10:43:00 AM
>>> dk.netarkivet.common.distribute.JMSConnectionSunMQ getConnectionFactory
>>> INFO: Establishing SunMQ JMS Connection to 'localhost:7676'
>>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.Settings getAll
>>> FINE: Value found in classpath data:
>>> ac,ad,ae,aero,af,ag,ai,al,am,an,ao,aq,ar,arpa,as,gv.at,ac.at,or.at,co.a
>>> t,biz.at,info.at,priv.at,at,au,aw,ax,az,ba,bb,bd,be,bf,bg,bh,bi,biz,bj,
>>> bm,bn,bo,br,bs,bt,bv,bw,by,bz,ca,cat,cc,cd,cf,cg,ch,ci,ck,cl,cm,cn,co,c
>>> om,coop,cr,cs,cu,cv,cx,cy,cz,de,dj,dk,dm,do,dz,ec,edu,ee,eg,eh,er,es,et
>>> ,eu,fi,fj,fk,fm,fo,aeroport.fr,asso.fr,avoues.fr,chambagri.fr,com.fr,go
>>> uv.fr,medecin.fr,nom.fr,pharmacien.fr,port.fr,prd.fr,presse.fr,tm.fr,fr
>>> ,ga,gb,gd,ge,gf,gg,gh,gi,gl,gm,gn,gov,gp,gq,gr,gs,gt,gu,gw,gy,hk,hm,hn,
>>> hr,ht,hu,id,ie,il,im,in,info,int,io,iq,ir,is,it,je,jm,jo,jobs,jp,ke,kg,
>>> kh,ki,km,kn,kp,kr,kw,ky,kz,la,lb,lc,li,lk,lr,ls,lt,lu,lv,ly,ma,mc,md,me
>>> ,mg,mh,mil,mk,ml,mm,mn,mo,mobi,mp,mq,mr,ms,mt,mu,museum,mv,mw,mx,my,mz,
>>> na,name,nc,ne,net,nf,ng,ni,nl,no,np,nr,nt,nu,nz,om,org,pa,pe,pf,pg,ph,p
>>> k,pl,pm,pn,pr,pro,ps,pt,pw,py,qa,asso.re,com.re,re,ro,ru,rw,sa,sb,sc,sd
>>> ,se,sg,sh,si,sj,sk,sl,sm,sn,so,sr,st,su,sv,sy,sz,tc,td,at.tf,net.tf,tf,
>>> tg,th,tj,tk,tl,tm,tn,to,tp,tr,travel,tt,tv,tw,tz,ua,ug,ac.uk,co.uk,gov.uk,
>>> ltd.uk,me.uk,mod.uk,net.uk,nic.uk,nhs.uk,org.uk,plc.uk,police.uk,sch.uk
>>> ,govt.uk,orgn.uk,lea.uk,mil.uk,nel.uk,uk,us,uy,uz,va,vc,ve,vg,vi,vn,vu,
>>> wien,wf,ws,ye,yt,yu,za,zm,zw
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Starting HarvestControllerServer.
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Bound to harvest channel 'FOCUSED'
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Serverdir: 'harvester_high_11'
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Harvesting requires at least 400000000 bytes free.
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient
>>> <init>
>>> INFO: JMSArcRepositoryClient will retry a store 3 times and timeout on
>>> each try after 3600000 milliseconds, and timeout on each getrequest
>>> after 300000 milliseconds.
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient
>>> <init>
>>> INFO: JMSArcRepository listens for replies on channel '[Queue
>>> 'TESTCRAWLER_COMMON_THIS_REPOS_CLIENT_127_0_1_1_HCS_HIGH_11']'
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> FINE: Obtained JMS connection.
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> visit
>>> SEVERE: Received message stating that channel 'FOCUSED' is invalid.
>>> Will
>>> stop.
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> close
>>> INFO: Closing HarvestControllerServer.
>>> Oct 29, 2014 10:43:01 AM dk.netarkivet.common.distribute.JMSConnection
>>> removeListener
>>> INFO: Removing listener from channel
>>> 'TESTCRAWLER_COMMON_THIS_REPOS_CLIENT_127_0_1_1_HCS_HIGH_11'
>>> Oct 29, 2014 10:43:01 AM dk.netarkivet.common.distribute.JMSConnection
>>> removeListener
>>> INFO: Removing listener from channel
>>> 'TESTCRAWLER_COMMON_HCHAN_VAL_RESP'
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> close
>>> INFO: Closed down HarvestControllerServer
>>> Oct 29, 2014 10:43:01 AM
>>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>>> <init>
>>> INFO: Requested to check the validity of harvest channel 'FOCUSED'
>>>
>>> ----------------------------------------------------------------------
>>>
>>>
>>> Setup uses postgresql with two databases:
>>> * crawler (all the harvesting data)
>>> * crawleradmin (all the admin related data)
>>>
>>> There seem to be no connection errors related to the database. Data is
>>> read and written there.
>>>
>>> Have I missed something while installing the software?
>>>
>>> Steps taken:
>>>
>>> 1. Installed and started MQ
>>> 2. Created and modified deploy xml to fit my needs (db info, 20 harvest
>>> applications)
>>> 3. Installed the application
>>> 4. Updated database and created index (according to the manual)
>>> 5. Uploaded deploy xml
>>> 6. Started with startall script
>>>
>>> I can access the web interface, add harvest definitions, and edit all
>>> data that can be edited. Running jobs, however, stop at status "new".
>>>
>>> To be honest ... I have no idea what to check next. Any help on this
>>> issue is welcome :)
>>>
>>>
>>>
>>> -----------------------------------------------------
>>> Meelis Mihhailov
>>> Süsteemiadministraator / Systemadministrator
>>> Eesti Rahvusraamatukogu / National Library Of Estonia
>>>
>>> Telefon: 630 7178 / Phone: +372 630 7178
>>> E-post: meelis at nlib.ee / E-mail: meelis at nlib.ee
>>>
>>> Tõnismägi 2, 15189 Tallinn, ESTONIA
>>>
>>> www.eestirahvusraamatukogu.ee
>>> -----------------------------------------------------
>>> _______________________________________________
>>> NetarchiveSuite-users mailing list
>>> NetarchiveSuite-users at ml.sbforge.org
>>> http://ml.sbforge.org/mailman/listinfo/netarchivesuite-users



