[Netarchivesuite-users] NetarchiveSuite Version: 4.4.0 does not run jobs

Meelis Mihhailov meelis at nlib.ee
Wed Oct 29 11:17:09 CET 2014


Hi Nicholas

I checked it, and the table is not empty: it contains two rows :)


crawler=# SELECT * from harvestchannel;
  id |   name   | issnapshot | isdefault |            comments
----+----------+------------+-----------+--------------------------------
   2 | FOCUSED  | f          | t         | Channel for selective harvests
   1 | SNAPSHOT | t          | t         | Channel for snapshot harvests
(2 rows)
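
Since the table is populated, the SEVERE entry in the high_11 log quoted below stands out. Here is a minimal sketch for pulling entries of a given level out of a log in that two-line java.util.logging layout. The function names and the sample text are my own illustration and not part of NetarchiveSuite; it also assumes that a message runs until the next timestamp line.

```python
import re

# First line of an entry, e.g. "Oct 29, 2014 10:43:01 AM some.Class method"
HEADER = re.compile(r"^([A-Z][a-z]{2} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M)\b")
# Line that starts the message, e.g. "SEVERE: Received message ..."
LEVEL = re.compile(r"^(SEVERE|WARNING|INFO|CONFIG|FINE|FINER|FINEST): ?(.*)$")


def parse_entries(log_text):
    """Return a list of [timestamp, level, message] for each log entry."""
    entries = []
    stamp = None
    in_message = False
    for raw in log_text.splitlines():
        # Drop mail quote markers so text pasted from a reply also works.
        line = raw.lstrip("> ").strip()
        header = HEADER.match(line)
        level = LEVEL.match(line)
        if header:
            # A new entry begins; wrapped class/method lines may follow.
            stamp = header.group(1)
            in_message = False
        elif level:
            entries.append([stamp, level.group(1), level.group(2)])
            in_message = True
        elif in_message and line:
            # Continuation of a message that was wrapped over several lines.
            entries[-1][2] = (entries[-1][2] + " " + line).strip()
    return entries


def entries_at_level(log_text, wanted="SEVERE"):
    """Return (timestamp, message) pairs for entries logged at `wanted`."""
    return [(t, msg) for t, lvl, msg in parse_entries(log_text) if lvl == wanted]


# Hypothetical sample, copied from the wrapped log lines in this thread.
SAMPLE = """\
Oct 29, 2014 10:43:01 AM
dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
visit
SEVERE: Received message stating that channel 'FOCUSED' is invalid.
Will
stop.
Oct 29, 2014 10:43:01 AM
dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
close
INFO: Closing HarvestControllerServer.
"""

for stamp, message in entries_at_level(SAMPLE):
    print(stamp, "|", message)
```

Calling entries_at_level(text, "WARNING") on the contents of a log file would list the warnings in the same way.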



-----------------------------------------------------
Meelis Mihhailov
Süsteemiadministraator / Systemadministrator
Eesti Rahvusraamatukogu / National Library Of Estonia

Telefon: 630 7178 / Phone: +372 630 7178
E-post: meelis at nlib.ee / E-mail: meelis at nlib.ee

Tõnismägi 2, 15189 Tallinn, ESTONIA

www.eestirahvusraamatukogu.ee
-----------------------------------------------------

On 29.10.2014 12:03, Nicholas Clarke wrote:
> It looks like the harvestchannel table in the postgresql database is empty?
>
> -Nicholas
>
>> -----Original message-----
>> From: NetarchiveSuite-users [mailto:netarchivesuite-users-bounces at ml.sbforge.org]
>> On behalf of Meelis Mihhailov
>> Sent: 29 October 2014 10:42
>> To: netarchive Suite Users
>> Subject: [Netarchivesuite-users] NetarchiveSuite Version: 4.4.0 does not
>> run jobs
>>
>> Hi all!
>>
>> I installed version 4.4.0 with the quick start setup, running on a
>> PostgreSQL database. I installed all the needed SQL files, applied the
>> updates and added the indexes. After running the startall script I can
>> access the web interface and add definitions, configurations, etc.
>>
>> The problem is that when I activate a job, it won't go past the "new"
>> status.
>>
>> Checked the logs and in the
>> HarvestJobManagerApplication_testcrawler0.log.0 file I can see this:
>>
>> ----------------------------------------------------------------------
>> FINE: Creating Job 1 (state = NEW, HD = 1, channel = FOCUSED,
>> snapshot = false, forcemaxcount = -1, forcemaxbytes = 1000000000,
>> forcemaxrunningtime = 0, orderxml = default_orderxml, numconfigs = 1,
>> created = Wed Oct 29 10:48:13 EET 2014)
>> Oct 29, 2014 10:48:13 AM dk.netarkivet.harvester.datamodel.Job
>> getHarvestFilenamePrefix
>> WARNING: HarvestnamePrefix not yet set for job 1. Set it by using the
>> naming scheme. This should only happen for old jobs being read
>> Oct 29, 2014 10:48:13 AM dk.netarkivet.harvester.datamodel.Job
>> setDefaultHarvestNamePrefix
>> FINE: Applying the default ArchiveFileNaming class
>> 'dk.netarkivet.harvester.harvesting.LegacyNamingConvention'.
>> Oct 29, 2014 10:48:13 AM dk.netarkivet.harvester.datamodel.Job
>> setDefaultHarvestNamePrefix
>> FINE: The harvestPrefix of this job is: 1-1
>> Oct 29, 2014 10:48:14 AM
>> dk.netarkivet.harvester.scheduler.jobgen.DefaultJobGenerator
>> processDomainConfigurationSubset
>> FINE: Created # 1 jobs for harvest # 1
>> Oct 29, 2014 10:48:14 AM
>> dk.netarkivet.harvester.scheduler.jobgen.AbstractJobGenerator
>> generateJobs
>> INFO: Finished generating 1 jobs for harvestdefinition # 1
>> Oct 29, 2014 10:48:14 AM
>> dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask$1
>> run
>> INFO: Created 1 jobs for harvest definition (MEELIS)
>> Oct 29, 2014 10:48:14 AM
>> dk.netarkivet.harvester.datamodel.HarvestDefinitionDBDAO update
>> FINE: 1 partialharvests records updated
>> Oct 29, 2014 10:48:14 AM
>> dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask$1
>> run
>> FINE: Removed HD #1(MEELIS) from list of harvestdefinitions to be
>> scheduled. Harvestdefinitions still to be scheduled: []
>> ---------------------------------------------------------------------
>>
>> The system state says that the job has been created, but from that
>> moment on ... nothing happens. It just stays there with status "New"
>> and NAS does nothing.
>>
>> There are, however, some interesting statuses in the system state for
>> some of the harvest applications. For example, application high_11:
>>
>> -------------------------------------------------------------------
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	0
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Requested to check the validity of harvest channel 'FOCUSED'
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	1
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> close
>> INFO: Closed down HarvestControllerServer
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	2
>> Oct 29, 2014 10:43:01 AM dk.netarkivet.common.distribute.JMSConnection
>> removeListener
>> INFO: Removing listener from channel
>> 'TESTCRAWLER_COMMON_HCHAN_VAL_RESP'
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	3
>> Oct 29, 2014 10:43:01 AM dk.netarkivet.common.distribute.JMSConnection
>> removeListener
>> INFO: Removing listener from channel
>> 'TESTCRAWLER_COMMON_THIS_REPOS_CLIENT_127_0_1_1_HCS_HIGH_11'
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	4
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> close
>> INFO: Closing HarvestControllerServer.
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	5
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> visit
>> SEVERE: Received message stating that channel 'FOCUSED' is invalid.
>> Will stop.
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	6
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient
>> <init>
>> INFO: JMSArcRepository listens for replies on channel '[Queue
>> 'TESTCRAWLER_COMMON_THIS_REPOS_CLIENT_127_0_1_1_HCS_HIGH_11']'
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	7
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient
>> <init>
>> INFO: JMSArcRepositoryClient will retry a store 3 times and timeout on
>> each try after 3600000 milliseconds, and timeout on each getrequest
>> after 300000 milliseconds.
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	8
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Harvesting requires at least 400000000 bytes free.
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	9
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Serverdir: 'harvester_high_11'
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	10
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Bound to harvest channel 'FOCUSED'
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	11
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Starting HarvestControllerServer.
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	12
>> Oct 29, 2014 10:43:00 AM
>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>> getConnectionFactory
>> INFO: Establishing SunMQ JMS Connection to 'localhost:7676'
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	13
>> Oct 29, 2014 10:43:00 AM
>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>> <init>
>> INFO: Creating instance of
>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	14
>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.ApplicationUtils
>> logAndPrint
>> INFO:
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> Running
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	15
>> Oct 29, 2014 10:43:00 AM
>> dk.netarkivet.common.management.MBeanConnectorCreator
>> exposeJMXMBeanServer
>> INFO: Registered mbean server in registry on port 5111 communicating on
>> port 5211 using password file 'conf/jmxremote.password'.
>> Service URL is
>> service:jmx:rmi://veebiarhiiv.nlib.ee:5211/jndi/rmi://veebiarhiiv.nlib.ee:5111/jmxrmi
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	16
>> Oct 29, 2014 10:43:00 AM
>> dk.netarkivet.monitor.distribute.JMSMonitorRegistryClient
>> register
>> INFO: Registering this client for monitoring every 1 minutes, using
>> hostname 'veebiarhiiv.nlib.ee' and JMX/RMI ports 5111/5211
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	17
>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.ApplicationUtils
>> startApp
>> INFO: Using settings files
>> '/arhiiv/testcrawler/TESTCRAWLER/conf/settings_HarvestControllerApplication_high_11.xml'
>>
>> veebiarhiiv	HarvestControllerServer	high_11
>> 	FOCUSED	ReplicaA	18
>> Oct 29, 2014 10:42:56 AM dk.netarkivet.common.utils.ApplicationUtils
>> logAndPrint
>> INFO: Starting
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> Version: 4.4.0 status RELEASE
>> ----------------------------------------------------------------------
>>
>> And the logfile for the high_11 application:
>>
>> ----------------------------------------------------------------------
>>
>> Oct 29, 2014 10:42:57 AM dk.netarkivet.common.utils.Settings getAll
>> FINE: Searching for a setting for key:
>> settings.common.replicas.replica.replicaId
>> Oct 29, 2014 10:42:57 AM dk.netarkivet.common.utils.Settings getAll
>> FINE: Value found in loaded data: A
>> Oct 29, 2014 10:42:56 AM dk.netarkivet.common.utils.ApplicationUtils
>> logAndPrint
>> INFO: Starting
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> Version: 4.4.0 status RELEASE
>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.ApplicationUtils
>> startApp
>> INFO: Using settings files
>> '/arhiiv/testcrawler/TESTCRAWLER/conf/settings_HarvestControllerApplication_high_11.xml'
>> Oct 29, 2014 10:43:00 AM
>> dk.netarkivet.monitor.distribute.JMSMonitorRegistryClient register
>> INFO: Registering this client for monitoring every 1 minutes, using
>> hostname 'veebiarhiiv.nlib.ee' and JMX/RMI ports 5111/5211
>> Oct 29, 2014 10:43:00 AM
>> dk.netarkivet.common.management.MBeanConnectorCreator
>> exposeJMXMBeanServer
>> INFO: Registered mbean server in registry on port 5111 communicating on
>> port 5211 using password file 'conf/jmxremote.password'.
>> Service URL is
>> service:jmx:rmi://veebiarhiiv.nlib.ee:5211/jndi/rmi://veebiarhiiv.nlib.ee:5111/jmxrmi
>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.ApplicationUtils
>> logAndPrint
>> INFO:
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> Running
>> Oct 29, 2014 10:43:00 AM
>> dk.netarkivet.common.distribute.JMSConnectionSunMQ <init>
>> INFO: Creating instance of
>> dk.netarkivet.common.distribute.JMSConnectionSunMQ
>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.Settings getAll
>> FINE: Searching for a setting for key:
>> settings.common.topLevelDomains.tld
>> Oct 29, 2014 10:43:00 AM
>> dk.netarkivet.common.distribute.JMSConnectionSunMQ getConnectionFactory
>> INFO: Establishing SunMQ JMS Connection to 'localhost:7676'
>> Oct 29, 2014 10:43:00 AM dk.netarkivet.common.utils.Settings getAll
>> FINE: Value found in classpath data:
>> ac,ad,ae,aero,af,ag,ai,al,am,an,ao,aq,ar,arpa,as,gv.at,ac.at,or.at,
>> co.at,biz.at,info.at,priv.at,at,au,aw,ax,az,ba,bb,bd,be,bf,bg,bh,bi,
>> biz,bj,bm,bn,bo,br,bs,bt,bv,bw,by,bz,ca,cat,cc,cd,cf,cg,ch,ci,ck,cl,
>> cm,cn,co,com,coop,cr,cs,cu,cv,cx,cy,cz,de,dj,dk,dm,do,dz,ec,edu,ee,
>> eg,eh,er,es,et,eu,fi,fj,fk,fm,fo,aeroport.fr,asso.fr,avoues.fr,
>> chambagri.fr,com.fr,gouv.fr,medecin.fr,nom.fr,pharmacien.fr,port.fr,
>> prd.fr,presse.fr,tm.fr,fr,ga,gb,gd,ge,gf,gg,gh,gi,gl,gm,gn,gov,gp,gq,
>> gr,gs,gt,gu,gw,gy,hk,hm,hn,hr,ht,hu,id,ie,il,im,in,info,int,io,iq,ir,
>> is,it,je,jm,jo,jobs,jp,ke,kg,kh,ki,km,kn,kp,kr,kw,ky,kz,la,lb,lc,li,
>> lk,lr,ls,lt,lu,lv,ly,ma,mc,md,me,mg,mh,mil,mk,ml,mm,mn,mo,mobi,mp,mq,
>> mr,ms,mt,mu,museum,mv,mw,mx,my,mz,na,name,nc,ne,net,nf,ng,ni,nl,no,
>> np,nr,nt,nu,nz,om,org,pa,pe,pf,pg,ph,pk,pl,pm,pn,pr,pro,ps,pt,pw,py,
>> qa,asso.re,com.re,re,ro,ru,rw,sa,sb,sc,sd,se,sg,sh,si,sj,sk,sl,sm,sn,
>> so,sr,st,su,sv,sy,sz,tc,td,at.tf,net.tf,tf,tg,th,tj,tk,tl,tm,tn,to,
>> tp,tr,travel,tt,tv,tw,tz,ua,ug,ac.uk,co.uk,gov.uk,ltd.uk,me.uk,
>> mod.uk,net.uk,nic.uk,nhs.uk,org.uk,plc.uk,police.uk,sch.uk,govt.uk,
>> orgn.uk,lea.uk,mil.uk,nel.uk,uk,us,uy,uz,va,vc,ve,vg,vi,vn,vu,wien,
>> wf,ws,ye,yt,yu,za,zm,zw
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Starting HarvestControllerServer.
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Bound to harvest channel 'FOCUSED'
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Serverdir: 'harvester_high_11'
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Harvesting requires at least 400000000 bytes free.
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient
>> <init>
>> INFO: JMSArcRepositoryClient will retry a store 3 times and timeout on
>> each try after 3600000 milliseconds, and timeout on each getrequest
>> after 300000 milliseconds.
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient
>> <init>
>> INFO: JMSArcRepository listens for replies on channel '[Queue
>> 'TESTCRAWLER_COMMON_THIS_REPOS_CLIENT_127_0_1_1_HCS_HIGH_11']'
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> FINE: Obtained JMS connection.
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> visit
>> SEVERE: Received message stating that channel 'FOCUSED' is invalid.
>> Will stop.
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> close
>> INFO: Closing HarvestControllerServer.
>> Oct 29, 2014 10:43:01 AM dk.netarkivet.common.distribute.JMSConnection
>> removeListener
>> INFO: Removing listener from channel
>> 'TESTCRAWLER_COMMON_THIS_REPOS_CLIENT_127_0_1_1_HCS_HIGH_11'
>> Oct 29, 2014 10:43:01 AM dk.netarkivet.common.distribute.JMSConnection
>> removeListener
>> INFO: Removing listener from channel
>> 'TESTCRAWLER_COMMON_HCHAN_VAL_RESP'
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> close
>> INFO: Closed down HarvestControllerServer
>> Oct 29, 2014 10:43:01 AM
>> dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer
>> <init>
>> INFO: Requested to check the validity of harvest channel 'FOCUSED'
>>
>> ----------------------------------------------------------------------
>>
>>
>> The setup uses PostgreSQL with two databases:
>> * crawler (all the harvesting data)
>> * crawleradmin (all the admin-related data)
>>
>> There seem to be no connection errors related to the database. Data is
>> read from and written to it.
>>
>> Have I missed something while installing the software?
>>
>> Steps taken:
>>
>> 1. Installed and started MQ
>> 2. Created and modified the deploy XML to fit my needs (db info, 20
>> harvest applications)
>> 3. Installed the application
>> 4. Updated the database and created the indexes (according to the manual)
>> 5. Uploaded the deploy XML
>> 6. Started with the startall script
>>
>> I can access the web interface, add harvest definitions, and edit all
>> data that can be edited. Jobs, however, stop at status "new".
>>
>> To be honest ... I have no idea what to check next. Any help on this
>> issue is welcome :)
>>
>>
>>
>> _______________________________________________
>> NetarchiveSuite-users mailing list
>> NetarchiveSuite-users at ml.sbforge.org
>> http://ml.sbforge.org/mailman/listinfo/netarchivesuite-users
>

