[Netarchivesuite-devel] Multiple jobs submitted simultaneously under 5.3.1

sara.aubry at bnf.fr sara.aubry at bnf.fr
Mon Nov 13 15:30:25 CET 2017


Hi everyone,

It seems that we have run into the same issue as Andreas with 5.3.1.

For the second time in a week, we had an issue with the job generation 
process of our daily crawls. 
Two jobs with the same ID were generated within the same second causing 
the harvest to be deactivated after the second run.
As it happened over the weekend, it took us a while to notice the problem 
and we missed a few daily crawls...
We don't understand why this problem is coming up now although we have 
been using 5.3.1 since July...

Was there any reason to change the job generation code from 5.3.0?

Sara

10:02:04.577 INFO  d.n.h.scheduler.HarvestJobGenerator - Starting to 
create jobs for harvest definition #28(BnF actualites quotidienne micro)
10:02:04.578 INFO  d.n.h.s.jobgen.AbstractJobGenerator - Generating jobs 
for harvestdefinition #28
10:02:04.672 INFO  d.n.h.datamodel.H3HeritrixTemplate - Inserting 681 
crawlertraps into the template
[...]
10:02:04.718 WARN  d.n.h.datamodel.HeritrixTemplate - Found empty trap for 
domain
10:02:04.718 INFO  d.n.h.s.jobgen.AbstractJobGenerator - Finished 
generating 0 jobs for harvestdefinition #28
10:02:05.630 INFO  d.n.h.scheduler.HarvestJobGenerator - Starting to 
create jobs for harvest definition #28(BnF actualites quotidienne micro)
10:02:05.630 INFO  d.n.h.s.jobgen.AbstractJobGenerator - Generating jobs 
for harvestdefinition #28
10:02:06.107 INFO  d.n.h.scheduler.HarvestJobGenerator - Created 1 jobs 
for harvest definition (BnF actualites quotidienne micro)
10:02:06.108 WARN  d.n.harvester.datamodel.JobDBDAO - The jobId for the 
job is already set. This should probably never happen.
10:02:06.109 WARN  d.n.harvester.datamodel.JobDBDAO - The creation time 
for the job is already set. This should probably never happen.
10:02:06.126 WARN  d.n.harvester.datamodel.JobDBDAO - SQL error creating 
job Job 24913 (state = NEW, HD = 28, channel = CIBLEE, snapshot = false, 
forcemaxcount = 10000, forcemaxbytes = -1, forcemaxrunningtime = 0, 
orderxml = page+1actu, numconfigs = 86, created = Sat Nov 11 10:02:05 CET 
2017) in database
SQLException trace:
SQL State:23505
Error Code:0
org.postgresql.util.PSQLException: ERROR: duplicate key value violates 
unique constraint "jobs_pkey"
  Detail: Key (job_id)=(24913) already exists.
        at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157)
        at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886)
        at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555)
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417)
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:363)
        at 
com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeUpdate(NewProxyPreparedStatement.java:147)
        at 
dk.netarkivet.harvester.datamodel.JobDBDAO.create(JobDBDAO.java:154)
        at 
dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator.processDomainConfigurationSubset(FixedDomainConfigurationCountJobGenerator.java:237)
        at 
dk.netarkivet.harvester.scheduler.jobgen.AbstractJobGenerator.generateJobs(AbstractJobGenerator.java:96)
        at 
dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator.generateJobs(FixedDomainConfigurationCountJobGenerator.java:186)
        at 
dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask$JobGeneratorThread.run(HarvestJobGenerator.java:236)
End of SQLException trace
org.postgresql.util.PSQLException: ERROR: duplicate key value violates 
unique constraint "jobs_pkey"
  Detail: Key (job_id)=(24913) already exists.
        at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2157) 
~[postgresql-9.2-1003-jdbc4.jar:na]
        at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1886) 
~[postgresql-9.2-1003-jdbc4.jar:na]
        at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255) 
~[postgresql-9.2-1003-jdbc4.jar:na]
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:555) 
~[postgresql-9.2-1003-jdbc4.jar:na]
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:417) 
~[postgresql-9.2-1003-jdbc4.jar:na]
        at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:363) 
~[postgresql-9.2-1003-jdbc4.jar:na]
        at 
com.mchange.v2.c3p0.impl.NewProxyPreparedStatement.executeUpdate(NewProxyPreparedStatement.java:147) 
~[c3p0-0.9.2.1.jar:0.9.2.1]
        at 
dk.netarkivet.harvester.datamodel.JobDBDAO.create(JobDBDAO.java:154) 
~[harvester-core-5.3.1.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
        at 
dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator.processDomainConfigurationSubset(FixedDomainConfigurationCountJobGenerator.java:237) 
[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
        at 
dk.netarkivet.harvester.scheduler.jobgen.AbstractJobGenerator.generateJobs(AbstractJobGenerator.java:96) 
[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
        at 
dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator.generateJobs(FixedDomainConfigurationCountJobGenerator.java:186) 
[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
        at 
dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask$JobGeneratorThread.run(HarvestJobGenerator.java:236) 
[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
10:02:07.056 WARN  d.n.h.scheduler.HarvestJobGenerator - Exception while 
scheduling harvestdefinition #28(BnF actualites quotidienne micro). The 
harvestdefinition has been deactivated!
java.util.NoSuchElementException: No job generation state for harvest 28
        at 
dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator.dropStateForHarvest(FixedDomainConfigurationCountJobGenerator.java:292) 
~[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
        at 
dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator.generateJobs(FixedDomainConfigurationCountJobGenerator.java:203) 
~[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
        at 
dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask$JobGeneratorThread.run(HarvestJobGenerator.java:236) 
~[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
10:02:07.071 ERROR d.n.common.utils.EMailNotifications - Mailing 
NetarchiveSuite-ERROR: Exception while scheduling harvestdefinition 
#28(BnF actualites quotidienne micro). The harvestdefinition has been 
deactivated!
java.util.NoSuchElementException: No job generation state for harvest 28
        at 
dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator.dropStateForHarvest(FixedDomainConfigurationCountJobGenerator.java:292) 
~[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
        at 
dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator.generateJobs(FixedDomainConfigurationCountJobGenerator.java:203) 
~[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
        at 
dk.netarkivet.harvester.scheduler.HarvestJobGenerator$JobGeneratorTask$JobGeneratorThread.run(HarvestJobGenerator.java:236) 
~[harvest-scheduler-5.3.1-BNF.jar:c5b46da8634b369cb839be3e0e7e215ed1b7dd98]
10:04:19.344 INFO  d.n.h.scheduler.JobDispatcher - Added 
duplicateReductionMetadataEntry metadataEntry for job 24913
10:04:19.349 INFO  d.n.h.scheduler.JobDispatcher - As we're using WARC as 
archiveFormat WarcInfoMetadata is now added to the template
10:04:19.349 INFO  d.n.h.datamodel.H3HeritrixTemplate - Adding 
WarcInfoMetadata <property name="metadataItems">
<map>
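
The log above shows job generation for harvest definition #28 being started twice, about one second apart, followed by a duplicate-key error on job 24913. The thread does not establish whether the two runs actually overlapped or what the root cause is. Assuming they can overlap, the following is a minimal sketch of one possible guard that lets only one generation task per harvest definition run at a time. The class `GenerationGuard` and its method `runExclusively` are hypothetical names for illustration and are not part of NetarchiveSuite.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical guard for illustration; not part of NetarchiveSuite. */
public class GenerationGuard {
    // IDs of harvest definitions whose jobs are currently being generated.
    private final Set<Long> inProgress = ConcurrentHashMap.newKeySet();

    /**
     * Runs the task only if no other task for the same harvest definition
     * is currently active.
     *
     * @return true if the task ran, false if it was skipped
     */
    public boolean runExclusively(long harvestId, Runnable task) {
        // add() is atomic, so exactly one caller wins for a given ID.
        if (!inProgress.add(harvestId)) {
            return false;
        }
        try {
            task.run();
            return true;
        } finally {
            // Always release the ID, even if the task throws.
            inProgress.remove(harvestId);
        }
    }

    public static void main(String[] args) {
        GenerationGuard guard = new GenerationGuard();
        boolean outer = guard.runExclusively(28L, () -> {
            // A second attempt for the same harvest while the first is
            // still active is skipped.
            boolean nested = guard.runExclusively(28L, () -> { });
            System.out.println("nested attempt ran: " + nested);
        });
        System.out.println("outer attempt ran: " + outer);
    }
}
```

The guard skips the second attempt instead of queueing it, on the assumption that the next scheduler pass would pick the harvest up again anyway.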







From:    Colin Samuel Rosenthal <csr at kb.dk>
To:      "netarchivesuite-devel at ml.sbforge.org" 
<netarchivesuite-devel at ml.sbforge.org>
Date:    08/11/2017 12:56
Subject: Re: [Netarchivesuite-devel] Multiple jobs submitted simultaneously 
under 5.3.1
Sent by: "Netarchivesuite-devel" 
<netarchivesuite-devel-bounces at ml.sbforge.org>



I've created an issue https://sbforge.org/jira/browse/NAS-2682 for this 
and I have some suspicions about whose code might be responsible for the 
problem, although right now I can't see anything obviously wrong.

--
Colin Rosenthal PhD
Senior IT Consultant
Royal Danish Library (Aarhus)

From: Netarchivesuite-devel <netarchivesuite-devel-bounces at ml.sbforge.org> 
on behalf of sara.aubry at bnf.fr <sara.aubry at bnf.fr>
Sent: Tuesday, July 11, 2017 11:18:10 AM
To: netarchivesuite-devel at ml.sbforge.org
Subject: Re: [Netarchivesuite-devel] Multiple jobs submitted 
simultaneously under 5.3.1 
 
Hi everyone,

Just a quick note to let you know that we launched a broad crawl 
test with 5.3.1 at the end of last week.
Everything went smoothly: we generated 872 jobs, ran 20 of them using 10 
crawlers, the job statuses are consistent,
and there is nothing wrong with the broker.

We have the following configuration:
-  CentOS 7.3 (which seems to be similar to Red Hat 4.8)
- Java(TM) SE Runtime Environment (build 1.8.0_40-b25)  64-Bit
- OpenMQ (MessageQueue5.1)


Perhaps more importantly, we are using this configuration on the scheduler:
            <scheduler>
                <jobtimeouttime>31536000</jobtimeouttime>
                <jobgenerationperiode>60</jobgenerationperiode>
                <jobGen>
                    <class>dk.netarkivet.harvester.scheduler.jobgen.FixedDomainConfigurationCountJobGenerator</class>
                    <objectLimitIsSetByQuotaEnforcer>false</objectLimitIsSetByQuotaEnforcer>
                    <domainConfigSubsetSize>5000</domainConfigSubsetSize>
                    <config>
                        <fixedDomainCountSnapshot>5000</fixedDomainCountSnapshot>
                        <fixedDomainCountFocused>500</fixedDomainCountFocused>
                        <excludeDomainsWithZeroBudget>true</excludeDomainsWithZeroBudget>
                        <postponeUnregisteredChannel>false</postponeUnregisteredChannel>
                    </config>
                </jobGen>
            </scheduler>

If I remember correctly, at KB and ONB you are using a different job 
generator that tries to make homogeneous job sizes based
on the previous harvest. The one we are using builds jobs by taking the 
domains in alphabetical order.

Hope this helps,
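
As a rough illustration of the "domains in alphabetical order" behaviour described above, the sketch below sorts domain names and cuts them into consecutive groups of a fixed maximum size. It assumes the `fixedDomainCount*` settings act as a per-job cap, which the setting names suggest but the thread does not confirm. `AlphabeticalChunker`, `chunk` and the `*.example` domain names are made up for illustration; this is not the FixedDomainConfigurationCountJobGenerator implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Illustrative only; not the NetarchiveSuite job generator. */
public class AlphabeticalChunker {

    /**
     * Sorts the domain names alphabetically and splits them into
     * consecutive groups of at most maxPerJob entries.
     */
    public static List<List<String>> chunk(List<String> domains, int maxPerJob) {
        if (maxPerJob <= 0) {
            throw new IllegalArgumentException("maxPerJob must be positive");
        }
        // Copy first so the caller's list is left untouched.
        List<String> sorted = new ArrayList<>(domains);
        Collections.sort(sorted);

        List<List<String>> jobs = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i += maxPerJob) {
            int end = Math.min(i + maxPerJob, sorted.size());
            jobs.add(new ArrayList<>(sorted.subList(i, end)));
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<String> domains = Arrays.asList(
                "c.example", "a.example", "e.example", "b.example", "d.example");
        // prints [[a.example, b.example], [c.example, d.example], [e.example]]
        System.out.println(chunk(domains, 2));
    }
}
```

With a cap of 500 and 1,200 domains, this scheme would yield three groups of 500, 500 and 200 domains.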

Sara



From:    <aponb at gmx.at>
To:      <netarchivesuite-devel at ml.sbforge.org>
Date:    29/06/2017 11:13
Subject: Re: [Netarchivesuite-devel] Multiple jobs submitted 
simultaneously under 5.3.1
Sent by: Netarchivesuite-devel 
<netarchivesuite-devel-bounces at ml.sbforge.org>



Hi Sara,

I forgot to mention that the problems were coming up with our daily 
crawls. The intention was to deploy 5.3.1, waiting for some daily crawls, 
before starting the broad crawl.

Thanks for your settings and for telling us how your broad crawl will work!

Hi Andreas,

Are your problems coming up because you just launched a broad crawl?

At BnF, we are still running 5.3.0 with the default values for these 
parameters:
settings.harvester.harvesting.sendReadyInterval: 30s 
settings.harvester.harvesting.sendReadyDelay: 1000ms

We are currently testing 5.3.1 on very small crawls (working well)
and we will start bigger crawls next week. I'll let you know
how it goes.

Sara 




From:    <aponb at gmx.at>
To:      <netarchivesuite-devel at ml.sbforge.org>
Date:    28/06/2017 11:43
Subject: [Netarchivesuite-devel] Multiple jobs submitted 
simultaneously under 5.3.1
Sent by: Netarchivesuite-devel 
<netarchivesuite-devel-bounces at ml.sbforge.org>



I was running NAS version 5.3.1 in production and got a huge number of 
jobs with the same configurations submitted. This must be the behavior 
described in https://sbforge.org/jira/browse/NAS-2614, which was fixed for 
version 5.3.1. The strange thing is that I did not have any problems in 
version 5.3.0.

Is anyone experiencing the same issue?
As suggested, I set settings.harvester.harvesting.sendReadyInterval to 
300, and I am using settings.harvester.harvesting.sendReadyDelay with a 
value of 300.

Also, the HarvestJobManagerApplication dies with an OutOfMemoryError, 
even when started with the parameter -Xmx4096m.

20:28:11.823 ERROR d.n.c.lifecycle.PeriodicTaskExecutor - Task threw 
exception: java.lang.OutOfMemoryError: Java heap space
java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: 
Java heap space
        at java.util.concurrent.FutureTask.report(FutureTask.java:122) 
~[na:1.8.0_77]
        at java.util.concurrent.FutureTask.get(FutureTask.java:192) 
~[na:1.8.0_77]
        at 
dk.netarkivet.common.lifecycle.PeriodicTaskExecutor.checkExecution(PeriodicTaskExecutor.java:171) 
[common-core-5.3.1.jar:UNKNOWN_REVISION]
        at 
dk.netarkivet.common.lifecycle.PeriodicTaskExecutor.access$500(PeriodicTaskExecutor.java:47) 
[common-core-5.3.1.jar:UNKNOWN_REVISION]
        at 
dk.netarkivet.common.lifecycle.PeriodicTaskExecutor$1.run(PeriodicTaskExecutor.java:152) 
[common-core-5.3.1.jar:UNKNOWN_REVISION]
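
When an OutOfMemoryError appears despite a large -Xmx value, one quick check is whether the flag actually reached the JVM. The small program below prints the maximum heap the running JVM reports. It is a generic sketch using the standard `Runtime.maxMemory()` call; the class name `HeapCheck` is made up here and it is not a NetarchiveSuite tool.

```java
/** Generic heap check for illustration; not part of NetarchiveSuite. */
public class HeapCheck {

    /** Maximum heap size in mebibytes, as reported by the running JVM. */
    public static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Run with the same options as the application, for example `java -Xmx4096m HeapCheck`. The reported figure can be somewhat below the -Xmx value depending on the garbage collector, so a value close to 4096 would suggest the flag took effect.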

Do you have any thoughts on this?
Regards
a.

_______________________________________________
Netarchivesuite-devel mailing list
Netarchivesuite-devel at ml.sbforge.org
https://ml.sbforge.org/mailman/listinfo/netarchivesuite-devel


Exhibitions:
Le monde selon Topor - until 16 July 2017 - BnF - François-Mitterrand
La bibliothèque, la nuit – Bibliothèques mythiques en réalité virtuelle - 
until 13 August 2017 - BnF - François-Mitterrand
Please consider the environment before printing.



