[Netarchivesuite-devel] Problem running new Batch Job
Colin Samuel Rosenthal
csr at statsbiblioteket.dk
Wed Jul 8 10:44:55 CEST 2009
Yes! I just ran the same job as an ordinary BatchJob from the class file, and again it failed on the method.invoke() call, not on the
loadClass() call. So the problem has nothing to do with LoadableJarBatchJob at all.
--
Colin
________________________________________
From: netarchivesuite-devel-bounces at lists.gforge.statsbiblioteket.dk [netarchivesuite-devel-bounces at lists.gforge.statsbiblioteket.dk] On Behalf Of Colin Samuel Rosenthal [csr at statsbiblioteket.dk]
Sent: Wednesday, July 08, 2009 10:30 AM
To: Søren Vejrup Carlsen; netarchivesuite-devel at lists.gforge.statsbiblioteket.dk
Cc: Asger Blekinge-Rasmussen
Subject: Re: [Netarchivesuite-devel] Problem running new Batch Job
Asger and I spent some time looking at this yesterday. We're not much wiser, but we're stupider in a more interesting way :-)
We replaced the body of the initialise() method with:
    Class<?> urifactory = this.getClass().getClassLoader().loadClass(
            "org.archive.net.UURIFactory");
    Method method = urifactory.getMethod("getInstance", String.class);
    Object o = method.invoke(null, "http://foo.bar");
(surrounded by a try/catch which rethrows the checked exceptions as RuntimeExceptions).
The interesting thing is that the NoClassDefFoundError is thrown by the last line - the method invocation. So
the class is loaded, but cannot be linked/instantiated/initialised.
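To illustrate the distinction between loading and initialising, here is a minimal sketch. It assumes a hypothetical class, Fragile, whose static initialiser fails; it stands in for whatever goes wrong inside UURIFactory and is not taken from Heritrix. loadClass() and getMethod() succeed because neither runs the static initialiser. The first invocation is expected to fail with ExceptionInInitializerError, and later ones with NoClassDefFoundError, which on HotSpot JVMs carries the "Could not initialize class" message seen in the log.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class InitFailureDemo {

    /** Hypothetical stand-in for a class whose static initialiser fails. */
    public static class Fragile {
        // Simulates a static field whose setup needs something that is missing.
        static final Object DEPENDENCY = failToInitialise();

        private static Object failToInitialise() {
            throw new IllegalStateException("simulated missing dependency");
        }

        public static String getInstance(String uri) {
            return uri;
        }
    }

    /** Invokes the static method and reports the type of whatever it throws. */
    static String invokeAndDescribe(Method method) {
        try {
            method.invoke(null, "http://foo.bar");
            return "no error";
        } catch (InvocationTargetException e) {
            // Unwrap in case a JVM reports the failure through the reflective wrapper.
            return e.getCause().getClass().getSimpleName();
        } catch (Throwable t) {
            return t.getClass().getSimpleName();
        }
    }

    /** Returns what happened at each step, in order. */
    static List<String> run() {
        List<String> outcome = new ArrayList<>();
        try {
            // loadClass() loads the class but does not run its static initialiser.
            Class<?> fragile = InitFailureDemo.class.getClassLoader()
                    .loadClass(InitFailureDemo.class.getName() + "$Fragile");
            Method method = fragile.getMethod("getInstance", String.class);
            outcome.add("loaded");
            // First use triggers initialisation, which fails.
            outcome.add(invokeAndDescribe(method));
            // Later uses find the class already marked as unusable.
            outcome.add(invokeAndDescribe(method));
        } catch (ReflectiveOperationException e) {
            outcome.add("unexpected: " + e);
        }
        return outcome;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

If the real failure follows this pattern, the original cause would be attached to the first ExceptionInInitializerError, so the earliest error in the bitarchive log may be more informative than the later NoClassDefFoundError.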
I also tried printing the toString() of urifactory.getClassLoader() to see which ClassLoader was actually used, but this produced a security exception.
The Java ClassLoader delegation model means that the loadClass() method will first try to load the class from the parent of the current class loader. So
urifactory is presumably loaded by the system class loader, not the LoadableJarBatchJob loader. In fact it loads even if we remove heritrix.jar from
the RunBatch script. However, that means that any classes which UURIFactory in turn loads will not have access to the LoadableJar loader. I don't see why
that should matter, unless we are missing a dependency in the bitarchive application. Is that possible? If so, then the problem has nothing to do
with LoadableJarBatchJob and should be reproducible in any batch job.
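The parent-first delegation described above can be sketched as follows. This is only an illustration under stated assumptions: the child URLClassLoader with no URLs stands in for a job-specific loader such as the one in LoadableJarBatchJob, and DelegationDemo is a hypothetical class name. A class that the parent can already see is returned from the parent, so its defining loader is the parent and not the child.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

public class DelegationDemo {

    /**
     * Asks a child loader for a class its parent can already see and
     * reports whether the parent's copy was returned.
     */
    static boolean resolvedByParent() {
        ClassLoader parent = DelegationDemo.class.getClassLoader();
        // A child loader with no URLs of its own, standing in for a job-specific loader.
        try (URLClassLoader child = new URLClassLoader(new URL[0], parent)) {
            Class<?> viaChild = child.loadClass(DelegationDemo.class.getName());
            // Parent-first delegation: the defining loader is the parent, not the child.
            return viaChild == DelegationDemo.class
                    && viaChild.getClassLoader() != child;
        } catch (ClassNotFoundException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("resolved by parent: " + resolvedByParent());
    }
}
```

Because a class resolves its own references through its defining loader, a class defined by the parent cannot see classes that exist only in the child, which is the concern raised above.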
--
Colin
________________________________________
From: Søren Vejrup Carlsen [svc at kb.dk]
Sent: Tuesday, July 07, 2009 2:37 PM
To: Colin Samuel Rosenthal; netarchivesuite-devel at lists.gforge.statsbiblioteket.dk
Subject: Re: [Netarchivesuite-devel] Problem running new Batch Job
Hi Colin.
It would seem that there is a bug in the LoadableJarBatchJob class.
To determine whether this is the case, we need to add more logging in this class, specifically in the
LoadableJarBatchJob.ByteJarLoader#findClass() method, to see why it fails to look up the UURIFactory class.
/Søren
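As a rough sketch of the kind of logging suggested here, the loader below logs every findClass() call. It is not the actual ByteJarLoader: the class name LoggingByteLoader, the map of class names to bytes, and the getMisses() helper are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.logging.Logger;

public class LoggingByteLoader extends ClassLoader {
    private static final Logger LOG =
            Logger.getLogger(LoggingByteLoader.class.getName());

    // Assumed layout: binary class name -> class file bytes read from the jars.
    private final Map<String, byte[]> classBytes;
    private final List<String> misses = new ArrayList<>();

    public LoggingByteLoader(Map<String, byte[]> classBytes, ClassLoader parent) {
        super(parent);
        this.classBytes = new HashMap<>(classBytes);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        // Only reached after the parent loader has failed to find the class.
        byte[] bytes = classBytes.get(name);
        if (bytes == null) {
            misses.add(name);
            LOG.warning("findClass(" + name + "): not among the "
                    + classBytes.size() + " classes held by this loader");
            throw new ClassNotFoundException(name);
        }
        LOG.info("findClass(" + name + "): defining class from "
                + bytes.length + " bytes");
        return defineClass(name, bytes, 0, bytes.length);
    }

    /** Names that were requested from this loader but not found. */
    public List<String> getMisses() {
        return new ArrayList<>(misses);
    }

    public static void main(String[] args) {
        LoggingByteLoader loader = new LoggingByteLoader(
                new HashMap<>(), LoggingByteLoader.class.getClassLoader());
        try {
            loader.loadClass("org.example.NoSuchClass");
        } catch (ClassNotFoundException e) {
            System.out.println("misses: " + loader.getMisses());
        }
    }
}
```

If a class never shows up in this log at all, it was resolved by a parent loader, and findClass() is not where the lookup went wrong.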
-----Original Message-----
From: netarchivesuite-devel-bounces at lists.gforge.statsbiblioteket.dk [mailto:netarchivesuite-devel-bounces at lists.gforge.statsbiblioteket.dk] On Behalf Of Colin Rosenthal
Sent: 7 July 2009 10:41
To: netarchivesuite-devel at lists.gforge.statsbiblioteket.dk
Subject: [Netarchivesuite-devel] Problem running new Batch Job
Hi,
I'm testing a new ArcBatchJob class and I don't understand why I'm
getting a NoClassDefFoundError for org.archive.net.UURIFactory. I've
tried both with and without explicitly adding the heritrix jar as a
batch dependency, but it doesn't seem to make any difference. Here is
my command line and the output:
java -Ddk.netarkivet.settings.file=../settings_wayback_8080.xml
-Dsettings.common.applicationInstanceId=CDXIndexer8080 -cp
lib/dk.netarkivet.archive.jar dk.netarkivet.archive.tools.RunBatch
-Ndk.netarkivet.wayback.ExtractWaybackCDXBatchJob -R'.*'
-J/home/netarkiv/csr/batch/lib/dk.netarkivet.wayback.jar,/home/netarkiv/csr/batch/lib/wayback-core-1.4.0.jar,/home/netarkiv/csr/batch/lib/heritrix/lib/heritrix-1.14.3.jar
Jul 7, 2009 10:35:12 AM
dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient <init>
INFO: JMSArcRepositoryClient will retry a store 3 times and timeout on
each try after 3600000 milliseconds, and timeout on each getrequest
after 60000 milliseconds.
Jul 7, 2009 10:35:12 AM
dk.netarkivet.common.distribute.JMSConnectionSunMQ <init>
INFO: Creating instance of
dk.netarkivet.common.distribute.JMSConnectionSunMQ
Jul 7, 2009 10:35:12 AM dk.netarkivet.common.distribute.JMSConnection
initConnection
INFO: Initializing a JMS connection of type 'class
dk.netarkivet.common.distribute.JMSConnectionSunMQ' to Broker at
kb-test-adm-001.kb.dk:7676.
Jul 7, 2009 10:35:13 AM
dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient <init>
INFO: JMSArcRepository listens for replies on channel '[Queue
'PLIGT_COMMON_THIS_REPOS_CLIENT_130_225_27_142_NA_CDXINDEXER8080']'
Jul 7, 2009 10:35:13 AM
dk.netarkivet.common.utils.batch.LoadableJarBatchJob <init>
INFO: Loading loadableJarBatchJob using jarfiles:
dk.netarkivet.wayback.jarwayback-core-1.4.0.jarheritrix-1.14.3.jar and
jobclass 'dk.netarkivet.wayback.ExtractWaybackCDXBatchJob
Running batch job 'dk.netarkivet.wayback.ExtractWaybackCDXBatchJob from
jar-file
'/home/netarkiv/csr/batch/lib/dk.netarkivet.wayback.jar,/home/netarkiv/csr/batch/lib/wayback-core-1.4.0.jar,/home/netarkiv/csr/batch/lib/heritrix/lib/heritrix-1.14.3.jar'
on files matching '.*' on replica 'SBN', output written to stdout errors
written to stderr
Jul 7, 2009 10:35:15 AM
dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient batch
WARNING: The batch job
'ID:15-130.225.27.142(ff:40:86:d7:73:db)-55674-1246955713422: To
PLIGT_COMMON_THE_REPOS ReplyTo
PLIGT_COMMON_THIS_REPOS_CLIENT_130_225_27_142_NA_CDXINDEXER8080 OK Job:
dk.netarkivet.common.utils.batch.LoadableJarBatchJob processing
dk.netarkivet.wayback.ExtractWaybackCDXBatchJob from
dk.netarkivet.common.utils.batch.LoadableJarBatchJob$ByteJarLoader at ba6c83'
resulted in the following error: java.lang.NoClassDefFoundError: Could
not initialize class org.archive.net.UURIFactory
java.lang.NoClassDefFoundError: Could not initialize class org.archive.net.UURIFactory
    at org.archive.wayback.util.url.AggressiveUrlCanonicalizer.urlStringToKey(AggressiveUrlCanonicalizer.java:234)
    at org.archive.wayback.resourcestore.indexer.ARCRecordToSearchResultAdapter.adaptInner(ARCRecordToSearchResultAdapter.java:140)
    at org.archive.wayback.resourcestore.indexer.ARCRecordToSearchResultAdapter.adapt(ARCRecordToSearchResultAdapter.java:65)
    at dk.netarkivet.wayback.ExtractWaybackCDXBatchJob.processRecord(ExtractWaybackCDXBatchJob.java:64)
    at dk.netarkivet.common.utils.arc.ARCBatchJob.processFile(ARCBatchJob.java:142)
    at dk.netarkivet.common.utils.batch.LoadableJarBatchJob.processFile(LoadableJarBatchJob.java:213)
    at dk.netarkivet.common.utils.batch.BatchLocalFiles.processFile(BatchLocalFiles.java:115)
    at dk.netarkivet.common.utils.batch.BatchLocalFiles.run(BatchLocalFiles.java:82)
    at dk.netarkivet.archive.bitarchive.Bitarchive.batch(Bitarchive.java:240)
    at dk.netarkivet.archive.bitarchive.distribute.BitarchiveServer$1.run(BitarchiveServer.java:396)
Processed 0 files with 0 failures
Cleaning up dk.netarkivet.common.distribute.JMSConnectionSunMQ
Cleaned up dk.netarkivet.common.distribute.JMSConnectionSunMQ
--
Colin
_______________________________________________
Netarchivesuite-devel mailing list
Netarchivesuite-devel at lists.gforge.statsbiblioteket.dk
https://lists.gforge.statsbiblioteket.dk/mailman/listinfo/netarchivesuite-devel