[Netarchivesuite-users] Problems With The IndexServerAplication + WaybackIndexer DB

Mikis Seth Sørensen mss at statsbiblioteket.dk
Thu Dec 11 16:12:59 CET 2014


Hi Charles

I don’t think it is the deployment file that causes this problem. I’ve compared the standard deploy_standalone_example.xml file we test in the quickstart with the deploy_standalone_example_with_wayback_apps.xml you have based the deployment on, and the only differences in the non-wayback part were the 2 problems you have encountered.

I’ll ask one of the other guys here if he has an idea, and get back to you.

Best
Mikis

From: Charles Tassell <ctassell at gmail.com>
Reply-To: "netarchivesuite-users at ml.sbforge.org" <netarchivesuite-users at ml.sbforge.org>
Date: Wednesday, December 10, 2014 at 3:54 PM
To: "netarchivesuite-users at ml.sbforge.org" <netarchivesuite-users at ml.sbforge.org>
Subject: Re: [Netarchivesuite-users] Problems With The IndexServerAplication + WaybackIndexer DB

Thanks, I fixed that up and ran a job, but after the crawl finished and the .warc was created, the next step seemed to die. The BitarchiveMonitorApplication0.log file says that it can't find the .warc file (although I have confirmed that it's there). The log says:

10-Dec-2014 11:14:24 AM dk.netarkivet.archive.bitarchive.distribute.BitarchiveMonitorServer replyToGetChecksumMessage
INFO: Replying GetChecksumMessage: 'ID:1795-137.149.200.20(a0:54:f2:7b:7:3c)-38167-1418224464362:
To ROBLIB_COMMON_THE_REPOS ReplyTo ROBLIB_COMMON_THIS_REPOS_CLIENT_137_149_200_20_GUIWS
Error: dk.netarkivet.common.exceptions.IOFailure: The batchjob did not find the file '1-1-20141210141254-00000-webarchive.upei.ca.warc' within the archive.
dk.netarkivet.common.exceptions.IOFailure: The batchjob did not find the file '1-1-20141210141254-00000-webarchive.upei.ca.warc' within the archive.
    at dk.netarkivet.archive.bitarchive.distribute.BitarchiveMonitorServer.replyToGetChecksumMessage(BitarchiveMonitorServer.java:733)
    at dk.netarkivet.archive.bitarchive.distribute.BitarchiveMonitorServer.replyConvertedBatch(BitarchiveMonitorServer.java:641)
    at dk.netarkivet.archive.bitarchive.distribute.BitarchiveMonitorServer.access$200(BitarchiveMonitorServer.java:81)
    at dk.netarkivet.archive.bitarchive.distribute.BitarchiveMonitorServer$2.run(BitarchiveMonitorServer.java:535)
 Arcfiles: 1-1-20141210141254-00000-webarchive.upei.ca.warc, ReplicaId: A, Checksum: null'.

But the file does exist: ./harvester_high/1_1418220769306/warcs/1-1-20141210141254-00000-webarchive.upei.ca.warc
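
One way to double-check where copies of the file live under the installation directory is a small search like the sketch below. The function name `find_file` and the root directory in the usage note are illustrative assumptions, not part of NetarchiveSuite. The sketch only reports locations; it does not say which directory the bitarchive actually reads from.

```python
import os
from typing import List


def find_file(root: str, filename: str) -> List[str]:
    """Return every path under root whose basename equals filename."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if filename in filenames:
            matches.append(os.path.join(dirpath, filename))
    # Sorted so the output is stable between runs.
    return sorted(matches)
```

For example, `find_file(".", "1-1-20141210141254-00000-webarchive.upei.ca.warc")` run from the installation directory (assumed here to be the current directory) lists every directory that holds a copy.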

  Is this another broken path in the deployment file?  Is there a better deployment file that I can use which installs the full suite (harvester, indexer and viewer) that is known to work?


On 14-12-10 10:07 AM, Mikis Seth Sørensen wrote:
Hi Charles

The application classes are defined in the deployment xml file. I can see that in 'deploy_standalone_example_with_wayback_apps.xml' the IndexServerApplication namespace is wrong: it is missing the harvester part, as you have noted (deploy_standalone_example.xml has the correct setting).

Try changing the line
<applicationName name="dk.netarkivet.archive.indexserver.IndexServerApplication">
to
<applicationName name="dk.netarkivet.harvester.indexserver.IndexServerApplication">
in your deploy xml and run the script generation and deployment again.
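
A minimal sketch of that edit as a script, assuming the deployment file is UTF-8 text that contains the old class name verbatim. The function names `fix_index_server_class` and `fix_deploy_file` are hypothetical helpers, not part of NetarchiveSuite.

```python
from pathlib import Path

# Class names taken from the thread; only the package segment differs.
OLD_CLASS = "dk.netarkivet.archive.indexserver.IndexServerApplication"
NEW_CLASS = "dk.netarkivet.harvester.indexserver.IndexServerApplication"


def fix_index_server_class(xml_text: str) -> str:
    """Return xml_text with the old IndexServerApplication class name replaced."""
    return xml_text.replace(OLD_CLASS, NEW_CLASS)


def fix_deploy_file(path: str) -> bool:
    """Rewrite the file in place; return True if anything changed.

    Assumes the file is UTF-8 encoded (an assumption, not confirmed here).
    """
    p = Path(path)
    original = p.read_text(encoding="utf-8")
    updated = fix_index_server_class(original)
    if updated != original:
        p.write_text(updated, encoding="utf-8")
        return True
    return False
```

Calling `fix_deploy_file("deploy_standalone_example_with_wayback_apps.xml")` would patch the file; the script generation and deployment still have to be rerun afterwards.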

Best
Mikis

From: Charles Tassell <charles at islandadmin.ca>
Reply-To: "netarchivesuite-users at ml.sbforge.org" <netarchivesuite-users at ml.sbforge.org>
Date: Wednesday, December 10, 2014 at 2:23 PM
To: "netarchivesuite-users at ml.sbforge.org" <netarchivesuite-users at ml.sbforge.org>
Subject: Re: [Netarchivesuite-users] Problems With The IndexServerAplication + WaybackIndexer DB

Sorry, did some grepping and found the comments in the deployment file for how to create the Wayback database, so that is sorted out.  I'm still wondering about the IndexServerApplication path though.

On 14-12-10 09:06 AM, Charles Tassell wrote:
Hi Guys,

  I'm still having some issues with getting a fresh 4.4.1 install going.  There seem to be two issues left after fixing the queue names in the deployment file.

  First, when I try to start the IndexServerApplication I get the following error message:

Exception in thread "main" java.lang.NoClassDefFoundError: dk/netarkivet/archive/indexserver/IndexServerApplication
Caused by: java.lang.ClassNotFoundException: dk.netarkivet.archive.indexserver.IndexServerApplication
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
Could not find the main class: dk.netarkivet.archive.indexserver.IndexServerApplication.  Program will exit.

I did some digging, and it looks like the actual class path is dk.netarkivet.harvester.indexserver.IndexServerApplication. Is that correct, or are the harvester and archive IndexServerApplications different classes?
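
This kind of digging can be scripted by scanning the installed jar files for the class entry. The sketch below is only an illustration: `find_class_in_jars` is a hypothetical helper, and the `lib` directory in the usage note is an assumed location for the jars.

```python
import zipfile
from pathlib import Path
from typing import List, Tuple


def find_class_in_jars(lib_dir: str, simple_name: str) -> List[Tuple[str, str]]:
    """Return (jar path, entry name) for each .class entry named simple_name.

    Only classes inside a package are matched, since the search looks for
    '/<simple_name>.class' at the end of the entry name.
    """
    suffix = "/" + simple_name + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).rglob("*.jar")):
        with zipfile.ZipFile(jar) as zf:
            for entry in zf.namelist():
                if entry.endswith(suffix):
                    hits.append((str(jar), entry))
    return hits
```

For example, `find_class_in_jars("lib", "IndexServerApplication")` returns the jar and the full entry path, which shows the package the class really lives in.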

  Secondly, the WaybackIndexer does not seem to be able to connect to the database at port 8124.  It looks like the installer script doesn't create the derby instance for the WaybackIndexer.  Are there any docs on how to do that manually?
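
A quick way to confirm whether anything is listening on that port is a plain TCP connection attempt, sketched below. The helper name `port_is_open` and the host `localhost` in the usage note are assumptions. A successful connection only shows that some process accepts connections on the port, not that it is the Derby instance.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

For example, `port_is_open("localhost", 8124)` returning False would be consistent with the database server never having been started.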





_______________________________________________
NetarchiveSuite-users mailing list
NetarchiveSuite-users at ml.sbforge.org
http://ml.sbforge.org/mailman/listinfo/netarchivesuite-users
