[Netarchivesuite-users] Nothing happens after starting generating dedupcrawllogindex
aponb at gmx.at
Tue May 26 11:48:00 CEST 2009
Hi Jonas!
I have now changed my configuration to the attached settings. I switched to
the FTPRemoteFile class and gave each HarvesterApplication a different
remotefile port number.
With these settings the system is not able to start the crawler:
FINE: Successfully received an index of type 'DEDUP_CRAWL_LOG' for the jobs [82]
May 19, 2009 4:57:08 PM dk.netarkivet.common.distribute.FTPRemoteFile logOn
FINE: Logged onto ftp://netarchive:**********@wc05:21
May 19, 2009 4:57:08 PM dk.netarkivet.common.distribute.FTPRemoteFile cleanup
FINE: Deleting file 'segments.gz-54974-1242745028067' from ftp server
May 19, 2009 4:57:08 PM dk.netarkivet.common.distribute.FTPRemoteFile logOn
FINE: Logged onto ftp://netarchive:**********@wc05:21
May 19, 2009 4:57:08 PM dk.netarkivet.common.distribute.FTPRemoteFile cleanup
FINE: Deleting file 'segments.gz-54974-1242745028067' from ftp server
May 19, 2009 4:57:08 PM dk.netarkivet.common.distribute.FTPRemoteFile logOn
FINE: Logged onto ftp://netarchive:**********@wc05:21
May 19, 2009 4:57:08 PM dk.netarkivet.archive.indexserver.FileBasedCache cache
FINE: release lock on filechannel sun.nio.ch.FileChannelImpl@1ef7de4
May 19, 2009 4:57:08 PM dk.netarkivet.archive.indexserver.FileBasedCache getIndex
INFO: Generated index '/home/netarchive/data/netarchivesuite/cache/DEDUP_CRAWL_LOG/82-cache' of id '[82]', request was for '[82]'
May 19, 2009 4:57:08 PM dk.netarkivet.harvester.harvesting.HeritrixFiles setIndexDir
FINE: Setting deduplication index dir '/home/netarchive/data/netarchivesuite/cache/DEDUP_CRAWL_LOG/82-cache'
May 19, 2009 4:57:08 PM dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer$HarvesterThread run
INFO: Starting crawl of job : 83
May 19, 2009 4:57:08 PM dk.netarkivet.harvester.harvesting.HeritrixFiles writeOrderXml
FINE: Writing order-file to disk as file: /home/netarchive/apps/netarchivesuite/ONB/8803harvester/83_1242745027376/order.xml
May 19, 2009 4:57:08 PM dk.netarkivet.harvester.harvesting.JMXHeritrixController <init>
INFO: Starting Heritrix for job 83 of harvest 27 in 8803harvester/83_1242745027376
May 19, 2009 4:57:08 PM dk.netarkivet.harvester.harvesting.JMXHeritrixController getJMXAdminName
FINE: The JMX username used for connecting to the Heritrix GUI is: 'admin'.
May 19, 2009 4:57:09 PM dk.netarkivet.common.utils.JMXUtils executeCommand
FINE: Preparing to execute completedJobs with args [] on org.archive.crawler:name=Heritrix,type=CrawlService,jmxport=7003,guiport=8803,host=webcrawler06.onb.ac.at
May 19, 2009 4:59:20 PM dk.netarkivet.harvester.harvesting.JMXHeritrixController getJMXAdminName
FINE: The JMX username used for connecting to the Heritrix GUI is: 'admin'.
May 19, 2009 4:59:20 PM dk.netarkivet.common.utils.JMXUtils executeCommand
FINE: Preparing to execute shutdown with args [] on org.archive.crawler:name=Heritrix,type=CrawlService,jmxport=7003,guiport=8803,host=webcrawler06.onb.ac.at
May 19, 2009 5:01:31 PM dk.netarkivet.harvester.harvesting.JMXHeritrixController cleanup
SEVERE: JMX error while cleaning up Heritrix controller
dk.netarkivet.common.exceptions.IOFailure: Failed to find MBean org.archive.crawler:name=Heritrix,type=CrawlService,jmxport=7003,guiport=8803,host=webcrawler06.onb.ac.at for executing shutdown after 17 attempts
    at dk.netarkivet.common.utils.JMXUtils.executeCommand(JMXUtils.java:262)
    at dk.netarkivet.common.utils.JMXUtils.executeCommand(JMXUtils.java:426)
    at dk.netarkivet.harvester.harvesting.JMXHeritrixController.executeHeritrixCommand(JMXHeritrixController.java:852)
    at dk.netarkivet.harvester.harvesting.JMXHeritrixController.cleanup(JMXHeritrixController.java:505)
    at dk.netarkivet.harvester.harvesting.HeritrixLauncher.doCrawl(HeritrixLauncher.java:200)
    at dk.netarkivet.harvester.harvesting.HarvestController.runHarvest(HarvestController.java:221)
    at dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer$HarvesterThread.run(HarvestControllerServer.java:650)
Caused by: javax.management.InstanceNotFoundException: org.archive.crawler:name=Heritrix,type=CrawlService,jmxport=7003,guiport=8803,host=webcrawler06.onb.ac.at
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1094)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getClassLoaderFor(DefaultMBeanServerInterceptor.java:1438)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getClassLoaderFor(JmxMBeanServer.java:1276)
    at com.sun.jmx.remote.security.MBeanServerAccessController.getClassLoaderFor(MBeanServerAccessController.java:313)
    at javax.management.remote.rmi.RMIConnectionImpl$5.run(RMIConnectionImpl.java:1325)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.management.remote.rmi.RMIConnectionImpl.getClassLoaderFor(RMIConnectionImpl.java:1322)
    at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:771)
    at sun.reflect.GeneratedMethodAccessor2.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
    at sun.rmi.transport.Transport$1.run(Transport.java:159)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:155)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
    at java.lang.Thread.run(Thread.java:619)
    at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:255)
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:233)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:142)
    at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnectionImpl_Stub.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.invoke(RMIConnector.java:978)
    at dk.netarkivet.common.utils.JMXUtils.executeCommand(JMXUtils.java:243)
    ... 6 more
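
The InstanceNotFoundException above says that the JMX connection worked but no MBean with that exact name was registered. A minimal sketch of how one could check this outside NetarchiveSuite, using only the standard javax.management API, might look like the following. The class and method names (HeritrixMBeanCheck, heritrixName, isRegistered, crawlerBeans) are hypothetical and not part of NetarchiveSuite or Heritrix. As written it queries the local platform MBean server only, so that it runs anywhere. Pointing it at the Heritrix JVM would need an MBeanServerConnection from JMXConnectorFactory, with a service URL and credentials that are not shown in the log.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

// Hypothetical helper, not NetarchiveSuite code.
class HeritrixMBeanCheck {

    /** Builds an MBean name in the same shape as the one shown in the log. */
    static ObjectName heritrixName(String host, int jmxPort, int guiPort) {
        String name = "org.archive.crawler:name=Heritrix,type=CrawlService"
                + ",jmxport=" + jmxPort
                + ",guiport=" + guiPort
                + ",host=" + host;
        try {
            return new ObjectName(name);
        } catch (MalformedObjectNameException e) {
            throw new IllegalArgumentException("Bad MBean name: " + name, e);
        }
    }

    /** Returns true if an MBean with exactly this name is registered. */
    static boolean isRegistered(MBeanServerConnection conn, ObjectName name) {
        try {
            return conn.isRegistered(name);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists every MBean in the org.archive.crawler domain, whatever its keys. */
    static Set<ObjectName> crawlerBeans(MBeanServerConnection conn) {
        try {
            return conn.queryNames(new ObjectName("org.archive.crawler:*"), null);
        } catch (MalformedObjectNameException e) {
            throw new IllegalStateException(e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the local platform server stands in for the remote
        // Heritrix JVM here, so the expected bean will not be found.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        ObjectName expected = heritrixName("webcrawler06.onb.ac.at", 7003, 8803);
        System.out.println("Expected name: " + expected);
        System.out.println("Registered:    " + isRegistered(conn, expected));
        System.out.println("Crawler beans: " + crawlerBeans(conn));
    }
}
```

If crawlerBeans returned a bean whose host, jmxport or guiport differs from the expected name, that would point to a mismatch between the settings and what Heritrix registered. An empty result would suggest that Heritrix did not start at all.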
Could you please have another look at my settings? Maybe you can find
another mistake!
Thx
a.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: deploy_config.xml
Type: text/xml
Size: 14913 bytes
Desc: not available
URL: <http://ml.sbforge.org/pipermail/netarchivesuite-users/attachments/20090526/7b8fbb4a/attachment-0002.xml>