[Netarchivesuite-users] Error Uploading metadata-File

aponb at gmx.at aponb at gmx.at
Thu Jan 15 12:22:42 CET 2009


>
> Hi A.
> The most likely reason is that you already have a file named "1-metadata-1.arc" in your archive.
> If you want to start from scratch with an empty database, you also need to reset your archive.
> Two things are needed here:
> 1) Move or remove all files from your archive
> 2) Reset the database of stored files in your archive. This is done by deleting the admin.data file maintained by the ArcRepositoryApplication
>
>
> I hope that this helps.
>
> /Søren
>   
Hi Søren!

Thanks for your quick reply. This test was done with a clean
database, so there are no files in my archive.
I ran a selective crawl on that particular seed (http://www.bg-bab.ac.at)
and the metadata file for that crawl was uploaded successfully, even
though the same error occurred. I had expected the upload to fail as
well. That is really strange.

Jan 15, 2009 11:31:47 AM 
dk.netarkivet.harvester.harvesting.HarvestController uploadFiles
INFO: Uploading file '3-metadata-1.arc' to arcrepository.
Jan 15, 2009 11:31:47 AM 
dk.netarkivet.archive.arcrepository.distribute.JMSArcRepositoryClient store
FINE: Sending a StoreMessage with file 
'/home/netarchive/apps/netarchivesuite/ONB/harvester_7055/3_1232015155105/metadata/3-metadata-1.arc'
Jan 15, 2009 11:31:47 AM 
dk.netarkivet.common.distribute.HTTPRemoteFileRegistry registerFile
FINE: Registered file 
'/home/netarchive/apps/netarchivesuite/ONB/harvester_7055/3_1232015155105/metadata/3-metadata-1.arc' 
with URL 'http://webcrawler06.onb.ac.at:8304/d7c0bb67'
Jan 15, 2009 11:31:47 AM 
dk.netarkivet.common.distribute.HTTPRemoteFileRegistry$HTTPRemoteFileRegistryHandler 
handle
FINE: Served file 
'/home/netarchive/apps/netarchivesuite/ONB/harvester_7055/3_1232015155105/metadata/3-metadata-1.arc' 
with URL 'http://webcrawler06.onb.ac.at:8304/d7c0bb67'
Jan 15, 2009 11:31:47 AM dk.netarkivet.common.distribute.Synchronizer 
sendAndWaitForOneReply
FINE: Received reply for message: 
ID:530-127.0.0.1(eb:15:48:29:10:14)-43075-1232015507284: To 
ONB_COMMON_THE_ARCREPOS ReplyTo ONB_COMMON_THIS_HACO_127_0_0_1_7055 OK 
Arcfile: 3-metadata-1.arc
Jan 15, 2009 11:31:47 AM 
dk.netarkivet.harvester.harvesting.HeritrixDomainHarvestReport parseCrawlLog
FINE: Invalid line in 
'/home/netarchive/apps/netarchivesuite/ONB/harvester_7055/3_1232015155105/logs/crawl.log' 
line 13: '2009-01-15T10:26:01.872Z    -7          - invalid:https:/ EX 
http://www.bg-bab.ac.at/menu.files/dmenu.js no-type #042 - - - -'. Ignoring.
dk.netarkivet.common.exceptions.IOFailure: Unparsable URI in field 4 of 
crawl.log: 'invalid:https:/'.
        at 
dk.netarkivet.harvester.harvesting.HeritrixDomainHarvestReport.processHarvestLine(HeritrixDomainHarvestReport.java:158)
        at 
dk.netarkivet.harvester.harvesting.HeritrixDomainHarvestReport.parseCrawlLog(HeritrixDomainHarvestReport.java:107)
        at 
dk.netarkivet.harvester.harvesting.HeritrixDomainHarvestReport.<init>(HeritrixDomainHarvestReport.java:86)
        at 
dk.netarkivet.harvester.harvesting.HarvestController.generateHeritrixDomainHarvestReport(HarvestController.java:294)
        at 
dk.netarkivet.harvester.harvesting.HarvestController.storeFiles(HarvestController.java:268)
        at 
dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer.processHarvestInfoFile(HarvestControllerServer.java:550)
        at 
dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer.access$300(HarvestControllerServer.java:83)
        at 
dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer$HarvesterThread.run(HarvestControllerServer.java:647)
Jan 15, 2009 11:31:47 AM 
dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer 
processHarvestInfoFile
INFO: Done post-processing files for job 3 in dir: 
'/home/netarchive/apps/netarchivesuite/ONB/harvester_7055/3_1232015155105'
Jan 15, 2009 11:31:47 AM 
dk.netarkivet.harvester.harvesting.distribute.HarvestControllerServer$HarvesterThread 
run
INFO: Ending crawl of job : 3
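
The log above complains about field 4 of one crawl.log line. As a rough way to find such lines before post-processing, here is a minimal Python sketch. It is not NetarchiveSuite code and does not reproduce its actual URI check: the "no host part" rule and the helper names crawl_log_uri, has_host and suspicious_lines are assumptions made for illustration. Entries that legitimately have no host part (for example dns: lines) would also be reported by this heuristic.

```python
from urllib.parse import urlsplit

# 1-based position of the URI in a crawl.log line, as named in the error message.
URI_FIELD = 4


def crawl_log_uri(line):
    """Return the URI field of a whitespace-separated crawl.log line, or None."""
    fields = line.split()
    if len(fields) < URI_FIELD:
        return None
    return fields[URI_FIELD - 1]


def has_host(uri):
    """Heuristic only (an assumption): True if the URI parses with a host part."""
    try:
        return bool(urlsplit(uri).hostname)
    except ValueError:
        return False


def suspicious_lines(lines):
    """Yield (line_number, uri) for lines whose URI field is missing or has no host."""
    for number, line in enumerate(lines, start=1):
        uri = crawl_log_uri(line)
        if uri is None or not has_host(uri):
            yield number, uri


# The line quoted in the log above, with the mail wrapping undone.
line = ("2009-01-15T10:26:01.872Z    -7          - invalid:https:/ EX "
        "http://www.bg-bab.ac.at/menu.files/dmenu.js no-type #042 - - - -")
print(crawl_log_uri(line), has_host(crawl_log_uri(line)))
```

To scan a whole file, the open file object can be passed straight to suspicious_lines, since it iterates over lines.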



