[Netarchivesuite-users] Uploading files

Bjarne Andersen bja at statsbiblioteket.dk
Mon Oct 12 13:32:15 CEST 2009


It looks to me as if the file was previously uploaded (perhaps only partially), so admin.data already knows about the file - but the file is still on the harvester and has since changed (e.g. a bit has flipped).

You should
1. Check your archive (do you have multiple locations?) - is the file there?
2. If "yes" to (1) - what is the checksum of the file in the archive?
3. Check admin.data - what does it know about the file (grep [filename] admin.data)?
4. If the archive already has the file and the archive-file differs from the file on the harvester (which seems to be the case), you need to decide:
  - A) Is the archive-file or the temp-file on the harvester the right one?
  - A1) If you decide that the archive-file is the right one, you can delete the harvester-file. You may want to copy the archive-version back to the harvester and do another upload with the command-line tool, to ensure that the correct file is at all locations (if you use multiple locations).
  - A2) If you decide that the harvester-version is the correct one (e.g. if the upload for some reason failed during file-transfer), you need to do some cleaning:
  - A2-1) Delete the file from the archive.
  - A2-2) I think this should work: run bit-preservation filestatus (do you use the archive-module of NetarchiveSuite?). The file should be reported as missing; you can then mark it as FAILED in the interface, and your upload from the command line should work.
  - A2-3) You could also (not recommended as a first attempt) stop the ArcRepository application, edit admin.data and delete all lines containing the filename. Be careful with this one, e.g. make a backup copy of the file before editing. This is the only remaining method if A2-2) does not work.
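
As a rough illustration of steps 2, 3 and A2-3, the checks could be scripted along the following lines. This is only a sketch under assumptions: the 32-character checksum in the log looks like an MD5 digest, so MD5 is assumed here, and admin.data is treated as a plain line-based text file (which is what the grep suggestion implies). The function names are made up for this example and are not part of NetarchiveSuite.

```python
import hashlib
import shutil
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks.

    Assumption: the archive reports MD5 checksums (32 hex characters).
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def admin_data_lines(admin_data_path, filename):
    """Return the lines of admin.data that mention filename (like grep)."""
    with open(admin_data_path, "r", encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if filename in line]


def remove_admin_data_entries(admin_data_path, filename):
    """Option A2-3: back up admin.data, then drop all lines mentioning filename.

    Only meant to be run while the ArcRepository application is stopped.
    The backup is written next to the original as admin.data.bak
    (a hypothetical naming choice). Returns the number of removed lines.
    """
    admin_data_path = Path(admin_data_path)
    backup_path = admin_data_path.with_name(admin_data_path.name + ".bak")
    shutil.copy2(admin_data_path, backup_path)

    # Work on raw bytes so that untouched lines are written back unchanged.
    needle = filename.encode("utf-8")
    with open(backup_path, "rb") as handle:
        lines = handle.readlines()
    kept = [line for line in lines if needle not in line]
    with open(admin_data_path, "wb") as handle:
        handle.writelines(kept)
    return len(lines) - len(kept)
```

Comparing md5_of_file() for the harvester copy and the archive copy against the checksum reported in the ArcRepository log shows which of the copies (if any) still matches; admin_data_lines() shows what admin.data has recorded for the file before anything is changed.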

I hope this will help you. You have hit a very rare situation where an ARC-file on disk has changed - either in your archive or on your harvester machine. It is a good example of why you need multiple copies of files archived as soon as possible after the harvest, so that exactly this does not happen.
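
To illustrate why more than two copies help: with only two copies that disagree you have to decide by hand which one is right (as in step 4 above), whereas with three or more a simple majority can point at the odd one out. The sketch below is only an illustration of that idea, not how NetarchiveSuite's bit preservation is implemented; the function name and the location names are hypothetical.

```python
from collections import Counter


def find_divergent_copies(checksums_by_location):
    """Return (majority_checksum, locations_that_differ).

    checksums_by_location maps a location name to the checksum of the
    copy stored there. If no checksum is held by a strict majority of
    the locations, (None, all locations) is returned, meaning a human
    has to decide which copy is correct.
    """
    if not checksums_by_location:
        return None, []
    counts = Counter(checksums_by_location.values())
    top_checksum, top_count = counts.most_common(1)[0]
    if top_count * 2 <= len(checksums_by_location):
        # No strict majority, e.g. two copies that disagree.
        return None, sorted(checksums_by_location)
    differing = sorted(
        location
        for location, checksum in checksums_by_location.items()
        if checksum != top_checksum
    )
    return top_checksum, differing
```

For example, find_divergent_copies({"location_a": "x", "location_b": "x", "harvester": "y"}) singles out "harvester", while two disagreeing copies give no majority at all.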

best
Bjarne Andersen

> -----Original Message-----
> From: netarchivesuite-users-
> bounces at lists.gforge.statsbiblioteket.dk [mailto:netarchivesuite-
> users-bounces at lists.gforge.statsbiblioteket.dk] On Behalf Of
> aponb at gmx.at
> Sent: Monday, October 12, 2009 9:19 AM
> To: netarchivesuite-users at lists.gforge.statsbiblioteket.dk
> Subject: [Netarchivesuite-users] Uploading files
>
> >
> > Hi
> >
> > There is absolutely no problem in using the settings file for a
> harvester.
> >
> > I cannot see from these error messages, why the upload has
> failed.
> > It might say more in the log files for the ArcRepository.
> > Could you send the one which contains the entries for the time of
> your upload attempt?
> >
> > The problem could also be in the configuration file, so could you
> also send it?
> >
> >
> > Best Regards
> > Jonas
> >
> >
> Hi Jonas!
>
> Thanks for your reply.
>
> Enclosed the log file entry for the ArcRepository. There is warning
> about a wrong checksum.
> What can I do now?
>
> Regards
> a.
>
> Oct 12, 2009 8:43:47 AM
> dk.netarkivet.common.distribute.JMSConnection reply
> INFO: Sending message to destination
> 'ONB_COMMON_THIS_REPOS_CLIENT_172_16_14_154_HCS_8800HARVESTER', ID
> =
> ID:151111-172.16.14.154(81:ac:ef:7f:3b:65)-38722-1255330123199
> Oct 12, 2009 9:08:52 AM
> dk.netarkivet.archive.arcrepository.ArcRepository store
> INFO: Store started:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:08:52 AM
> dk.netarkivet.archive.arcrepository.ArcRepository store
> FINE: Retrying store of already known file
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz',
> Already
> completed: false
> Oct 12, 2009 9:08:52 AM
> dk.netarkivet.archive.arcrepository.ArcRepository startUpload
> FINE: Upload started of file
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' at
> 'ONB_ONB_THE_BAMON'
> Oct 12, 2009 9:08:52 AM
> dk.netarkivet.archive.arcrepositoryadmin.UpdateableAdminData write
> FINE: appending entry for filename
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> admin.data
> Oct 12, 2009 9:08:52 AM
> dk.netarkivet.archive.arcrepository.ArcRepository sendChecksumJob
> FINE: Checksum job submitted for:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:09 AM
> dk.netarkivet.archive.arcrepository.ArcRepository onBatchReply
> FINE: BatchReplyMessage received: '
> BatchReplyMessage for batch job
> ID:197812-127.0.1.1(ce:4c:8a:88:6d:e7)-59124-1255331332265
> FilesProcessed = 1
> FilesFailed = 0
> ID:4603179-127.0.1.1(87:17:c2:cd:53:15)-59114-1255331349965: To
> ONB_COMMON_THE_REPOS ReplyTo ONB_ONB_THE_BAMON OK'
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.common.distribute.FTPRemoteFile
> cleanup
> FINE: Deleting file
> 'ID:4603138-127.0.1.1(dc:3d:29:9c:f1:72)-59123-
> 1255331332267745149669755162804batch_aggregation-30858-
> 1255331349895'
> from ftp server
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.common.distribute.FTPRemoteFile
> cleanup
> FINE: Deleting file
> 'ID:4603138-127.0.1.1(dc:3d:29:9c:f1:72)-59123-
> 1255331332267745149669755162804batch_aggregation-30858-
> 1255331349895'
> from ftp server
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.archive.arcrepository.ArcRepository processCheckSum
> FINE: Checksum received ... processing
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.archive.arcrepository.ArcRepository processCheckSum
> WARNING: Cannot upload (wrong checksum)
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> 'ONB_ONB_THE_BAMON', reported
> checksum='e9c6ce2485ace6f526ac065a0c86efd0'
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.archive.arcrepositoryadmin.UpdateableAdminData write
> FINE: appending entry for filename
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> admin.data
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.archive.arcrepository.ArcRepository replyNotOK
> WARNING: Store NOT OK:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.archive.arcrepository.ArcRepository replyNotOK
> FINE: Sending store NOT OK reply to message
> 'ID:15-172.16.14.149(89:7:bd:29:3a:2d)-43376-1255331124505: To
> ONB_COMMON_THE_REPOS ReplyTo
> ONB_COMMON_THIS_REPOS_CLIENT_172_16_14_149_HCA_8801HARVESTER Error:
> Failure while trying to store ARC file:
> 2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz Arcfile:
> 2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:10 AM
> dk.netarkivet.common.distribute.JMSConnection reply
> INFO: Sending message to destination
> 'ONB_COMMON_THIS_REPOS_CLIENT_172_16_14_149_HCA_8801HARVESTER', ID
> =
> ID:15-172.16.14.149(89:7:bd:29:3a:2d)-43376-1255331124505
> Oct 12, 2009 9:09:18 AM
> dk.netarkivet.archive.arcrepository.ArcRepository store
> INFO: Store started:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:18 AM
> dk.netarkivet.archive.arcrepository.ArcRepository store
> FINE: Retrying store of already known file
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz',
> Already
> completed: false
> Oct 12, 2009 9:09:18 AM
> dk.netarkivet.archive.arcrepository.ArcRepository startUpload
> FINE: Upload started of file
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' at
> 'ONB_ONB_THE_BAMON'
> Oct 12, 2009 9:09:18 AM
> dk.netarkivet.archive.arcrepositoryadmin.UpdateableAdminData write
> FINE: appending entry for filename
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> admin.data
> Oct 12, 2009 9:09:18 AM
> dk.netarkivet.archive.arcrepository.ArcRepository sendChecksumJob
> FINE: Checksum job submitted for:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:19 AM
> dk.netarkivet.archive.arcrepository.ArcRepository onBatchReply
> FINE: BatchReplyMessage received: '
> BatchReplyMessage for batch job
> ID:197819-127.0.1.1(ce:4c:8a:88:6d:e7)-59124-1255331358797
> FilesProcessed = 1
> FilesFailed = 0
> ID:4603205-127.0.1.1(87:17:c2:cd:53:15)-59114-1255331359905: To
> ONB_COMMON_THE_REPOS ReplyTo ONB_ONB_THE_BAMON OK'
> Oct 12, 2009 9:09:19 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:19 AM
> dk.netarkivet.common.distribute.FTPRemoteFile
> cleanup
> FINE: Deleting file
> 'ID:4603198-127.0.1.1(dc:3d:29:9c:f1:72)-59123-
> 12553313587992629709852397020183batch_aggregation-15063-
> 1255331359847'
> from ftp server
> Oct 12, 2009 9:09:19 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:19 AM
> dk.netarkivet.common.distribute.FTPRemoteFile
> cleanup
> FINE: Deleting file
> 'ID:4603198-127.0.1.1(dc:3d:29:9c:f1:72)-59123-
> 12553313587992629709852397020183batch_aggregation-15063-
> 1255331359847'
> from ftp server
> Oct 12, 2009 9:09:24 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:24 AM
> dk.netarkivet.archive.arcrepository.ArcRepository processCheckSum
> FINE: Checksum received ... processing
> Oct 12, 2009 9:09:24 AM
> dk.netarkivet.archive.arcrepository.ArcRepository processCheckSum
> WARNING: Cannot upload (wrong checksum)
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> 'ONB_ONB_THE_BAMON', reported
> checksum='e9c6ce2485ace6f526ac065a0c86efd0'
> Oct 12, 2009 9:09:24 AM
> dk.netarkivet.archive.arcrepositoryadmin.UpdateableAdminData write
> FINE: appending entry for filename
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> admin.data
> Oct 12, 2009 9:09:24 AM
> dk.netarkivet.archive.arcrepository.ArcRepository replyNotOK
> WARNING: Store NOT OK:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:24 AM
> dk.netarkivet.archive.arcrepository.ArcRepository replyNotOK
> FINE: Sending store NOT OK reply to message
> 'ID:18-172.16.14.149(89:7:bd:29:3a:2d)-43376-1255331151037: To
> ONB_COMMON_THE_REPOS ReplyTo
> ONB_COMMON_THIS_REPOS_CLIENT_172_16_14_149_HCA_8801HARVESTER Error:
> Failure while trying to store ARC file:
> 2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz Arcfile:
> 2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:24 AM
> dk.netarkivet.common.distribute.JMSConnection reply
> INFO: Sending message to destination
> 'ONB_COMMON_THIS_REPOS_CLIENT_172_16_14_149_HCA_8801HARVESTER', ID
> =
> ID:18-172.16.14.149(89:7:bd:29:3a:2d)-43376-1255331151037
> Oct 12, 2009 9:09:33 AM
> dk.netarkivet.archive.arcrepository.ArcRepository store
> INFO: Store started:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:33 AM
> dk.netarkivet.archive.arcrepository.ArcRepository store
> FINE: Retrying store of already known file
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz',
> Already
> completed: false
> Oct 12, 2009 9:09:33 AM
> dk.netarkivet.archive.arcrepository.ArcRepository startUpload
> FINE: Upload started of file
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' at
> 'ONB_ONB_THE_BAMON'
> Oct 12, 2009 9:09:33 AM
> dk.netarkivet.archive.arcrepositoryadmin.UpdateableAdminData write
> FINE: appending entry for filename
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> admin.data
> Oct 12, 2009 9:09:33 AM
> dk.netarkivet.archive.arcrepository.ArcRepository sendChecksumJob
> FINE: Checksum job submitted for:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.archive.arcrepository.ArcRepository onBatchReply
> FINE: BatchReplyMessage received: '
> BatchReplyMessage for batch job
> ID:197826-127.0.1.1(ce:4c:8a:88:6d:e7)-59124-1255331373628
> FilesProcessed = 1
> FilesFailed = 0
> ID:4603253-127.0.1.1(87:17:c2:cd:53:15)-59114-1255331379685: To
> ONB_COMMON_THE_REPOS ReplyTo ONB_ONB_THE_BAMON OK'
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.common.distribute.FTPRemoteFile
> cleanup
> FINE: Deleting file
> 'ID:4603236-127.0.1.1(dc:3d:29:9c:f1:72)-59123-
> 1255331373630880710362039463516batch_aggregation-9132-
> 1255331379628'
> from ftp server
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.common.distribute.FTPRemoteFile
> cleanup
> FINE: Deleting file
> 'ID:4603236-127.0.1.1(dc:3d:29:9c:f1:72)-59123-
> 1255331373630880710362039463516batch_aggregation-9132-
> 1255331379628'
> from ftp server
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.common.distribute.FTPRemoteFile logOn
> FINE: Logged onto ftp://netarchive:**********@wc01:21
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.archive.arcrepository.ArcRepository processCheckSum
> FINE: Checksum received ... processing
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.archive.arcrepository.ArcRepository processCheckSum
> WARNING: Cannot upload (wrong checksum)
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> 'ONB_ONB_THE_BAMON', reported
> checksum='e9c6ce2485ace6f526ac065a0c86efd0'
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.archive.arcrepositoryadmin.UpdateableAdminData write
> FINE: appending entry for filename
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz' to
> admin.data
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.archive.arcrepository.ArcRepository replyNotOK
> WARNING: Store NOT OK:
> '2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.archive.arcrepository.ArcRepository replyNotOK
> FINE: Sending store NOT OK reply to message
> 'ID:21-172.16.14.149(89:7:bd:29:3a:2d)-43376-1255331165868: To
> ONB_COMMON_THE_REPOS ReplyTo
> ONB_COMMON_THIS_REPOS_CLIENT_172_16_14_149_HCA_8801HARVESTER Error:
> Failure while trying to store ARC file:
> 2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz Arcfile:
> 2832-10-20090929075126-00033-webcrawler06.onb.ac.at.arc.gz'
> Oct 12, 2009 9:09:39 AM
> dk.netarkivet.common.distribute.JMSConnection reply
> INFO: Sending message to destination
> 'ONB_COMMON_THIS_REPOS_CLIENT_172_16_14_149_HCA_8801HARVESTER', ID
> =
> ID:21-172.16.14.149(89:7:bd:29:3a:2d)-43376-1255331165868
>
>



