[Netarchivesuite-users] Corrupted ARC files

Nicchiarelli Eleonora eleonora.nicchiarelli at onb.ac.at
Tue Mar 2 14:27:43 CET 2010


Hi Søren,

Unless I am grossly mistaken, I normally have no problem uncompressing arc.gz files. Moreover, the file is not the right size for an arc.gz, so I think it is safe to assume that some 80 MB of it have been lost. I would like to know about tools that can reconstruct the file as a complete and valid arc.gz. We would develop such a tool ourselves if none existed, but I suppose someone else has already done this.
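
To make concrete what such a tool might do, here is a minimal sketch in Python. It assumes, as described in the quoted reply below, that each ARC record is compressed as its own gzip member, so that a truncated arc.gz is a run of complete members followed by one broken one. The names salvage_members and chunk_size are hypothetical and not part of NetarchiveSuite or Heritrix. The sketch cannot bring back the lost data; it only finds the byte offset up to which the file is still intact, so that the file could be cut there and remain a valid, shorter arc.gz.

```python
import gzip
import zlib


def salvage_members(data, chunk_size=65536):
    """Decode complete gzip members from the start of `data`.

    Returns (records, offset): the decompressed payload of every intact
    member, and the offset of the first byte that is not part of one.
    """
    records = []
    offset = 0
    while offset < len(data):
        # wbits = MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        decomp = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)
        parts = []
        pos = offset
        try:
            # Feed the member in chunks until zlib reports its end.
            while not decomp.eof and pos < len(data):
                chunk = data[pos:pos + chunk_size]
                parts.append(decomp.decompress(chunk))
                pos += len(chunk)
        except zlib.error:
            break  # corrupt bytes: stop at the last good member
        if not decomp.eof:
            break  # input ended inside a member: it is truncated
        # Bytes fed after the end of the member belong to the next one.
        offset = pos - len(decomp.unused_data)
        records.append(b"".join(parts))
    return records, offset


# Small demonstration on synthetic data (assumption: not a real ARC file).
members = [gzip.compress(b"record %d\n" % i) for i in range(3)]
damaged = b"".join(members)[:-5]  # simulate a file cut off near its end
intact, cut_at = salvage_members(damaged)
print(len(intact), "intact records; valid prefix ends at byte", cut_at)
```

Writing the first cut_at bytes of the damaged file to a new file would then give a file that ends on a record boundary. Whether wayback accepts such a shortened file is something we would still have to test.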

Eleonora

Eleonora Nicchiarelli Bettelli
Digital Preservation
Austrian National Library
Josefsplatz 1, 1015 Wien

Tel:  +43 1 53 410 686
Fax: +43 1 53 410 610
Web: http://www.onb.ac.at/
Mail: eleonora.nicchiarelli at onb.ac.at


> -----Original Message-----
> From: svc at kb.dk [mailto:netarchivesuite-users-
> bounces at lists.gforge.statsbiblioteket.dk] On behalf of Søren Vejrup
> Carlsen
> Sent: Tuesday, 2 March 2010 13:16
> To: netarchivesuite-users at lists.gforge.statsbiblioteket.dk
> Subject: Re: [Netarchivesuite-users] Corrupted ARC files
> 
> Hi Eleonora.
> I believe you can't use gunzip to uncompress the arc.gz file, as the
> arc.gz generated by the ARCWriter is compressed on the ARCRecord level,
> and not the file-level as expected by gunzip.
> 
> Regards
> Søren
> 
> -----Original Message-----
> From: netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk
> [mailto:netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk] On
> behalf of Nicchiarelli Eleonora
> Sent: 2 March 2010 12:56
> To: netarchivesuite-users at lists.gforge.statsbiblioteket.dk
> Subject: Re: [Netarchivesuite-users] Corrupted ARC files
> 
> Hi Søren,
> 
> the symptoms of corruption that we have had are the following:
> 
> In the System State the BitArchiveMonitor has reported that for a file
> there was a failed upload.
> After inspecting the status of the file both on the crawler machine and on
> the repository to which we were uploading, we have found that already on
> the crawler machine the arc.gz file size was ~19 MB, and it was not
> possible to unzip it conventionally.
> Therefore we have concluded that the file had been corrupted right after
> being generated (possibly a case of bit rot on the crawler machine).
> 
> Now we would like to "extend" the file to a valid arc.gz file so that it
> is possible to view its contents through wayback all the same (which is
> not possible at the moment). Is there a standard way to do this?
> 
> Thanks in advance,
> 
> Eleonora
> 
> 
> Eleonora Nicchiarelli Bettelli
> Digital Preservation
> Austrian National Library
> Josefsplatz 1, 1015 Wien
> 
> Tel:  +43 1 53 410 686
> Fax: +43 1 53 410 610
> Web: http://www.onb.ac.at/
> Mail: eleonora.nicchiarelli at onb.ac.at
> 
> 
> > -----Original Message-----
> > From: svc at kb.dk [mailto:netarchivesuite-users-
> > bounces at lists.gforge.statsbiblioteket.dk] On behalf of Søren Vejrup
> > Carlsen
> > Sent: Tuesday, 2 March 2010 12:21
> > To: netarchivesuite-users at lists.gforge.statsbiblioteket.dk
> > Subject: Re: [Netarchivesuite-users] Corrupted ARC files
> >
> > Hi Eleonora.
> > What kind of corruption are we talking about?
> > /Søren
> > -----Original Message-----
> > From: netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk
> > [mailto:netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk]
> > On behalf of Nicchiarelli Eleonora
> > Sent: 2 March 2010 11:46
> > To: netarchivesuite-users at lists.gforge.statsbiblioteket.dk
> > Subject: [Netarchivesuite-users] Corrupted ARC files
> >
> > Dear all,
> >
> > do you currently use utilities to repair corrupted ARC files, and if so,
> > which?
> >
> > Thanks in advance,
> >
> > Eleonora
> >
> > Eleonora Nicchiarelli Bettelli
> > Digital Preservation
> > Austrian National Library
> > Josefsplatz 1, 1015 Wien
> >
> > Tel:  +43 1 53 410 686
> > Fax: +43 1 53 410 610
> > Web: http://www.onb.ac.at/
> > Mail: eleonora.nicchiarelli at onb.ac.at
> >
> >
> >
> >
> >
> > _______________________________________________
> > NetarchiveSuite-users mailing list
> > NetarchiveSuite-users at lists.gforge.statsbiblioteket.dk
> > https://lists.gforge.statsbiblioteket.dk/mailman/listinfo/netarchivesuite-users
> >
> 
> 
> 