[Netarchivesuite-users] metadatafile arc files

Søren Vejrup Carlsen svc at kb.dk
Fri Jun 20 12:34:36 CEST 2008


Hi there.
It is possible to upload arc files to the NetarchiveSuite archive.
This is done using the tool:  dk.netarkivet.archive.tools.Upload

It is correct that the metadata.arc files are made by NetarchiveSuite, not by Heritrix. They contain
metadata about the job, the Heritrix logs, and a CDX index of the contents of the arc files that Heritrix generated.
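
To give a rough idea of what such an index of arc file contents holds, here is a minimal Python sketch. It is not the code NetarchiveSuite uses and it does not produce the exact CDX layout stored in metadata.arc. It only assumes the plain ARC v1 record header (URL, IP address, archive date, content type, length) and an uncompressed stream, whereas real arc files are usually gzip-compressed. The names index_arc and make_record, and the sample records, are hypothetical and exist only for this illustration.

```python
import io


def index_arc(stream):
    """Return one index entry per record in an uncompressed ARC v1 stream."""
    entries = []
    while True:
        offset = stream.tell()
        header = stream.readline()
        if not header:
            break  # end of stream
        if not header.strip():
            continue  # blank line separating two records
        # ARC v1 header: URL IP-address Archive-date Content-type Archive-length
        url, ip, date, mime, length = header.decode("latin-1").strip().rsplit(" ", 4)
        length = int(length)
        stream.seek(length, io.SEEK_CUR)  # skip the record body
        entries.append({"url": url, "date": date, "mime": mime,
                        "offset": offset, "length": length})
    return entries


def make_record(url, body):
    """Build a hypothetical ARC v1 record (made-up IP and date) for the demo."""
    header = "%s 192.0.2.1 20080620103436 text/plain %d\n" % (url, len(body))
    return header.encode("latin-1") + body + b"\n"


if __name__ == "__main__":
    sample = (make_record("http://example.org/a", b"first")
              + make_record("http://example.org/b", b"second body"))
    for e in index_arc(io.BytesIO(sample)):
        print(e["url"], e["date"], e["mime"], e["offset"], e["length"])
```

Each entry records where a harvested URL sits inside the arc file, which is the kind of lookup information an access tool needs to find a record without scanning the whole file.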

However, we don't recommend adding bogus job-information to the database so that you can
browse the arcfiles using the viewerproxy.

The viewerproxy is only intended for local quality assurance testing of the harvested material.

We at netarkivet.dk plan to use the Internet Archive Wayback Machine for access to our harvested material.
See http://crawler.archive.org

/Søren

> -----Original Message-----
> From: netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk
> [mailto:netarchivesuite-users-bounces at lists.gforge.statsbiblioteket.dk]
> On Behalf Of aponb at gmx.at
> Sent: Friday, June 20, 2008 11:48 AM
> To: netarchivesuite-users at lists.gforge.statsbiblioteket.dk
> Subject: [Netarchivesuite-users] metadatafile arc files
> 
> 
> Is it possible to add arc files to the suite which were produced by a
> Heritrix instance without using the NetarchiveSuite?
> Of course I would need to insert the correct data into the database, but
> what about the metadata.arc file? Can I generate this file based on my
> Heritrix arc files? Am I right when I say that this file is a special file
> made by the NetarchiveSuite and is not generated by the Heritrix
> crawler? All statistics and log messages are written in that file. I
> also assume accessing arc files which have deduplicated data wouldn't
> work without these metadata files. Is this correct?
> Thanks for reading!
> _______________________________________________
> NetarchiveSuite-users mailing list
> NetarchiveSuite-users at lists.gforge.statsbiblioteket.dk
> https://lists.gforge.statsbiblioteket.dk/mailman/listinfo/netarchivesuite-users
> 



