[Netarchivesuite-users] Statistic data

aponb at gmx.at aponb at gmx.at
Fri Jul 4 11:25:32 CEST 2008


>
> Good question.
>
> Try looking at the statistics on the job page instead (I assume this is only one job). You will see that the database (for good reasons)
> only has statistics for domains known to the system (and included in the job), so objects (and bytes) harvested from other domains (e.g.
> inline material) are not counted in the database (precisely because they come from other domains possibly unknown to the system, and at least
> unknown to the job).
>
> We have talked about registering inline material under the domain it is inlined in to fix exactly this "problem". It could go into the same figures
> per domain or into a separate set of figures, e.g. called "inline material", so that each domain has 2 sets of figures per job.
>
>   

Yes, these data were from only one job, and if I count only the URLs that
belong to the domains of the job I get exactly the number of documents
reported by the database.
Thanks for the information.
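
For anyone who wants to repeat the check, here is a minimal sketch of the counting in Python. It is not NetarchiveSuite code: the function name count_by_job_domain is hypothetical, the URLs are assumed to have already been extracted from the harvest logs, and "belongs to a domain" is simplified to "the host equals the domain or is a subdomain of it".

```python
from collections import Counter
from urllib.parse import urlsplit


def count_by_job_domain(urls, job_domains):
    """Count URLs per job domain, plus URLs from any other domain.

    Assumption: a URL belongs to a domain if its host is the domain
    itself or ends with "." + domain. This is a simplification, not
    necessarily the rule the database applies.
    """
    domains = {d.lower() for d in job_domains}
    per_domain = Counter()
    other = 0  # e.g. inline material fetched from domains outside the job
    for url in urls:
        host = (urlsplit(url).hostname or "").lower()
        match = next(
            (d for d in domains if host == d or host.endswith("." + d)),
            None,
        )
        if match is None:
            other += 1
        else:
            per_domain[match] += 1
    return per_domain, other
```

The per-domain counts are the figures to compare with the database, while the "other" count corresponds to the material from other domains that the job statistics leave out.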


