[Netarchivesuite-users] Statistic data
aponb at gmx.at
Fri Jul 4 11:25:32 CEST 2008
>
> Good question.
>
> Try looking at the statistics on the job page instead (I assume this is only one job). You will see that the database (for good reasons)
> only has statistics for domains known to the system (and included in the job) - so objects (and bytes) harvested from other domains (e.g.
> inline material) are not counted in the database (precisely because they come from other domains possibly unknown to the system - and at least
> unknown to the job).
>
> We have talked about registering inline material on the domain it is inlined in, precisely to fix this "problem" - it could go into the same figures
> per domain or into a separate set of figures - e.g. called "inline material" - so that each domain has 2 sets of figures per job.
>
>
Yes, these data were from only one job, and if I count just the URLs that
belong to the domain of the job I get exactly the number of documents
reported by the database.
Thanks for the information
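
For anyone who wants to repeat the check, here is a minimal sketch of the counting I mean. It is not NetarchiveSuite code: the function names are made up, the URLs are assumed to be already extracted from the crawl log into a plain list, and "belongs to the domain" is assumed to mean that the host equals the domain or is a subdomain of it.

```python
from urllib.parse import urlsplit


def belongs_to(host, domain):
    """True if host is the domain itself or one of its subdomains (assumed rule)."""
    host = host.lower().rstrip(".")
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def count_by_job_domain(urls, job_domains):
    """Count URLs per job domain; URLs from any other host are tallied separately."""
    counts = {d: 0 for d in job_domains}
    other = 0
    for url in urls:
        host = urlsplit(url).hostname or ""
        for d in job_domains:
            if belongs_to(host, d):
                counts[d] += 1
                break
        else:
            # No job domain matched, e.g. inline material from another domain.
            other += 1
    return counts, other


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    urls = [
        "http://www.example.at/",
        "http://example.at/a.html",
        "http://cdn.other.com/img.png",
    ]
    counts, other = count_by_job_domain(urls, ["example.at"])
    print(counts, "other:", other)
```

The per-domain counts are what I compared with the database figures; the "other" count is the inline material from foreign domains that the database does not register.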