[Netarchivesuite-users] Your URI/sec and KB/sec figures?
Peter Svanberg
Peter.Svanberg at kb.se
Mon Jun 24 11:38:51 CEST 2019
Hello!
I discovered a Heritrix mailing list (*). Among some interesting tips on making crawls faster, I also read some speed figures that are far from anything we ever get. So I ask you: what speeds do you get?
Our latest 19 selective harvests have the following figures (taken from crawl-report.txt in each job's metadata WARC file):
URIs/sec: slowest job 0.83; fastest job 9.8; average 5.11
KB/sec: slowest job 34; fastest job 863; average 313
(I realize that, besides the NAS/Heritrix configuration, this depends heavily on hardware, memory, disk I/O, network capacity and so on, but I don't know which of those figures are most relevant to add to this comparison. Suggestions?)
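In case it helps others produce comparable numbers, here is a minimal Python sketch of how figures like the ones above could be gathered. It assumes each crawl-report.txt has already been extracted from the metadata WARC file and contains lines of the form "URIs/sec: <number>" and "KB/sec: <number>". The function names parse_rates and summarize are hypothetical and not part of NAS or Heritrix.

```python
import re
from statistics import mean

# Assumed line shape: "URIs/sec: 5.11" or "KB/sec: 313" (key, colon, number).
# A decimal comma is accepted as well, in case the numbers were locale-formatted.
_RATE_LINE = re.compile(
    r"^(URIs/sec|KB/sec):[ \t]*([0-9]+(?:[.,][0-9]+)?)[ \t]*$",
    re.MULTILINE,
)


def parse_rates(report_text):
    """Return the URIs/sec and KB/sec values found in one report, as floats."""
    rates = {}
    for key, value in _RATE_LINE.findall(report_text):
        rates[key] = float(value.replace(",", "."))
    return rates


def summarize(values):
    """Return (slowest, fastest, average) for a non-empty list of per-job rates."""
    return min(values), max(values), mean(values)
```

Calling parse_rates on the text of each job's report and then passing the collected "URIs/sec" (or "KB/sec") values to summarize gives the slowest, fastest and average figures in the same shape as listed above.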
* https://groups.yahoo.com/neo/groups/archive-crawler/conversations/messages