[Netarchivesuite-users] Strange slow non-existing-domain behavior

Peter Svanberg Peter.Svanberg at kb.se
Wed Mar 20 23:32:40 CET 2019


Hello again!

Spurred by your previous problem-solving answers, I continue.

Strange Heritrix behavior: it does a DNS lookup, which fails, and reports that with a -6 line. Then it pauses for 10 minutes, does a new DNS lookup, and so on.

What happens during the pause? Is it waiting 600 seconds for the DNS lookup? Or trying the request despite the failed lookup?

(Maybe one of the bugs fixed in 5.5?)

Log and template below.

Best regards,
-----

Peter Svanberg
Technical officer
Digital Collections Department, Newspapers, Radio and Television Division

National Library of Sweden
PO Box 5039
SE-104 51 Stockholm
Visits: Karlavägen 100, Stockholm
Phone: +46 10 709 32 78

E-mail: peter.svanberg at kb.se
Web: www.kb.se




crawl log:

2019-03-20T21:48:42.119Z    -6          - http://lookbackvideo7-a.akamaihd.net/ RRX https://www.facebook.com/ unknown #033 - - http://www.fbcdn.net 2t
2019-03-20T21:48:41.164Z    -1          - dns:lookbackvideo7-a.akamaihd.net RRXP http://lookbackvideo7-a.akamaihd.net/ text/dns #047 20190320214841119+45 - http://www.fbcdn.net 3t
2019-03-20T21:38:41.006Z    -6          - http://lookbackvideo6-a.akamaihd.net/ RRX https://www.facebook.com/ unknown #024 - - http://www.fbcdn.net 2t
2019-03-20T21:38:40.063Z    -1          - dns:lookbackvideo6-a.akamaihd.net RRXP http://lookbackvideo6-a.akamaihd.net/ text/dns #026 20190320213840006+56 - http://www.fbcdn.net 3t
2019-03-20T21:28:39.896Z    -6          - http://lookbackvideo5-a.akamaihd.net/ RRX https://www.facebook.com/ unknown #045 - - http://www.fbcdn.net 2t
2019-03-20T21:28:38.942Z    -1          - dns:lookbackvideo5-a.akamaihd.net RRXP http://lookbackvideo5-a.a
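
To put a number on the pause, here is a minimal Python sketch that computes the gap between the three -6 lines in the excerpt above. The timestamps are copied from the log; the names FAILED_AT and gaps_in_seconds are only illustrative and not part of Heritrix or NetarchiveSuite. It only measures the interval and says nothing about what Heritrix does during it.

```python
from datetime import datetime

# Timestamps of the three -6 lines in the crawl log excerpt, oldest first.
FAILED_AT = [
    "2019-03-20T21:28:39.896Z",
    "2019-03-20T21:38:41.006Z",
    "2019-03-20T21:48:42.119Z",
]


def gaps_in_seconds(timestamps):
    """Return the seconds elapsed between consecutive timestamps."""
    parsed = [datetime.strptime(t, "%Y-%m-%dT%H:%M:%S.%fZ") for t in timestamps]
    return [(later - earlier).total_seconds()
            for earlier, later in zip(parsed, parsed[1:])]


GAPS = gaps_in_seconds(FAILED_AT)
print([round(gap, 3) for gap in GAPS])
```

Both gaps come out at roughly 601 seconds, so each failure is followed by about 600 seconds of silence plus about one second for the next lookup.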

template:

fetchDns.enabled=true
fetchDns.acceptNonDnsResolves=false
fetchDns.digestContent=true
fetchDns.digestAlgorithm=sha1

fetchHttp.enabled=true
fetchHttp.timeoutSeconds=1200
fetchHttp.soTimeoutMs=20000
fetchHttp.maxFetchKBSec=0
fetchHttp.maxLengthBytes=0
fetchHttp.ignoreCookies=false
fetchHttp.sslTrustLevel=OPEN
fetchHttp.defaultEncoding=UTF-8
fetchHttp.digestContent=true
fetchHttp.digestAlgorithm=sha1
fetchHttp.sendIfModifiedSince=true
fetchHttp.sendIfNoneMatch=true
fetchHttp.sendConnectionClose=true
fetchHttp.sendReferer=true
fetchHttp.sendRange=false

