[Netarchivesuite-users] Change all redirecting http to https in data base?

Peter Svanberg Peter.Svanberg at kb.se
Wed Mar 24 13:16:12 CET 2021


Hello!

I’ve seen some problems connected to http and https URLs for the same domain, and I have a faint memory that some of you wrote about going through all domains and correcting the http/https part of the seeds in NAS. Is that true? What did you do? And do you have tools to share?

One question is the criterion for deciding whether the seed should be changed:

1) An http request (with HTTP/1.1) to it gives 301 (Moved Permanently) or 308 (Permanent Redirect) and a Location header

2) As (1), but also for a 302 answer (“Found” in HTTP/1.1, “Moved Temporarily” in HTTP/1.0)?
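
To make the two candidate criteria concrete, here is a minimal Python sketch. The function name should_change_seed and the include_temporary flag are made up for illustration; this is not the program mentioned later in this mail, and it only looks at the status code and the Location header of the first response.

```python
# Status codes from criterion (1) and the extra one from criterion (2).
PERMANENT_REDIRECTS = {301, 308}
TEMPORARY_REDIRECTS = {302}


def should_change_seed(status, location, include_temporary=False):
    """Return True if a response suggests the seed URL should be updated.

    status            -- HTTP status code of the response to the seed request
    location          -- value of the Location header, or None if absent
    include_temporary -- False gives criterion (1), True gives criterion (2)
    """
    if not location:
        # Without a Location header there is nothing to change the seed to.
        return False
    accepted = set(PERMANENT_REDIRECTS)
    if include_temporary:
        accepted |= TEMPORARY_REDIRECTS
    return status in accepted
```

With include_temporary=False a 302 answer leaves the seed untouched; with include_temporary=True it is treated like a 301.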

(Somewhat later:) I hacked together a Python program to analyse DNS and redirects. For 5904 seeds constructed as http://www.<domain name>, it found:

Category             Share    Comment
noRedir              36.20%
redirToHTTPS         16.21%   Just changed to https
dns_error            12.35%
redirToOther         11.01%   Changed URL, still http
redirToNonWWW         9.86%   Just removed "www."
otherError            6.17%
forbidden             2.29%
redirToHTTPSNonWWW    2.08%   Removed "www.", changed to https
otherCloseError       1.90%
timeout               1.66%
connect               0.29%


The distribution between 301 and 302 answers was 87% to 13%.
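
For discussion, the redirect categories could be assigned roughly as in the Python sketch below. It is an illustration and not my actual program: the function name classify_redirect is made up, it only compares the seed with the redirect target (no DNS or error handling), and it assumes that every redirect not covered by the other categories counts as redirToOther.

```python
from urllib.parse import urljoin, urlsplit


def classify_redirect(seed_url, target_url):
    """Map a seed URL and its redirect target to one of the redirect categories.

    target_url is the Location header value, or None if there was no redirect.
    """
    if not target_url:
        return "noRedir"
    # A Location header may be relative, so resolve it against the seed.
    target_url = urljoin(seed_url, target_url)
    seed, target = urlsplit(seed_url), urlsplit(target_url)
    seed_host = (seed.hostname or "").lower()
    target_host = (target.hostname or "").lower()

    # Treat an empty path and "/" as the same location.
    same_path = ((seed.path or "/") == (target.path or "/")
                 and seed.query == target.query)
    to_https = seed.scheme == "http" and target.scheme == "https"
    same_scheme = seed.scheme == target.scheme
    same_host = seed_host == target_host
    dropped_www = seed_host.startswith("www.") and target_host == seed_host[4:]

    if same_path:
        if to_https and same_host:
            return "redirToHTTPS"
        if to_https and dropped_www:
            return "redirToHTTPSNonWWW"
        if same_scheme and dropped_www:
            return "redirToNonWWW"
        if same_scheme and same_host:
            return "noRedir"  # redirect to itself, nothing to change
    # Assumption: everything else (new path, new domain, ...) is "other".
    return "redirToOther"
```

For example, classify_redirect("http://www.example.org", "https://example.org/") gives redirToHTTPSNonWWW, while a target of "/start.html" gives redirToOther.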

Any views on this?
-----

Peter Svanberg

National Library of Sweden
Phone: +46 10 709 32 78

E-mail: peter.svanberg at kb.se
Web: www.kb.se


