[Netarchivesuite-users] DNS preparation, domain state active?
Peter Svanberg
Peter.Svanberg at kb.se
Tue Nov 2 17:11:29 CET 2021
Someone (Tue?) mentioned that a domain can have a status, such as “active” among others. I can’t find that anywhere; could someone please explain?
/Peter
From: Peter Svanberg
Sent: 2 November 2021 09:26
To: 'netarchivesuite-users at ml.sbforge.org' <netarchivesuite-users at ml.sbforge.org>
Subject: DNS in NAS/Heritrix
Hello!
In preparing for a new broad crawl, we are gathering information on new domain names. A few questions have come up around that.
1) I remember reading that some of you had written scripts concerning domains and seeds. What was that about? What in this area does NAS/Heritrix not handle on its own?
2) Well, one thing we’ve found is that NAS assumes that a domain X answers on the URL http://www.X, but that is not always true. We found hundreds of domains which have no www.X host but do answer on http://X. Maybe this should be changed in some way in NAS?
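To illustrate point 2, here is a minimal sketch of the kind of check that could be run outside NAS before importing seeds. It is not part of NAS or Heritrix; the function names `resolves` and `pick_seed` are hypothetical. It only tests whether a host name resolves in DNS, which shows that the host exists but not that it actually answers HTTP requests.

```python
import socket
from typing import Callable, Optional


def resolves(host: str) -> bool:
    """Return True if the host name resolves to at least one address."""
    try:
        socket.getaddrinfo(host, 80)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: no such host; UnicodeError: malformed host name
        return False


def pick_seed(domain: str,
              lookup: Callable[[str], bool] = resolves) -> Optional[str]:
    """Prefer http://www.X, fall back to http://X, else return None.

    `lookup` is injectable so the logic can be tried without network access.
    """
    for host in ("www." + domain, domain):
        if lookup(host):
            return "http://" + host
    return None
```

Calling `pick_seed("example.se")` for each new domain would give a candidate seed URL, or `None` for domains where neither host name resolves.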
Kind regards
Peter Svanberg
Technical officer
Collection and Metadata
Collection 1
Kungliga biblioteket
Box 5039, 102 41 Stockholm
Visiting address: Karlavägen 96, Stockholm
+46 10 709 32 78
Peter.Svanberg at kb.se
www.kb.se