[Netarchivesuite-users] DNS preparation, domain state active?

Peter Svanberg Peter.Svanberg at kb.se
Tue Nov 2 17:11:29 CET 2021


Someone (Tue?) mentioned that a domain can have a status, such as “active”, among others. I can’t find that anywhere; could you please explain?

/Peter



From: Peter Svanberg
Sent: 2 November 2021 09:26
To: 'netarchivesuite-users at ml.sbforge.org' <netarchivesuite-users at ml.sbforge.org>
Subject: DNS in NAS/Heritrix

Hello!

In preparing for a new broad crawl, we are gathering information on new domain names. A few questions have come up around that.

1) I recall reading that some of you had written scripts concerning domains and seeds. What was that about? What in this area does NAS/Heritrix not handle on its own?

2) One thing we’ve found is that NAS assumes that a domain X answers on the URL http://www.X, but that is not always true. We found hundreds of domains that have no www.X host but do answer on http://X. Maybe this should be changed in some way in NAS?
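
To make point 2 concrete, here is a minimal Python sketch of the kind of pre-check one could run outside NAS before importing seeds. It is only an illustration, not something NAS or Heritrix provides: the function names `resolves` and `pick_seed` are hypothetical, and the check is a plain DNS lookup, so it shows only that a host name exists, not that a web server answers there.

```python
import socket
from typing import Callable, Optional


def resolves(host: str) -> bool:
    """Return True if the host name resolves via the system resolver."""
    try:
        socket.getaddrinfo(host, 80)
        return True
    except (socket.gaierror, UnicodeError):
        # gaierror: the name does not resolve; UnicodeError: invalid label.
        return False


def pick_seed(domain: str,
              lookup: Callable[[str], bool] = resolves) -> Optional[str]:
    """Prefer http://www.X, fall back to http://X, else return None.

    `lookup` is injectable (an assumption of this sketch) so the logic
    can be exercised without network access.
    """
    for host in (f"www.{domain}", domain):
        if lookup(host):
            return f"http://{host}"
    return None
```

Called as `pick_seed("example.se")`, it would return a seed URL for whichever host name resolves first, or None if neither does. A real check would probably also want an HTTP request, since a name can resolve without serving any web content.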

Kind regards



Peter Svanberg
Technical Officer
Collection and Metadata
Collection 1

Kungliga biblioteket
Box 5039, 102 41 Stockholm
Visiting address: Karlavägen 96, Stockholm
+46 10 709 32 78
Peter.Svanberg at kb.se
www.kb.se

