[Netarchivesuite-users] File to exclude domains?

bert.wendland at bnf.fr
Fri Oct 20 09:37:28 CEST 2023


Hello Peter,

This is what you are looking for:

                <bean id="rejectExcludedSurts"
                      class="org.archive.modules.deciderules.surt.SurtPrefixedDecideRule">
                    <property name="decision" value="REJECT" />
                    <property name="surtsSourceFile" value="exclude.txt" />
                    <property name="seedsAsSurtPrefixes" value="false" />
                    <property name="alsoCheckVia" value="false" />
                    <property name="surtsDumpFile" value="exclude.dump" />
                </bean>

Add it to your DecideRuleSequence. exclude.txt may contain SURTs, 
domains/hosts or even URLs.
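
As a rough sketch of the placement, assuming your template defines its scope as a DecideRuleSequence bean with the id "scope" (as in the default Heritrix 3 configuration; your template may differ), the rule would go near the end of the rules list. In a DecideRuleSequence the last rule that returns a decision wins, so a REJECT rule placed after the ACCEPT rules can override them. The entries shown for exclude.txt are hypothetical placeholders, not real exclusions.

```xml
<!-- Sketch only: keep the rules your own template already contains. -->
<bean id="scope" class="org.archive.modules.deciderules.DecideRuleSequence">
  <property name="rules">
    <list>
      <!-- existing rules from your template stay here, unchanged -->

      <!-- Placed after the ACCEPT rules so that its REJECT can override them -->
      <bean id="rejectExcludedSurts"
            class="org.archive.modules.deciderules.surt.SurtPrefixedDecideRule">
        <property name="decision" value="REJECT" />
        <property name="surtsSourceFile" value="exclude.txt" />
        <property name="seedsAsSurtPrefixes" value="false" />
        <property name="alsoCheckVia" value="false" />
        <property name="surtsDumpFile" value="exclude.dump" />
      </bean>

      <!-- any later rules from your template stay here, unchanged -->
    </list>
  </property>
</bean>

<!--
  Hypothetical contents of exclude.txt, one entry per line:
    http://(com,example,
    example.org
    http://www.example.net/private/
-->
```

After a test crawl, the file named in surtsDumpFile (exclude.dump here) should show the SURT prefixes that were actually derived from exclude.txt, which is a convenient way to check that domains and URLs were converted as intended.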

Bert



From:    "Peter Svanberg" <Peter.Svanberg at kb.se>
To:      "netarchivesuite-users at ml.sbforge.org" <netarchivesuite-users at ml.sbforge.org>
Date:    19/10/2023 21:11
Subject: [Netarchivesuite-users] File to exclude domains?
Sent by: "NetarchiveSuite-users" <netarchivesuite-users-bounces at ml.sbforge.org>



I clearly recall Sara talking about a file you can create containing 
domain names to be excluded from a snapshot, but I can't find any info on 
that anywhere. (I did find NAS-1725, but not what was done with it.) Can 
someone remind me?
 
(I know you can configure with zeros but a list in a file would be 
easier.)
-----

Peter Svanberg
National Library of Sweden
_______________________________________________
NetarchiveSuite-users mailing list
NetarchiveSuite-users at ml.sbforge.org
https://ml.sbforge.org/mailman/listinfo/netarchivesuite-users

