[Netarchivesuite-users] File to exclude domains?
sara.aubry at bnf.fr
Fri Oct 20 09:29:23 CEST 2023
Hello Peter,
We use the Heritrix exclude.txt mechanism, which you can activate by adding
the following bean to your profile:
<bean id="rejectExcludedSurts"
      class="org.archive.modules.deciderules.surt.SurtPrefixedDecideRule">
  <!-- Decision value (ACCEPT, REJECT, NONE) -->
  <property name="decision" value="REJECT" />
  <property name="surtsSourceFile" value="/dlweb/data/nas/exclude.txt" />
  <property name="seedsAsSurtPrefixes" value="false" />
  <property name="alsoCheckVia" value="false" />
  <property name="surtsDumpFile" value="/dlweb/data/nas/exclude.dump" />
</bean>
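If you start from a plain list of domain names, a small script can turn it into SURT prefixes for the file named in surtsSourceFile. The following is only a sketch under assumptions: the function names are made up for illustration, and the exact line format your Heritrix version accepts in that file should be checked against the Heritrix documentation. It relies on the usual SURT convention that a prefix such as http://(com,example, covers example.com and all of its subdomains.

```python
def domain_to_surt_prefix(domain: str) -> str:
    """Turn a host name such as 'www.example.com' into 'http://(com,example,www,'.

    The trailing comma is left open so that the prefix also covers subdomains.
    """
    labels = [label for label in domain.strip().lower().rstrip(".").split(".") if label]
    if not labels:
        raise ValueError("empty domain name")
    # SURT form reverses the host labels and joins them with commas.
    return "http://(" + ",".join(reversed(labels)) + ","


def build_exclude_lines(domains):
    """Return sorted, de-duplicated SURT prefixes.

    Blank lines and lines starting with '#' in the input are skipped.
    """
    prefixes = set()
    for raw in domains:
        entry = raw.strip()
        if not entry or entry.startswith("#"):
            continue
        prefixes.add(domain_to_surt_prefix(entry))
    return sorted(prefixes)
```

Writing the result of build_exclude_lines, one prefix per line, to the path given in surtsSourceFile would then produce the exclude list.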
Best,
Sara
From: "Peter Svanberg" <Peter.Svanberg at kb.se>
To: "netarchivesuite-users at ml.sbforge.org"
<netarchivesuite-users at ml.sbforge.org>
Date: 19/10/2023 21:11
Subject: [Netarchivesuite-users] File to exclude domains?
Sent by: "NetarchiveSuite-users"
<netarchivesuite-users-bounces at ml.sbforge.org>
I have a definite recollection of Sara talking about a file you can create
containing domain names to be excluded from a snapshot, but I can't find
any information on that anywhere. (I did find NAS-1725, but not what was
done with it.) Can someone remind me?
(I know you can configure with zeros, but a list in a file would be
easier.)
-----
Peter Svanberg
National Library of Sweden
_______________________________________________
NetarchiveSuite-users mailing list
NetarchiveSuite-users at ml.sbforge.org
https://ml.sbforge.org/mailman/listinfo/netarchivesuite-users