[Netarchivesuite-users] Activating object count limit in harvest definitions, configurations and jobs

nicolas.giraud at bnf.fr nicolas.giraud at bnf.fr
Wed Aug 19 17:26:28 CEST 2009


Hi,

My current task at BnF is to allow the URL (object) count, rather than 
the data size, to be used as the domain budget. I have browsed the code 
and found that everything already works at the data model and DAO level. 

One of my concerns is to understand how this will affect the process of 
splitting a harvest definition into jobs. If I have understood things 
correctly, the critical code for this is located in the method 
dk.netarkivet.harvester.datamodel.Job#canAccept(DomainConfiguration). I 
would like a textual explanation of the calculations performed there, as 
I do not fully understand what happens just by reading the code. 
When URL count is used for the budget, should the size limit be set to -1 
(Constants.HERITRIX_MAXBYTES_INFINITY)?
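
To make the question concrete, here is a minimal Java sketch of the 
semantics I am assuming. It is not the actual NetarchiveSuite code: the 
names BudgetSketch, withinLimit and withinBudget are hypothetical, and 
only the value -1 is taken from Constants.HERITRIX_MAXBYTES_INFINITY.

```java
// Hypothetical sketch, NOT NetarchiveSuite code: all names are invented.
final class BudgetSketch {
    /** Sentinel meaning "no limit"; assumed to mirror the -1 value of
     *  Constants.HERITRIX_MAXBYTES_INFINITY mentioned above. */
    static final long NO_LIMIT = -1L;

    /** True if 'used' is within 'limit'; NO_LIMIT is treated as unbounded. */
    static boolean withinLimit(long used, long limit) {
        return limit == NO_LIMIT || used <= limit;
    }

    /** A domain stays within budget only if both the byte limit and the
     *  object (URL) limit hold. */
    static boolean withinBudget(long bytesUsed, long maxBytes,
                                long objectsUsed, long maxObjects) {
        return withinLimit(bytesUsed, maxBytes)
                && withinLimit(objectsUsed, maxObjects);
    }
}
```

Under this assumption, setting the byte limit to -1 and the object limit 
to 5000 would mean that only the URL count ever stops the harvest of a 
domain, whatever the amount of data downloaded.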

My next concern is to insert the proper configuration in order.xml, but 
before asking for more information about this, I need to read some 
documentation ;)

Cheers,
Nicolas




