[Netarchivesuite-devel] RE SV: ? The steps to sanitest the feature request 1689 - Managing crawls using object number

Tue Larsen tlr at kb.dk
Wed Sep 16 18:02:39 CEST 2009


Hi Nicolas

Thanks for the sanity tests! You should put them in the tracker next time.

Could someone please give Nicolas permission to write to the list
netarchivesuite-devel at lists.gforge.statsbiblioteket.dk? He says he is not allowed to do that...

Best regards

Tue


______________________________________
From: nicolas.giraud at bnf.fr [nicolas.giraud at bnf.fr]
Sent: 16 September 2009 17:16
To: Tue Larsen
Subject: RE SV: [Netarchivesuite-devel] ? The steps to sanitest the feature request 1689 - Managing crawls using object number

Hi Tue,

I have performed sanity tests on my dev environment. Here are the steps on a fresh install of NAS:

Sanity test 1: snapshot harvest

- add a set of domains
- configure the object limit in each domain's default configuration. On my dev setup I added 8 domains and made two groups, one with a limit of 100 objects and one of newspaper domains with a 200-object limit, plus an "outsider" with a 100-object limit.
- define a first snapshot harvest, with no byte limit and an object limit that is lower than the smallest domain config limit you set up (I started at 50).
- activate this harvest and let it finish.
- Verify that, once the harvest is complete, the stop reason for each domain is one of:
        - "Domain completed" with a number of harvested documents that is lower than the snapshot limit (in my test < 50)
        - "Max object limit reached" with a number of harvested documents that is equal to the snapshot limit (in my test 50)
- Verify that the QuotaEnforcer parameter "group-max-fetch-success" is set to the proper limit value in the order.xml files from the metadata arc
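
The order.xml check in the last step could be scripted roughly as below. This is only a sketch: the XML fragment is a simplified, hypothetical stand-in for a real order.xml (which is much larger), the helper name quota_enforcer_value is made up for illustration, and using -1 for "no limit" is an assumption. The parameter names are the ones quoted in this message.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical order.xml fragment for illustration only.
# The value -1 is assumed here to stand for "no limit".
SAMPLE_ORDER_XML = """\
<crawl-order>
  <controller>
    <map name="pre-fetch-processors">
      <newObject name="QuotaEnforcer">
        <long name="group-max-fetch-success">50</long>
        <long name="group-max-all-kb">-1</long>
      </newObject>
    </map>
  </controller>
</crawl-order>
"""


def quota_enforcer_value(order_xml_text, parameter):
    """Return the integer value of a QuotaEnforcer parameter, or None if absent."""
    root = ET.fromstring(order_xml_text)
    # Look at every <newObject> and keep only the one named QuotaEnforcer.
    for obj in root.iter("newObject"):
        if obj.get("name") != "QuotaEnforcer":
            continue
        # Settings are child elements identified by their "name" attribute.
        for child in obj:
            if child.get("name") == parameter:
                return int(child.text.strip())
    return None


object_limit = quota_enforcer_value(SAMPLE_ORDER_XML, "group-max-fetch-success")
print("object limit in order.xml:", object_limit)
```

The returned value can then be compared with the object limit configured for the harvest (50 in my first test).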

Sanity test 2: incremental snapshot harvest

- define a new snapshot harvest, with no byte limit and an object limit that is higher than the highest domain config limit you set (in my case 500). Make this harvest incremental by having it harvest only domains not completed in your initial harvest
- Verify that, once the harvest is complete, the stop reason for each domain is one of:
        - "Domain completed" with a number of harvested documents that is lower than the snapshot limit (in my test < 500)
        - "Max object limit reached" with a number of harvested documents that is equal to the snapshot limit (in my test 500)
        - "Domain-config object limit reached" with a number of harvested documents that is equal to the default domain configuration limit. This stop reason might be tricky to observe, because deduplication yields "Domain completed" more often on consecutive crawls.
- Verify that the QuotaEnforcer parameter "group-max-fetch-success" is set to the proper limit value in the order.xml files from the metadata arc
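
The stop-reason rules listed for this test can be written down as a small check, which may help when going through many domains. This is a sketch only: the function name and its arguments are hypothetical, and the stop reasons and counts would have to be copied by hand from the harvest status pages, since no NetarchiveSuite API is used here.

```python
def stop_reason_is_consistent(reason, harvested, snapshot_limit, domain_limit):
    """Check one domain's result against the three accepted outcomes."""
    if reason == "Domain completed":
        # Finished on its own, below the snapshot harvest limit.
        return harvested < snapshot_limit
    if reason == "Max object limit reached":
        # Stopped by the snapshot harvest's own object limit.
        return harvested == snapshot_limit
    if reason == "Domain-config object limit reached":
        # Stopped by the limit in the domain's default configuration.
        return harvested == domain_limit
    # Any other stop reason is unexpected in this test.
    return False


# Example with the limits from my setup: snapshot limit 500, domain limit 200.
print(stop_reason_is_consistent("Domain-config object limit reached", 200, 500, 200))
print(stop_reason_is_consistent("Max object limit reached", 499, 500, 200))
```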

Sanity test 3: selective harvest

- pick a domain and create a new configuration for it, with an object limit.
- create a new selective harvest, add the domain and select the newly created config.
- activate the harvest and let it complete
- Verify that the stop reason for the selected domain is "Domain-config object limit reached" with a number of harvested documents that is equal to the selected domain configuration limit.
- Verify that the QuotaEnforcer parameter "group-max-fetch-success" is set to the proper limit value in the order.xml file from the metadata arc

Sanity test 4: combination of object and byte limit

- pick a domain and create a new configuration for it, with an object limit and a low byte limit (for instance 100 KB and 1000 objects)
- create a new selective harvest, add the domain and select the newly created config.
- activate the harvest and let it complete
- Verify that the stop reason for the selected domain is "Domain-config byte limit reached" with a byte size that is equal to the selected domain configuration limit.
- Verify that the QuotaEnforcer "group-max-fetch-success" and "group-max-all-kb" parameters are set to the proper limit values in the order.xml file from the metadata arc


- pick a domain and create a new configuration for it, with a small object limit and a high byte limit (for instance 10 MB and 10 objects)
- create a new selective harvest, add the domain and select the newly created config.
- activate the harvest and let it complete
- Verify that the stop reason for the selected domain is "Domain-config object limit reached" with a number of harvested documents that is equal to the selected domain configuration limit.
- Verify that the QuotaEnforcer "group-max-fetch-success" and "group-max-all-kb" parameters are set to the proper limit values in the order.xml file from the metadata arc


Do we have a place in the wiki to document sanity tests, or should I put this in the tracker?

Best regards,

Nicolas

PS: can you please forward this message to netarchivesuite-devel at lists.gforge.statsbiblioteket.dk, because I'm not allowed to write to this list and get bounced. Thanks :)





