[Netarchivesuite-devel] BnF ready to commit changes
nicolas.giraud at bnf.fr
Thu Apr 15 17:34:45 CEST 2010
Hi Søren,
The changes have been committed. Here's a quick breakdown of the new
features:
- PostgreSQL connectivity (using version 8.4 of the PostgreSQL JDBC 4
driver)
- New sort criteria and results pagination on the Harvest status main page
- New "Running jobs" page that monitors the jobs being crawled and lets you
search for the active jobs that are crawling a specific domain (in case the
webmaster is complaining ;) )
- HeritrixLauncher is now abstract and instantiated through a factory
method. This makes it possible to plug in different implementations of the
crawl control loop. DefaultHeritrixLauncher is the default legacy
implementation, and I have introduced a slightly different implementation
for BnF, which comes with an optimized Heritrix JMX controller that solves
the connection loss issue (BnFHeritrixController). I didn't want to impact
the legacy implementation, so I made all of this pluggable, leaving you
guys to judge whether this implementation interests you too.
- A configurable wait period after ending a crawl and before sending the
shutdown command to Heritrix, to allow the report generation to complete.
- An updated French translation
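
To make the pluggable launcher idea concrete, here is a minimal sketch of an abstract launcher obtained through a factory method. It is not the committed code: every class, method and key name below (LauncherFactorySketch, Launcher, getInstance, "default", "alternative") is a hypothetical stand-in, and the real HeritrixLauncher may well select its implementation differently, for instance from a settings file.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Illustrative only: none of these names come from the NetarchiveSuite code base. */
public class LauncherFactorySketch {

    /** Abstract launcher: each subclass supplies its own crawl control loop. */
    public abstract static class Launcher {
        /** Runs the control loop and returns a short description of what ran. */
        public abstract String doCrawl();
    }

    /** Stand-in for the legacy behaviour. */
    public static class DefaultLauncher extends Launcher {
        @Override
        public String doCrawl() {
            return "default control loop";
        }
    }

    /** Stand-in for a site-specific variant. */
    public static class AlternativeLauncher extends Launcher {
        @Override
        public String doCrawl() {
            return "alternative control loop";
        }
    }

    // Hypothetical registry mapping a configuration key to a constructor.
    private static final Map<String, Supplier<Launcher>> REGISTRY = new HashMap<>();
    static {
        REGISTRY.put("default", DefaultLauncher::new);
        REGISTRY.put("alternative", AlternativeLauncher::new);
    }

    /** Factory method: falls back to the default when the key is null or unknown. */
    public static Launcher getInstance(String key) {
        Supplier<Launcher> supplier = REGISTRY.get(key);
        if (supplier == null) {
            supplier = REGISTRY.get("default");
        }
        return supplier.get();
    }

    public static void main(String[] args) {
        System.out.println(getInstance("alternative").doCrawl());
    }
}
```

Callers only ever see the abstract type, so the legacy implementation stays untouched when another one is registered.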
And things to fix/enhance:
- New strings to translate in the harvester translation.properties files;
I've only translated them into French.
- The pagination mechanism on the Harvest status main page relies on the
LIMIT and OFFSET syntax. Although it is not standard SQL, this syntax is
supported by many DB systems, MySQL and PostgreSQL in particular.
Unfortunately Derby does not support it (cf.
http://db.apache.org/derby/faq.html#limit), so this feature is currently
broken if the installation uses a Derby DB.
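
One possible direction for the Derby problem is to build the pagination clause per database, as in the rough sketch below. It is only an illustration: the class, enum and method names are hypothetical, and so are the table and column names in the usage note. The Derby branch assumes the SQL:2008 OFFSET ... ROWS FETCH NEXT ... ROWS ONLY form, which to my knowledge only exists in Derby 10.5 and later, so it would have to be checked against the version actually deployed.

```java
/** Illustrative only: not part of the committed pagination code. */
public class PaginationSketch {

    /** Hypothetical marker for the target database. */
    public enum Dialect { MYSQL, POSTGRESQL, DERBY }

    /**
     * Appends a pagination clause to a query that already has an ORDER BY.
     * Only int values are concatenated, so no user text reaches the SQL string.
     *
     * @param baseQuery query without any pagination clause
     * @param dialect   target database
     * @param pageSize  rows per page, must be positive
     * @param pageIndex zero-based page number
     */
    public static String paginate(String baseQuery, Dialect dialect,
                                  int pageSize, int pageIndex) {
        if (pageSize <= 0 || pageIndex < 0) {
            throw new IllegalArgumentException("invalid page size or index");
        }
        long offset = (long) pageSize * pageIndex;
        switch (dialect) {
            case MYSQL:
            case POSTGRESQL:
                // Non-standard but widely supported syntax.
                return baseQuery + " LIMIT " + pageSize + " OFFSET " + offset;
            case DERBY:
                // Assumption: SQL:2008 form, believed available from Derby 10.5.
                return baseQuery + " OFFSET " + offset + " ROWS FETCH NEXT "
                        + pageSize + " ROWS ONLY";
            default:
                throw new IllegalArgumentException("unsupported dialect: " + dialect);
        }
    }

    public static void main(String[] args) {
        String query = "SELECT * FROM jobs ORDER BY job_id"; // hypothetical table
        System.out.println(paginate(query, Dialect.POSTGRESQL, 25, 2));
        System.out.println(paginate(query, Dialect.DERBY, 25, 2));
    }
}
```

If the deployed Derby turns out to be older than that, a fallback would be to skip rows on the JDBC side instead, at the cost of reading the skipped rows from the result set.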
We should probably do a series of reviews when I come back from paternity
leave on Monday, May 24th.
Best,
Nicolas Giraud
Consider the environment before printing this mail.