[Netarchivesuite-users] Regular expression (in Java) slowness fixed?

Peter Svanberg Peter.Svanberg at kb.se
Mon Jul 8 17:43:25 CEST 2019


Hello!

In Madrid I heard that Java regular expression handling had been slow and that this problem has been solved in some way.


* In which NAS version was it fixed? (I searched the release notes in vain.)

* How?

* I read that using non-capturing groups ("(?:foo|bar)" instead of "(foo|bar)") could save time and memory in intensive regex handling. Has anyone considered that, or any other type of optimization? Or is regex checking (in the fixed version) a negligible part of the crawling time for a URI, even if there are hundreds of crawler trap regexes?
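
To make the last question concrete, here is a minimal Java sketch of the two group forms. The class name, the trap expressions and the example URIs are all hypothetical and not taken from NetarchiveSuite or Heritrix. It only shows that the non-capturing form matches the same URIs while recording no group, and that patterns can be compiled once and reused for every URI. It does not measure whether the difference matters in practice.

```java
import java.util.regex.Pattern;

class TrapRegexSketch {
    // Hypothetical crawler trap expressions, for illustration only.
    static final Pattern CAPTURING = Pattern.compile(".*/(foo|bar)/.*");
    static final Pattern NON_CAPTURING = Pattern.compile(".*/(?:foo|bar)/.*");

    // Compile each expression once and reuse it for every URI,
    // instead of calling String.matches(), which recompiles each time.
    static final Pattern[] TRAPS = {
        NON_CAPTURING,
        Pattern.compile(".*\\?sessionid=.*")
    };

    // Returns true if the whole URI matches any of the trap expressions.
    static boolean isTrap(String uri) {
        for (Pattern p : TRAPS) {
            if (p.matcher(uri).matches()) {
                return true;
            }
        }
        return false;
    }

    // Number of capturing groups the matcher has to keep track of.
    static int groupCount(Pattern p) {
        return p.matcher("").groupCount();
    }

    public static void main(String[] args) {
        System.out.println(groupCount(CAPTURING));
        System.out.println(groupCount(NON_CAPTURING));
        System.out.println(isTrap("http://example.org/foo/page"));
        System.out.println(isTrap("http://example.org/baz/page"));
    }
}
```

Whether this saves anything noticeable with hundreds of expressions per URI would have to be measured on real crawl data.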

Regards,

-----

Peter Svanberg

National Library of Sweden
Phone: +46 10 709 32 78

E-mail: peter.svanberg at kb.se
Web: www.kb.se




