[Netarchivesuite-users] Regular expression (in Java) slowness fixed?
Peter Svanberg
Peter.Svanberg at kb.se
Mon Jul 8 17:43:25 CEST 2019
Hello!
In Madrid I heard that Java regular expression handling was slow and that this problem had been solved in some way.
* In which NAS version was it fixed? (I searched in vain in release notes.)
* How?
* I read that using non-capturing groups ("(?:foo|bar)" instead of "(foo|bar)") can save time and memory in intensive regex handling. Has anyone considered that, or some other type of optimization? Or is regex checking (in the fixed version) a negligible part of the crawl time for a URI, even with hundreds of crawler trap regexes?
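
To make the non-capturing question concrete, here is a minimal sketch in plain java.util.regex. It is not taken from NAS or Heritrix code: the class name, the two trap patterns and the example URIs are made up for illustration. It only shows that "(?:...)" leaves the matcher with no groups to record, and that patterns can be compiled once and reused for every URI. Whether either makes a measurable difference in a real crawl is exactly what I am asking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not NAS or Heritrix code.
final class CrawlerTrapRegexSketch {

    // Made-up crawler trap regexes, written with non-capturing groups.
    static final List<String> TRAP_REGEXES = Arrays.asList(
            ".*/(?:calendar|kalender)/.*",
            ".*[?&](?:sessionid|sid)=.*");

    // Compile each pattern once instead of once per URI.
    static final List<Pattern> COMPILED = compileAll(TRAP_REGEXES);

    static List<Pattern> compileAll(List<String> regexes) {
        List<Pattern> patterns = new ArrayList<>();
        for (String regex : regexes) {
            patterns.add(Pattern.compile(regex));
        }
        return patterns;
    }

    // Number of capturing groups the matcher has to keep track of.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // True if the whole URI matches any of the precompiled trap patterns.
    static boolean isTrap(String uri) {
        for (Pattern pattern : COMPILED) {
            if (pattern.matcher(uri).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("capturing groups in (foo|bar):   " + groupCount("(foo|bar)"));
        System.out.println("capturing groups in (?:foo|bar): " + groupCount("(?:foo|bar)"));
        System.out.println(isTrap("http://example.org/calendar/2019/07"));
        System.out.println(isTrap("http://example.org/about.html"));
    }
}
```

Both forms accept the same strings; the only difference shown here is the group bookkeeping, so any saving would have to be confirmed by measuring with the real trap lists.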
Regards,
-----
Peter Svanberg
National Library of Sweden
Phone: +46 10 709 32 78
E-mail: peter.svanberg at kb.se
Web: www.kb.se