IMPORT util

DEFINE s STRING
DEFINE r util.regexp
DEFINE result BOOLEAN

CONSTANT TEST_PATTERN = `^-?(0|[1-9]\\d*)(\\.\\d+)?$`

MAIN
    DEFINE i INTEGER

    LET s = "FOO100"
    LET r = util.Regexp.compile(TEST_PATTERN)

    FOR i = 1 TO 10000
        CALL test_string()
        CALL test_regexp()
    END FOR
END MAIN

FUNCTION test_string()
    LET result = s.matches(TEST_PATTERN)
END FUNCTION

FUNCTION test_regexp()
    LET result = r.matches(s)
END FUNCTION
… you will get results similar to …
Flat profile (order by self)
 count  %total  %child  %self  name
 10000    82.3     0.0   82.3  base.String.matches
 10000    87.2    82.3    4.9  regexp_performance.test_string
     1   100.0    95.7    4.3  regexp_performance.main
 10000     8.4     4.2    4.2  regexp_performance.test_regexp
 10000     4.2     0.0    4.2  util.Regexp.matches
     1     0.1     0.0    0.1  util.Regexp.compile
     1     0.0     0.0    0.0
The important thing to note is how much time was spent in the two functions, test_string() and test_regexp(). The less time, the better.
The %total column shows that far less time is spent in test_regexp(), which uses the util.Regexp class: 8.4% of the run, compared with 87.2% for test_string().
Why is this? With string.matches, both the compile step and the match step are carried out on every iteration. With the Regexp class, the compile step is carried out only once, outside the loop, and only the match step is carried out on each iteration. This is very similar to database queries, where using a cursor means the “how am I going to do it” calculation is carried out once rather than each time the SQL statement is executed.
So the moral of the story: any time you are going to match the same regular expression pattern many times, investigate using the util.Regexp class so that the pattern is only compiled once.
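As a rough illustration of that advice, here is a minimal sketch that validates a small list of strings against one precompiled pattern. It uses only the util.Regexp.compile and matches calls from the test program above; the names NUMBER_PATTERN, samples and ok are hypothetical and chosen purely for this example.

```4gl
IMPORT util

# Same pattern as TEST_PATTERN in the test program above
CONSTANT NUMBER_PATTERN = `^-?(0|[1-9]\\d*)(\\.\\d+)?$`

MAIN
    DEFINE re util.Regexp
    DEFINE samples DYNAMIC ARRAY OF STRING  # hypothetical test data
    DEFINE ok BOOLEAN
    DEFINE i INTEGER

    LET samples[1] = "100"
    LET samples[2] = "-3.14"
    LET samples[3] = "FOO100"

    # Compile once, before the loop
    LET re = util.Regexp.compile(NUMBER_PATTERN)

    # Only the match step runs on each iteration
    FOR i = 1 TO samples.getLength()
        LET ok = re.matches(samples[i])
        IF ok THEN
            DISPLAY samples[i], " matches"
        ELSE
            DISPLAY samples[i], " does not match"
        END IF
    END FOR
END MAIN
```

With only three strings the saving would be negligible; the pattern pays off when the loop runs thousands of times, as in the profile above.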
This also highlights how you can use the profiler to examine the performance of different coding techniques. Create a simple test and see how much relative time is spent in each function.