PCREPERFORM(3) | Library Functions Manual | PCREPERFORM(3) |
NAME
PCRE - Perl-compatible regular expressions
PCRE PERFORMANCE
Two aspects of performance are discussed below: memory usage and processing time. The way you express your pattern as a regular expression can affect both of them.
COMPILED PATTERN MEMORY USAGE
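Patterns are compiled by PCRE into a reasonably efficient interpretive code, so that most simple patterns do not use much memory. However, there is one case where the memory usage of a compiled pattern can be unexpectedly large. If a parenthesized subpattern has a quantifier with a minimum greater than 1 and/or a limited maximum, the whole subpattern is repeated in the compiled code. For example, the pattern
(abc|def){2,4}
is compiled as if it were
(abc|def)(abc|def)((abc|def)(abc|def)?)?
For quantifiers that use only small numbers this is not usually a problem. However, if the numbers are large, and particularly if such repetitions are nested, the compiled pattern can become very large, as with
((ab){1,1000}c){1,3}
In some cases the problem can be overcome by rewriting the pattern with subroutine calls to the groups:
((ab)(?2){0,999}c)(?1){0,2}
The rewritten pattern is not exactly equivalent, because the subroutine calls are treated as atomic groups, so there can be no backtracking into them if a later part of the match fails.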
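As a rough illustration of the expansion described above, the following sketch checks that the quantified pattern and its expanded form accept the same strings. It uses Python's re module as a stand-in, which is an assumption made for convenience: it is not PCRE and shows nothing about PCRE's compiled code or memory usage. The helper name same_matches is hypothetical.

```python
import itertools
import re

# The quantified pattern and the expanded form given in the text.
QUANTIFIED = re.compile(r"(abc|def){2,4}")
EXPANDED = re.compile(r"(abc|def)(abc|def)((abc|def)(abc|def)?)?")


def same_matches(max_units=5):
    """Return True if both patterns accept exactly the same test strings.

    Test strings are built from 0..max_units three-letter units; "abd" is
    included so that some candidates cannot match at all.
    """
    for n in range(max_units + 1):
        for parts in itertools.product(["abc", "def", "abd"], repeat=n):
            subject = "".join(parts)
            quantified_ok = QUANTIFIED.fullmatch(subject) is not None
            expanded_ok = EXPANDED.fullmatch(subject) is not None
            if quantified_ok != expanded_ok:
                return False
    return True


if __name__ == "__main__":
    print(same_matches())
```

The check is exhaustive only over the small set of strings it generates, so it is evidence of equivalence rather than a proof.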
STACK USAGE AT RUN TIME
When pcre_exec() or pcre[16|32]_exec() is used for matching, certain kinds of pattern can cause it to use large amounts of the process stack. In some environments the default process stack is quite small, and if it runs out the result is often SIGSEGV. This issue is probably the most frequently raised problem with PCRE. Rewriting your pattern can often help. The pcrestack documentation discusses this issue in detail.
PROCESSING TIME
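Certain items in regular expression patterns are processed more efficiently than others. It is more efficient to use a character class like [aeiou] than a set of single-character alternatives such as (a|e|i|o|u). In general, the simplest construction that provides the required behaviour is usually the most efficient. Jeffrey Friedl's book contains a lot of useful general discussion about optimizing regular expressions for efficient performance. This document contains a few observations about PCRE.
When a pattern begins with .* and the PCRE_DOTALL option is not set, PCRE cannot implicitly anchor the pattern, because . does not then match a newline and the match may start immediately after any newline in the subject. For example, the pattern
.*second
matches the subject "first\nand second" (where \n stands for a newline character), with the match starting at the seventh character.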
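The effect of newlines on a leading .* can be seen with a small experiment on the subject "first\nand second". The sketch below uses Python's re module as a stand-in for PCRE, which is an assumption: its re.DOTALL flag plays the role that PCRE_DOTALL plays in PCRE. The helper match_start is hypothetical.

```python
import re

SUBJECT = "first\nand second"


def match_start(pattern, subject, flags=0):
    """Return the 0-based offset where the first match starts, or None."""
    m = re.search(pattern, subject, flags)
    return None if m is None else m.start()


# Without DOTALL, "." does not match the newline, so the match can only
# begin after it: offset 6, that is, the seventh character.
start_default = match_start(r".*second", SUBJECT)

# With DOTALL, ".*" can span the newline and the match begins at offset 0.
start_dotall = match_start(r".*second", SUBJECT, re.DOTALL)

if __name__ == "__main__":
    print(start_default, start_dotall)
```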
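Beware of patterns that contain nested indefinite repeats. These can take a long time to run when applied to a string that does not match. Consider the pattern fragment
^(a+)*
This can match "aaaa" in 16 different ways, and the number increases very rapidly as the string gets longer. An optimization catches some of the simpler cases, such as
(a+)*b
where a literal character follows: PCRE first checks that there is a "b" later in the subject string, and fails immediately if there is not. When there is no following literal, as in
(a+)*\d
this optimization cannot be used.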
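To see why nested indefinite repeats are costly, it helps to count the ways ^(a+)* can carve up a run of a's. The sketch below is plain Python arithmetic, not PCRE's matcher. The hypothetical helper count_ways counts, for a subject of n a's, every way to split a prefix of the subject into non-empty a+ pieces, including the empty match.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def splits(length):
    """Number of ways to cut `length` characters into non-empty pieces, in order."""
    if length == 0:
        return 1  # only the empty split
    # The first piece takes k characters; the remainder is split recursively.
    return sum(splits(length - k) for k in range(1, length + 1))


def count_ways(n):
    """Ways ^(a+)* can match at the start of a string of n a's."""
    # The repeated group may cover any prefix of length 0..n.
    return sum(splits(prefix) for prefix in range(n + 1))


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, count_ways(n))
```

Under this counting each additional a doubles the total, which suggests why a match that is bound to fail can take so long even on fairly short subjects.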
09 January 2012 | PCRE 8.30 |