Perl Compatible Regular Expressions

[3] PCRE's syntax is much more powerful and flexible than either of the POSIX regular expression flavors (BRE, ERE)[4] and that of many other regular-expression libraries.

While PCRE originally aimed at feature-equivalence with Perl, the two implementations are not fully equivalent.

During the PCRE 7.x and Perl 5.9.x phase, the two projects coordinated development, with features being ported between them in both directions.

The original software, now called PCRE1 (the 1.xx–8.xx series), has had bugs fixed but has seen no further development.

A number of prominent open-source programs, such as the Apache and Nginx HTTP servers, and the PHP and R scripting languages, incorporate the PCRE library; proprietary software can do likewise, as the library is BSD-licensed.

Large performance benefits are possible when, for example, the calling program uses the just-in-time compilation feature with compatible patterns that are executed repeatedly.

The just-in-time compiler support was written by Zoltan Herczeg and is not addressed in the POSIX wrapper.

The use of the system stack for backtracking can be problematic in PCRE1, which is why this feature of the implementation was changed in PCRE2.

Single-letter character classes are supported in addition to the longer POSIX names.

Matching of certain "normal" metacharacters can be driven by Unicode properties when the compile option PCRE2_UCP is set.

The option alters the behavior of the following metacharacters: \B, \b, \D, \d, \S, \s, \W, \w, and some of the POSIX character classes.

Note that the UCP option requires the library to have been built to include Unicode support (this is the default for PCRE2).
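As a rough illustration of what Unicode-property-driven matching changes, the sketch below uses Python's built-in re module rather than PCRE itself. This is an analogy and not the PCRE API: it assumes that Python's re.ASCII flag, which restricts \d and \w to ASCII, roughly resembles PCRE2 without PCRE2_UCP, while Python's default behavior for text patterns roughly resembles having the option set.

```python
import re

# Sample text mixing ASCII digits with Arabic-Indic digits (U+0664, U+0662).
digits_text = "42 and \u0664\u0662"
# Sample word containing a non-ASCII letter (U+00EF, i with diaeresis).
word_text = "na\u00efve"

# ASCII-only matching: assumed analogue of PCRE2 without PCRE2_UCP.
ascii_digits = re.findall(r"\d+", digits_text, flags=re.ASCII)
ascii_words = re.findall(r"\w+", word_text, flags=re.ASCII)

# Unicode-aware matching: assumed analogue of PCRE2 with PCRE2_UCP set.
unicode_digits = re.findall(r"\d+", digits_text)
unicode_words = re.findall(r"\w+", word_text)

print(ascii_digits, unicode_digits)
print(ascii_words, unicode_words)
```

In the ASCII-only runs the non-ASCII digits are skipped and the word is split at the accented letter; in the Unicode-aware runs both are matched as a whole.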

Some applications using PCRE provide users with the means to apply this setting through an external option.

The newline option can also be stated at the start of the pattern using one of the following:

When not in UTF-8 mode, corresponding linebreaks can be matched with (?

Named subpatterns are a feature that PCRE adopted from Python regular expressions.
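Because the feature originated in Python, the Python syntax (?P<name>...) can be shown directly with Python's re module. The sketch below is a minimal example; the group names and sample strings are made up for illustration.

```python
import re

# (?P<name>...) defines a named subpattern; (?P=name) is a named backreference.
date_pattern = re.compile(r"(?P<year>\d{4})-(?P<month>\d{2})")
repeat_pattern = re.compile(r"(?P<word>\w+) (?P=word)")

date_match = date_pattern.match("2024-07")
year = date_match.group("year")       # look a group up by name
fields = date_match.groupdict()       # all named groups as a dict

repeat_match = repeat_pattern.search("say hello hello again")
repeated_word = repeat_match.group("word")

print(year, fields, repeated_word)
```

Referring to groups by name keeps the calling code stable when further groups are later added to the pattern.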

While a backreference provides a mechanism to refer to that part of the subject that has previously matched a subpattern, a subroutine provides a mechanism to reuse an underlying previously defined subpattern.
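The difference can be sketched in Python, with one caveat: Python's re module has backreferences but no subroutine calls, so the subroutine case is emulated here by textually repeating the subpattern. In PCRE the same reuse would be written with a subroutine call such as (?1). The variable names are hypothetical.

```python
import re

# Backreference: \1 must match the same TEXT that group 1 matched.
backref = re.compile(r"(sens|respons)e and \1ibility")

# Emulated subroutine: the same PATTERN is applied again, so the second
# occurrence may match different text than the first.
stem = r"(?:sens|respons)"
subroutine_like = re.compile(rf"{stem}e and {stem}ibility")

phrases = [
    "sense and sensibility",
    "response and responsibility",
    "sense and responsibility",
]

backref_hits = [p for p in phrases if backref.fullmatch(p)]
subroutine_hits = [p for p in phrases if subroutine_like.fullmatch(p)]

print(backref_hits)
print(subroutine_hits)
```

Only the pattern-reuse version accepts "sense and responsibility", because the backreference insists on repeating the exact text matched earlier.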

For example, the pattern \((a*|(?R))*\) will match any combination of balanced parentheses and "a"s.

PCRE expressions can embed (?Cn), where n is some number.

This will call out to an external user-defined function through the PCRE API and can be used to embed arbitrary code in a pattern.
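Returning to the recursive pattern \((a*|(?R))*\) quoted earlier, its meaning can be read as a small recursive-descent recognizer. The sketch below is hand-written Python, not PCRE, and the function names are hypothetical; it treats each (?R) as a recursive call and checks whether a whole string is one balanced group.

```python
def match_group(s, i=0):
    """Return the index just past one balanced group starting at s[i], or None."""
    if i >= len(s) or s[i] != "(":
        return None
    i += 1  # consume the opening parenthesis
    while i < len(s):
        if s[i] == "a":
            i += 1                      # the a* alternative
        elif s[i] == "(":
            end = match_group(s, i)     # the (?R) alternative: recurse
            if end is None:
                return None
            i = end
        else:
            break
    if i < len(s) and s[i] == ")":
        return i + 1                    # consume the closing parenthesis
    return None


def is_balanced(s):
    """True if the whole string is a single balanced group of parentheses and a's."""
    return match_group(s, 0) == len(s)


print(is_balanced("(aa(a)())"), is_balanced("(a(a)"))
```

Each nested opening parenthesis triggers one recursive call, mirroring how the pattern re-enters itself.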

[7] Within lookbehind assertions, both PCRE and Perl require fixed-length patterns.

That is, both PCRE and Perl disallow variable-length patterns using quantifiers within lookbehind assertions.
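A similar restriction can be observed in Python's re module, which is a separate engine and is used here only as an analogy for the behavior described above. In this sketch the helper name and the sample patterns are assumptions made for illustration.

```python
import re


def compiles(pattern):
    """Return True if the pattern compiles, False if the engine rejects it."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False


# Fixed-length lookbehind: always exactly three characters.
fixed = r"(?<=USD)\d+"
# Variable-length lookbehind: the * quantifier makes the length unbounded.
variable = r"(?<=USD\s*)\d+"

fixed_ok = compiles(fixed)
variable_ok = compiles(variable)

amounts = re.findall(fixed, "USD100 EUR200 USD300")

print(fixed_ok, variable_ok, amounts)
```

The quantified lookbehind is rejected at compile time, before any subject string is examined.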

Support for experimental backtracking control verbs (added in Perl 5.10) has been available in PCRE since version 7.3.

Perl's corresponding use of arguments with backtracking control verbs is not generally supported.

construct, which is meaningless but harmless (albeit inefficient); PCRE produces an error in versions before 8.13.