This file contains the PCRE man page that describes the regular expressions
-supported by PCRE version 6.0. Note that not all of the features are relevant
+supported by PCRE version 6.2. Note that not all of the features are relevant
in the context of Exim. In particular, the version of PCRE that is compiled
with Exim does not include UTF-8 support, there is no mechanism for changing
the options with which the PCRE functions are called, and features such as
callout are not accessible.
-----------------------------------------------------------------------------
+PCREPATTERN(3) PCREPATTERN(3)
NAME
PCRETEST(1) PCRETEST(1)
-
NAME
pcretest - a program for testing Perl-compatible regular expressions.
+
SYNOPSIS
pcretest [-C] [-d] [-dfa] [-i] [-m] [-o osize] [-p] [-t] [source]
ChangeLog for PCRE
------------------
+Version 6.2 01-Aug-05
+---------------------
+
+ 1. There was no test for integer overflow of quantifier values. A construction
+ such as {1111111111111111} would give undefined results. What is worse, if
+ a minimum quantifier for a parenthesized subpattern overflowed and became
+ negative, the calculation of the memory size went wrong. This could have
+ led to memory overwriting.
+
+ 2. Building PCRE using VPATH was broken. Hopefully it is now fixed.
+
+ 3. Added "b" to the 2nd argument of fopen() in dftables.c, for non-Unix-like
+ operating environments where this matters.
+
+ 4. Applied Giuseppe Maxia's patch to add additional features for controlling
+ PCRE options from within the C++ wrapper.
+
+ 5. Named capturing subpatterns were not being correctly counted when a pattern
+ was compiled. This caused two problems: (a) If there were more than 100
+ such subpatterns, the calculation of the memory needed for the whole
+ compiled pattern went wrong, leading to an overflow error. (b) Numerical
+ back references of the form \12, where the number was greater than 9, were
+ not recognized as back references, even though there were sufficient
+ previous subpatterns.
+
+ 6. Two minor patches to pcrecpp.cc in order to allow it to compile on older
+ versions of gcc, e.g. 2.95.4.
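
Item 5 above turns on how capturing parentheses are counted before compilation. The following is only a rough sketch of that counting rule, not the PCRE code: count_capturing() and its no_auto_capture argument are hypothetical names, and the scan deliberately ignores \Q...\E quoting, (?#...) comments and a literal ']' at the start of a character class.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not part of PCRE): counts the capturing groups in a
pattern. Plain "(" captures unless no_auto_capture is non-zero; "(?P<name>"
always captures; every other "(?" form is treated as non-capturing. */

int count_capturing(const char *pat, int no_auto_capture)
{
int count = 0;
int in_class = 0;
const char *p;

assert(pat != NULL);

for (p = pat; *p != 0; p++)
  {
  if (*p == '\\')                 /* Skip the escaped character, if any */
    {
    if (p[1] != 0) p++;
    continue;
    }
  if (in_class)                   /* Parentheses are literal inside [...] */
    {
    if (*p == ']') in_class = 0;
    continue;
    }
  if (*p == '[') { in_class = 1; continue; }
  if (*p != '(') continue;

  if (p[1] != '?')
    {
    if (!no_auto_capture) count++;          /* Ordinary parentheses */
    }
  else if (strncmp(p + 2, "P<", 2) == 0)
    count++;                                /* Named: always capturing */
  }

return count;
}
```

With a count like this, a sequence such as \12 can be treated as a back reference only when the count is at least 12; whether PCRE makes that decision in exactly this way is not something this sketch claims.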
+
+
+Version 6.1 21-Jun-05
+---------------------
+
+ 1. There was one reference to the variable "posix" in pcretest.c that was not
+ surrounded by "#if !defined NOPOSIX".
+
+ 2. Make it possible to compile pcretest without DFA support, UTF8 support, or
+ the cross-check on the old pcre_info() function, for the benefit of the
+ cut-down version of PCRE that is currently imported into Exim.
+
+ 3. A (silly) pattern starting with (?i)(?-i) caused an internal space
+ allocation error. I've done the easy fix, which wastes 2 bytes for sensible
+ patterns that start (?i) but I don't think that matters. The use of (?i) is
+ just an example; this all applies to the other options as well.
+
+ 4. Since libtool seems to echo the compile commands it is issuing, the output
+ from "make" can be reduced a bit by putting "@" in front of each libtool
+ compile command.
+
+ 5. Patch from the folks at Google for configure.in to be a bit more thorough
+ in checking for a suitable C++ installation before trying to compile the
+ C++ stuff. This should fix a reported problem when a compiler was present,
+ but no suitable headers.
+
+ 6. The man pages all had just "PCRE" as their title. I have changed them to
+ be the relevant file name. I have also arranged that these names are
+ retained in the file doc/pcre.txt, which is a concatenation in text format
+ of all the man pages except the little individual ones for each function.
+
+ 7. The NON-UNIX-USE file had not been updated for the different set of source
+ files that come with release 6. I also added a few comments about the C++
+ wrapper.
+
+
Version 6.0 07-Jun-05
---------------------
-/* $Cambridge: exim/src/src/pcre/dftables.c,v 1.2 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/dftables.c,v 1.3 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
return 1;
}
-f = fopen(argv[1], "w");
+f = fopen(argv[1], "wb");
if (f == NULL)
{
fprintf(stderr, "dftables: failed to open %s for writing\n", argv[1]);
-/* $Cambridge: exim/src/src/pcre/pcre.h,v 1.2 2005/06/15 08:57:10 ph10 Exp $ */
-
/*************************************************
* Perl-Compatible Regular Expressions *
*************************************************/
make changes to pcre.in. */
#define PCRE_MAJOR 6
-#define PCRE_MINOR 0
-#define PCRE_DATE 07-Jun-2005
+#define PCRE_MINOR 2
+#define PCRE_DATE 01-Aug-2005
/* Win32 uses DLL by default; it needs special stuff for exported functions. */
-/* $Cambridge: exim/src/src/pcre/pcre_compile.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_compile.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
int min = 0;
int max = -1;
+/* Read the minimum value and do a paranoid check: a negative value indicates
+an integer overflow. */
+
while ((digitab[*p] & ctype_digit) != 0) min = min * 10 + *p++ - '0';
+if (min < 0 || min > 65535)
+ {
+ *errorcodeptr = ERR5;
+ return p;
+ }
+
+/* Read the maximum value if there is one, and again do a paranoid check on its
+size. Also, max must not be less than min. */
if (*p == '}') max = min; else
{
{
max = 0;
while((digitab[*p] & ctype_digit) != 0) max = max * 10 + *p++ - '0';
+ if (max < 0 || max > 65535)
+ {
+ *errorcodeptr = ERR5;
+ return p;
+ }
if (max < min)
{
*errorcodeptr = ERR4;
}
}
-/* Do paranoid checks, then fill in the required variables, and pass back the
-pointer to the terminating '}'. */
+/* Fill in the required variables, and pass back the pointer to the terminating
+'}'. */
-if (min > 65535 || max > 65535)
- *errorcodeptr = ERR5;
-else
- {
- *minp = min;
- *maxp = max;
- }
+*minp = min;
+*maxp = max;
return p;
}
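
The hunk above tests the accumulated value after the digit loop has finished. A stricter variant, shown here only as a sketch under my own assumptions, tests the bound before each multiplication so that the accumulator can never overflow, however long the digit string is. The names read_bounded(), parse_quantifier() and MAX_REPEAT are hypothetical; the return codes 4 and 5 merely echo ERR4 and ERR5.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_REPEAT 65535   /* The same limit that the patch enforces */

/* Hypothetical helper (not part of PCRE): reads a run of ASCII digits at p
and stores the value in *out. Returns a pointer to the first non-digit, or
NULL if the value would exceed MAX_REPEAT. */

const char *read_bounded(const char *p, int *out)
{
int n = 0;
assert(p != NULL && out != NULL);
while (*p >= '0' && *p <= '9')
  {
  int d = *p - '0';
  if (n > (MAX_REPEAT - d) / 10) return NULL;   /* n*10 + d would be too big */
  n = n * 10 + d;
  p++;
  }
*out = n;
return p;
}

/* Parses "min}", "min,}" or "min,max}", with p pointing just after '{'.
Returns 0 on success, 5 if a number is too big, 4 if max < min, and -1 if the
text is not a quantifier. For "min,}" *maxp is set to -1 (no upper limit). */

int parse_quantifier(const char *p, int *minp, int *maxp)
{
int min, max;

if (*p < '0' || *p > '9') return -1;
p = read_bounded(p, &min);
if (p == NULL) return 5;

if (*p == '}') max = min; else
  {
  if (*p++ != ',') return -1;
  if (*p == '}') max = -1; else
    {
    if (*p < '0' || *p > '9') return -1;
    p = read_bounded(p, &max);
    if (p == NULL) return 5;
    if (*p != '}') return -1;
    if (max < min) return 4;
    }
  }

*minp = min;
*maxp = max;
return 0;
}
```

For example, parse_quantifier("3,7}", &min, &max) gives min 3 and max 7, while the string from the ChangeLog entry, "1111111111111111}", is rejected with code 5 as soon as the running value would pass 65535.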
BOOL class_utf8;
#endif
BOOL inescq = FALSE;
+BOOL capturing;
unsigned int brastackptr = 0;
size_t size;
uschar *code;
case '(':
branch_newextra = 0;
bracket_length = 1 + LINK_SIZE;
+ capturing = FALSE;
/* Handle special forms of bracket, which all start (? */
case 'P':
ptr += 3;
+
+ /* Handle the definition of a named subpattern */
+
if (*ptr == '<')
{
const uschar *p; /* Don't amalgamate; some compilers */
}
name_count++;
if (ptr - p > max_name_size) max_name_size = (ptr - p);
+ capturing = TRUE; /* Named parentheses are always capturing */
break;
}
+ /* Handle back references and recursive calls to named subpatterns */
+
if (*ptr == '=' || *ptr == '>')
{
while ((compile_block.ctypes[*(++ptr)] & ctype_word) != 0);
nothing is done here and it is handled during the compiling
process.
+ We allow for more than one options setting at the start. If such
+ settings do not change the existing options, nothing is compiled.
+ However, we must leave space just in case something is compiled.
+ This can happen for pathological sequences such as (?i)(?-i)
+ because the global options will end up with -i set. The space is
+ small and not significant. (Before I did this there was a reported
+ bug with (?i)(?-i) in a machine-generated pattern.)
+
[Historical note: Up to Perl 5.8, options settings at top level
were always global settings, wherever they appeared in the pattern.
That is, they were equivalent to an external setting. From 5.8
options = (options | set) & (~unset);
set = unset = 0; /* To save length */
item_count--; /* To allow for several */
+ length += 2;
}
/* Fall through */
continue;
}
- /* If options were terminated by ':' control comes here. Fall through
- to handle the group below. */
+ /* If options were terminated by ':' control comes here. This is a
+ non-capturing group with an options change. There is nothing more that
+ needs to be done because "capturing" is already set FALSE by default;
+ we can just fall through. */
+
}
}
- /* Extracting brackets must be counted so we can process escapes in a
- Perlish way. If the number exceeds EXTRACT_BASIC_MAX we are going to
- need an additional 3 bytes of store per extracting bracket. However, if
- PCRE_NO_AUTO)CAPTURE is set, unadorned brackets become non-capturing, so we
- must leave the count alone (it will aways be zero). */
+ /* Ordinary parentheses, not followed by '?', are capturing unless
+ PCRE_NO_AUTO_CAPTURE is set. */
+
+ else capturing = (options & PCRE_NO_AUTO_CAPTURE) == 0;
+
+ /* Capturing brackets must be counted so we can process escapes in a
+ Perlish way. If the number exceeds EXTRACT_BASIC_MAX we are going to need
+ an additional 3 bytes of memory per capturing bracket. */
- else if ((options & PCRE_NO_AUTO_CAPTURE) == 0)
+ if (capturing)
{
bracount++;
if (bracount > EXTRACT_BASIC_MAX) bracket_length += 3;
-/* $Cambridge: exim/src/src/pcre/pcre_config.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_config.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_exec.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_exec.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_fullinfo.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_fullinfo.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_get.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_get.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_globals.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_globals.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_internal.h,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_internal.h,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_maketables.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_maketables.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_printint.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_printint.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_study.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_study.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_tables.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_tables.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_try_flipped.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_try_flipped.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcre_version.c,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcre_version.c,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* Perl-Compatible Regular Expressions *
-/* $Cambridge: exim/src/src/pcre/pcretest.c,v 1.2 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/pcretest.c,v 1.3 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* PCRE testing program *
#include "pcreposix.h"
#endif
-/* It is also possible, for the benefit of the version imported into Exim, to
-build pcretest without support for UTF8 (define NOUTF8), without the interface
+/* It is also possible, for the benefit of the version imported into Exim, to
+build pcretest without support for UTF8 (define NOUTF8), without the interface
to the DFA matcher (NODFA), and without the doublecheck of the old "info"
function (define NOINFOCHECK). */
while (length-- > 0)
{
-#if !defined NOUTF8
+#if !defined NOUTF8
if (use_utf8)
{
int rc = utf82ord(p, &c);
else if (strcmp(argv[op], "-t") == 0) timeit = 1;
else if (strcmp(argv[op], "-i") == 0) showinfo = 1;
else if (strcmp(argv[op], "-d") == 0) showinfo = debug = 1;
-#if !defined NODFA
+#if !defined NODFA
else if (strcmp(argv[op], "-dfa") == 0) all_use_dfa = 1;
#endif
else if (strcmp(argv[op], "-o") == 0 && argc > 2 &&
-/* $Cambridge: exim/src/src/pcre/ucp.h,v 1.1 2005/06/15 08:57:10 ph10 Exp $ */
+/* $Cambridge: exim/src/src/pcre/ucp.h,v 1.2 2005/08/08 10:22:14 ph10 Exp $ */
/*************************************************
* libucp - Unicode Property Table handler *