README file for PCRE2 (Perl-compatible regular expression library)
------------------------------------------------------------------

PCRE2 is a re-working of the original PCRE library to provide an entirely new
API. The latest release of PCRE2 is always available in three alternative
formats from:

  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre2-xxx.tar.gz
  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre2-xxx.tar.bz2
  ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre2-xxx.zip

There is a mailing list for discussion about the development of PCRE (both the
original and new APIs) at pcre-dev@exim.org. You can access the archives and
subscribe or manage your subscription here:

   https://lists.exim.org/mailman/listinfo/pcre-dev

Please read the NEWS file if you are upgrading from a previous release. The
contents of this README file are:

  The PCRE2 APIs
  Documentation for PCRE2
  Contributions by users of PCRE2
  Building PCRE2 on non-Unix-like systems
  Building PCRE2 without using autotools
  Building PCRE2 using autotools
  Retrieving configuration information
  Shared libraries
  Cross-compiling using autotools
  Making new tarballs
  Testing PCRE2
  Character tables
  File manifest


The PCRE2 APIs
--------------

PCRE2 is written in C, and it has its own API. There are three sets of
functions, one for the 8-bit library, which processes strings of bytes, one for
the 16-bit library, which processes strings of 16-bit values, and one for the
32-bit library, which processes strings of 32-bit values. There are no C++
wrappers.

The distribution does contain a set of C wrapper functions for the 8-bit
library that are based on the POSIX regular expression API (see the pcre2posix
man page). These can be found in a library called libpcre2-posix. Note that
this just provides a POSIX calling interface to PCRE2; the regular expressions
themselves still follow Perl syntax and semantics. The POSIX API is restricted,
and does not give full access to all of PCRE2's facilities.

The header file for the POSIX-style functions is called pcre2posix.h. The
official POSIX name is regex.h, but I did not want to risk possible problems
with existing files of that name by distributing it that way. To use PCRE2 with
an existing program that uses the POSIX API, pcre2posix.h will have to be
renamed or pointed at by a link.
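As an illustration of the link approach, the sketch below creates a private
include directory in which regex.h is a symbolic link to pcre2posix.h. The
function name and both directory arguments are hypothetical and are not part
of PCRE2; adjust them to wherever pcre2posix.h is installed on your system.

```shell
# Sketch: expose pcre2posix.h under the name regex.h via a private include
# directory. The function name and both arguments are illustrative only.
link_regex_header() {
    pcre2_include_dir=$1    # directory that holds the installed pcre2posix.h
    compat_dir=$2           # private directory to pass to the compiler with -I
    mkdir -p "$compat_dir" || return 1
    # -s makes a symbolic link; -f replaces a stale link from an earlier run.
    ln -sf "$pcre2_include_dir/pcre2posix.h" "$compat_dir/regex.h"
}
```

With the default prefix this might be called as
"link_regex_header /usr/local/include $HOME/pcre2-compat", after which adding
-I$HOME/pcre2-compat to the compiler flags of the existing program should make
its #include <regex.h> pick up the PCRE2 header. Leaving the system regex.h
untouched is a deliberate choice here, given the risk described above.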

If you are using the POSIX interface to PCRE2 and there is already a POSIX
regex library installed on your system, as well as worrying about the regex.h
header file (as mentioned above), you must also take care when linking programs
to ensure that they link with PCRE2's libpcre2-posix library. Otherwise they
may pick up the POSIX functions of the same name from the other library.

One way of avoiding this confusion is to compile PCRE2 with the addition of
-Dregcomp=PCRE2regcomp (and similarly for the other POSIX functions) to the
compiler flags (CFLAGS if you are using "configure" -- see below). This has the
effect of renaming the functions so that the names no longer clash. Of course,
you have to do the same thing for your applications, or write them using the
new names.
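The renaming flags can be generated rather than typed by hand. The sketch
below is only an illustration: the function name is hypothetical, and while
-Dregcomp=PCRE2regcomp comes from the text above, the names used for regexec,
regerror, and regfree are assumed to follow the same pattern.

```shell
# Sketch: build the -D options that rename the POSIX entry points.
# Only -Dregcomp=PCRE2regcomp appears in the README text; the other three
# replacement names are assumptions that follow the same pattern.
posix_rename_flags() {
    flags=
    for fn in regcomp regexec regerror regfree; do
        flags="$flags -D$fn=PCRE2$fn"
    done
    # Drop the leading space left by the first iteration.
    printf '%s\n' "${flags# }"
}
```

The output could then be passed to "configure", for example as
CFLAGS="-O2 $(posix_rename_flags)" ./configure, and the same flags would have
to be used when compiling each application that calls the POSIX functions.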


Documentation for PCRE2
-----------------------

If you install PCRE2 in the normal way on a Unix-like system, you will end up
with a set of man pages whose names all start with "pcre2". The one that is
just called "pcre2" lists all the others. In addition to these man pages, the
PCRE2 documentation is supplied in two other forms:

  1. There are files called doc/pcre2.txt, doc/pcre2grep.txt, and
     doc/pcre2test.txt in the source distribution. The first of these is a
     concatenation of the text forms of all the section 3 man pages except the
     listing of pcre2demo.c and those that summarize individual functions. The
     other two are the text forms of the section 1 man pages for the pcre2grep
     and pcre2test commands. These text forms are provided for ease of scanning
     with text editors or similar tools. They are installed in
     <prefix>/share/doc/pcre2, where <prefix> is the installation prefix
     (defaulting to /usr/local).

  2. A set of files containing all the documentation in HTML form, hyperlinked
     in various ways, and rooted in a file called index.html, is distributed in
     doc/html and installed in <prefix>/share/doc/pcre2/html.


Building PCRE2 on non-Unix-like systems
---------------------------------------

For a non-Unix-like system, please read the file NON-AUTOTOOLS-BUILD, though if
your system supports the use of "configure" and "make" you may be able to build
PCRE2 using autotools in the same way as for many Unix-like systems.

PCRE2 can also be configured using CMake, which can be run in various ways
(command line, GUI, etc). This creates Makefiles, solution files, etc. The file
NON-AUTOTOOLS-BUILD has information about CMake.

PCRE2 has been compiled on many different operating systems. It should be
straightforward to build PCRE2 on any system that has a Standard C compiler and
library, because it uses only Standard C functions.


Building PCRE2 without using autotools
--------------------------------------

The use of autotools (in particular, libtool) is problematic in some
environments, even some that are Unix or Unix-like. See the NON-AUTOTOOLS-BUILD
file for ways of building PCRE2 without using autotools.


Building PCRE2 using autotools
------------------------------

The following instructions assume the use of the widely used "configure; make;
make install" (autotools) process.

To build PCRE2 on a system that supports autotools, first run the "configure"
command from the PCRE2 distribution directory, with your current directory set
to the directory where you want the files to be created. This command is a
standard GNU "autoconf" configuration script, for which generic instructions
are supplied in the file INSTALL.

Most commonly, people build PCRE2 within its own distribution directory, and in
this case, on many systems, just running "./configure" is sufficient. However,
the usual methods of changing standard defaults are available. For example:

CFLAGS='-O2 -Wall' ./configure --prefix=/opt/local

This command specifies that the C compiler should be run with the flags '-O2
-Wall' instead of the default, and that "make install" should install PCRE2
under /opt/local instead of the default /usr/local.

If you want to build in a different directory, just run "configure" with that
directory as current. For example, suppose you have unpacked the PCRE2 source
into /source/pcre2/pcre2-xxx, but you want to build it in
/build/pcre2/pcre2-xxx:

cd /build/pcre2/pcre2-xxx
/source/pcre2/pcre2-xxx/configure

PCRE2 is written in C and is normally compiled as a C library. However, it is
possible to build it as a C++ library, though the provided building apparatus
does not have any features to support this.

There are some optional features that can be included or omitted from the PCRE2
library. They are also documented in the pcre2build man page.

. By default, both shared and static libraries are built. You can change this
  by adding one of these options to the "configure" command:

  --disable-shared
  --disable-static

  (See also "Shared libraries" below.)

. By default, only the 8-bit library is built. If you add --enable-pcre2-16 to
  the "configure" command, the 16-bit library is also built. If you add
  --enable-pcre2-32 to the "configure" command, the 32-bit library is also
  built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8
  to disable building the 8-bit library.

. If you want to include support for just-in-time (JIT) compiling, which can
  give large performance improvements on certain platforms, add --enable-jit to
  the "configure" command. This support is available only for certain hardware
  architectures. If you try to enable it on an unsupported architecture, there
  will be a compile time error. If in doubt, use --enable-jit=auto, which
  enables JIT only if the current hardware is supported.

. If you are enabling JIT under SELinux you may also want to add
  --enable-jit-sealloc, which enables the use of an execmem allocator in JIT
  that is compatible with SELinux. This has no effect if JIT is not enabled.

. If you do not want to make use of the default support for UTF-8 Unicode
  character strings in the 8-bit library, UTF-16 Unicode character strings in
  the 16-bit library, or UTF-32 Unicode character strings in the 32-bit
  library, you can add --disable-unicode to the "configure" command. This
  reduces the size of the libraries. It is not possible to configure one
  library with Unicode support, and another without, in the same configuration.
  It is also not possible to use --enable-ebcdic (see below) with Unicode
  support, so if this option is set, you must also use --disable-unicode.

  When Unicode support is available, the use of a UTF encoding still has to be
  enabled by setting the PCRE2_UTF option at run time or starting a pattern
  with (*UTF). When PCRE2 is compiled with Unicode support, its input can be
  only ASCII or UTF-8/16/32, even when running on EBCDIC platforms.

  As well as supporting UTF strings, Unicode support includes support for the
  \P, \p, and \X sequences that recognize Unicode character properties.
  However, only the basic two-letter properties such as Lu are supported.
  Escape sequences such as \d and \w in patterns do not by default make use of
  Unicode properties, but can be made to do so by setting the PCRE2_UCP option
  or starting a pattern with (*UCP).

. You can build PCRE2 to recognize either CR or LF or the sequence CRLF, or any
  of the preceding, or any of the Unicode newline sequences, or the NUL (zero)
  character as indicating the end of a line. Whatever you specify at build time
  is the default; the caller of PCRE2 can change the selection at run time. The
  default newline indicator is a single LF character (the Unix standard). You
  can specify the default newline indicator by adding --enable-newline-is-cr,
  --enable-newline-is-lf, --enable-newline-is-crlf,
  --enable-newline-is-anycrlf, --enable-newline-is-any, or
  --enable-newline-is-nul to the "configure" command, respectively.

. By default, the sequence \R in a pattern matches any Unicode line ending
  sequence. This is independent of the option specifying what PCRE2 considers
  to be the end of a line (see above). However, the caller of PCRE2 can
  restrict \R to match only CR, LF, or CRLF. You can make this the default by
  adding --enable-bsr-anycrlf to the "configure" command (bsr = "backslash R").

. In a pattern, the escape sequence \C matches a single code unit, even in a
  UTF mode. This can be dangerous because it breaks up multi-code-unit
  characters. You can build PCRE2 with the use of \C permanently locked out by
  adding --enable-never-backslash-C (note the upper case C) to the "configure"
  command. When \C is allowed by the library, individual applications can lock
  it out by calling pcre2_compile() with the PCRE2_NEVER_BACKSLASH_C option.

. PCRE2 has a counter that limits the depth of nesting of parentheses in a
  pattern. This limits the amount of system stack that a pattern uses when it
  is compiled. The default is 250, but you can change it by setting, for
  example,

  --with-parens-nest-limit=500

. PCRE2 has a counter that can be set to limit the amount of computing resource
  it uses when matching a pattern. If the limit is exceeded during a match, the
  match fails. The default is ten million. You can change the default by
  setting, for example,

  --with-match-limit=500000

  on the "configure" command. This is just the default; individual calls to
  pcre2_match() or pcre2_dfa_match() can supply their own value. There is more
  discussion in the pcre2api man page (search for pcre2_set_match_limit).

. There is a separate counter that limits the depth of nested backtracking
  (pcre2_match()) or nested function calls (pcre2_dfa_match()) during a
  matching process, which indirectly limits the amount of heap memory that is
  used, and in the case of pcre2_dfa_match() the amount of stack as well. This
  counter also has a default of ten million, which is essentially "unlimited".
  You can change the default by setting, for example,

  --with-match-limit-depth=5000

  There is more discussion in the pcre2api man page (search for
  pcre2_set_depth_limit).

. You can also set an explicit limit on the amount of heap memory used by
  the pcre2_match() and pcre2_dfa_match() interpreters:

  --with-heap-limit=500

  The units are kibibytes (units of 1024 bytes). This limit does not apply when
  the JIT optimization (which has its own memory control features) is used.
  There is more discussion on the pcre2api man page (search for
  pcre2_set_heap_limit).

. In the 8-bit library, the default maximum compiled pattern size is around
  64 kibibytes. You can increase this by adding --with-link-size=3 to the
  "configure" command. PCRE2 then uses three bytes instead of two for offsets
  to different parts of the compiled pattern. In the 16-bit library,
  --with-link-size=3 is the same as --with-link-size=4, which (in both
  libraries) uses four-byte offsets. Increasing the internal link size reduces
  performance in the 8-bit and 16-bit libraries. In the 32-bit library, the
  link size setting is ignored, as 4-byte offsets are always used.

. For speed, PCRE2 uses four tables for manipulating and identifying characters
  whose code point values are less than 256. By default, it uses a set of
  tables for ASCII encoding that is part of the distribution. If you specify

  --enable-rebuild-chartables

  a program called dftables is compiled and run in the default C locale when
  you run "make". It builds a source file called pcre2_chartables.c. If you do
  not specify this option, pcre2_chartables.c is created as a copy of
  pcre2_chartables.c.dist. See "Character tables" below for further
  information.

. It is possible to compile PCRE2 for use on systems that use EBCDIC as their
  character code (as opposed to ASCII/Unicode) by specifying

  --enable-ebcdic --disable-unicode

  This automatically implies --enable-rebuild-chartables (see above). However,
  when PCRE2 is built this way, it always operates in EBCDIC. It cannot support
  both EBCDIC and UTF-8/16/32. There is a second option, --enable-ebcdic-nl25,
  which specifies that the code value for the EBCDIC NL character is 0x25
  instead of the default 0x15.

. If you specify --enable-debug, additional debugging code is included in the
  build. This option is intended for use by the PCRE2 maintainers.

. In environments where valgrind is installed, if you specify

  --enable-valgrind

  PCRE2 will use valgrind annotations to mark certain memory regions as
  unaddressable. This allows it to detect invalid memory accesses, and is
  mostly useful for debugging PCRE2 itself.

. In environments where the gcc compiler is used and lcov version 1.6 or above
  is installed, if you specify

  --enable-coverage

  the build process implements a code coverage report for the test suite. The
  report is generated by running "make coverage". If ccache is installed on
  your system, it must be disabled when building PCRE2 for coverage reporting.
  You can do this by setting the environment variable CCACHE_DISABLE=1 before
  running "make" to build PCRE2. There is more information about coverage
  reporting in the "pcre2build" documentation.

. When JIT support is enabled, pcre2grep automatically makes use of it, unless
  you add --disable-pcre2grep-jit to the "configure" command.

. There is support for calling external programs during matching in the
  pcre2grep command, using PCRE2's callout facility with string arguments. This
  support can be disabled by adding --disable-pcre2grep-callout to the
  "configure" command.

. The pcre2grep program currently supports only 8-bit data files, and so
  requires the 8-bit PCRE2 library. It is possible to compile pcre2grep to use
  libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by
  specifying one or both of

  --enable-pcre2grep-libz
  --enable-pcre2grep-libbz2

  Of course, the relevant libraries must be installed on your system.

. The default starting size (in bytes) of the internal buffer used by pcre2grep
  can be set by, for example:

  --with-pcre2grep-bufsize=51200

  The value must be a plain integer. The default is 20480. The amount of memory
  used by pcre2grep is actually three times this number, to allow for "before"
  and "after" lines. If very long lines are encountered, the buffer is
  automatically enlarged, up to a fixed maximum size.

. The default maximum size of pcre2grep's internal buffer can be set by, for
  example:

  --with-pcre2grep-max-bufsize=2097152

  The default is either 1048576 or the value of --with-pcre2grep-bufsize,
  whichever is the larger.

. It is possible to compile pcre2test so that it links with the libreadline
  or libedit libraries, by specifying, respectively,

  --enable-pcre2test-libreadline or --enable-pcre2test-libedit

  If this is done, when pcre2test's input is from a terminal, it reads it using
  the readline() function. This provides line-editing and history facilities.
  Note that libreadline is GPL-licenced, so if you distribute a binary of
  pcre2test linked in this way, there may be licensing issues. These can be
  avoided by linking with libedit (which has a BSD licence) instead.

  Enabling libreadline causes the -lreadline option to be added to the
  pcre2test build. In many operating environments with a system-installed
  readline library this is sufficient. However, in some environments (e.g. if
  an unmodified distribution version of readline is in use), it may be
  necessary to specify something like LIBS="-lncurses" as well. This is
  because, to quote the readline INSTALL, "Readline uses the termcap functions,
  but does not link with the termcap or curses library itself, allowing
  applications which link with readline the to choose an appropriate library."
  If you get error messages about missing functions tgetstr, tgetent, tputs,
  tgetflag, or tgoto, this is the problem, and linking with the ncurses library
  should fix it.

. There is a special option called --enable-fuzz-support for use by people who
  want to run fuzzing tests on PCRE2. At present this applies only to the 8-bit
  library. If set, it causes an extra library called libpcre2-fuzzsupport.a to
  be built, but not installed. This contains a single function called
  LLVMFuzzerTestOneInput() whose arguments are a pointer to a string and the
  length of the string. When called, this function tries to compile the string
  as a pattern, and if that succeeds, to match it. This is done both with no
  options and with some random options bits that are generated from the string.
  Setting --enable-fuzz-support also causes a binary called pcre2fuzzcheck to
  be created. This is normally run under valgrind or used when PCRE2 is
  compiled with address sanitizing enabled. It calls the fuzzing function and
  outputs information about what it is doing. The input strings are specified
  by arguments: if an argument starts with "=" the rest of it is a literal
  input string. Otherwise, it is assumed to be a file name, and the contents of
  the file are the test string.

. Releases before 10.30 could be compiled with --disable-stack-for-recursion,
  which caused pcre2_match() to use individual blocks on the heap for
  backtracking instead of recursive function calls (which use the stack). This
  is now obsolete since pcre2_match() was refactored always to use the heap (in
  a much more efficient way than before). This option is retained for backwards
  compatibility, but has no effect other than to output a warning.
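Several of the options above are typically combined on one "configure"
command. The sketch below is an illustration only: both function names are
hypothetical, the particular selection of options is arbitrary, and every
option and number in it is taken from the descriptions above. The first
function prints a command rather than running it; the second works through the
pcre2grep buffer arithmetic for a given --with-pcre2grep-bufsize value.

```shell
# Sketch: assemble a "configure" command from options described above and
# print it rather than run it. The function names are illustrative only.
pcre2_configure_command() {
    set -- ./configure \
        --enable-pcre2-16 \
        --enable-pcre2-32 \
        --enable-jit=auto \
        --enable-newline-is-anycrlf \
        --with-match-limit=500000 \
        --with-heap-limit=500
    # "$*" joins the arguments with single spaces (default IFS).
    printf '%s\n' "$*"
}

# Sketch: the pcre2grep buffer arithmetic stated above, for a given
# --with-pcre2grep-bufsize value in bytes.
pcre2grep_buffer_summary() {
    bufsize=$1
    # Memory used is three times the starting buffer size.
    echo "memory=$((bufsize * 3))"
    # The default maximum is the larger of 1048576 and the starting size.
    if [ "$bufsize" -gt 1048576 ]; then
        echo "default_max=$bufsize"
    else
        echo "default_max=1048576"
    fi
}
```

Running the line printed by pcre2_configure_command from the PCRE2
distribution directory would perform the configuration. For the example value
used above, "pcre2grep_buffer_summary 51200" reports memory=153600 and
default_max=1048576.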

The "configure" script builds the following files for the basic C library:

. Makefile             the makefile that builds the library
. src/config.h         build-time configuration options for the library
. src/pcre2.h          the public PCRE2 header file
. pcre2-config         script that shows the building settings such as CFLAGS
                         that were set for "configure"
. libpcre2-8.pc        )
. libpcre2-16.pc       ) data for the pkg-config command
. libpcre2-32.pc       )
. libpcre2-posix.pc    )
. libtool              script that builds shared and/or static libraries

Versions of config.h and pcre2.h are distributed in the src directory of PCRE2
tarballs under the names config.h.generic and pcre2.h.generic. These are
provided for those who have to build PCRE2 without using "configure" or CMake.
If you use "configure" or CMake, the .generic versions are not used.

The "configure" script also creates config.status, which is an executable
script that can be run to recreate the configuration, and config.log, which
contains compiler output from tests that "configure" runs.

Once "configure" has run, you can run "make". This builds whichever of the
libraries libpcre2-8, libpcre2-16 and libpcre2-32 are configured, and a test
program called pcre2test. If you enabled JIT support with --enable-jit, another
test program called pcre2_jit_test is built as well. If the 8-bit library is
built, libpcre2-posix and the pcre2grep command are also built. Running
"make" with the -j option may speed up compilation on multiprocessor systems.

The command "make check" runs all the appropriate tests. Details of the PCRE2
tests are given below in a separate section of this document. The -j option of
"make" can also be used when running the tests.

You can use "make install" to install PCRE2 into live directories on your
system. The following are installed (file names are all relative to the
<prefix> that is set when "configure" is run):

  Commands (bin):
    pcre2test
    pcre2grep (if 8-bit support is enabled)
    pcre2-config

  Libraries (lib):
    libpcre2-8      (if 8-bit support is enabled)
    libpcre2-16     (if 16-bit support is enabled)
    libpcre2-32     (if 32-bit support is enabled)
    libpcre2-posix  (if 8-bit support is enabled)

  Configuration information (lib/pkgconfig):
    libpcre2-8.pc
    libpcre2-16.pc
    libpcre2-32.pc
    libpcre2-posix.pc

  Header files (include):
    pcre2.h
    pcre2posix.h

  Man pages (share/man/man{1,3}):
    pcre2grep.1
    pcre2test.1
    pcre2-config.1
    pcre2.3
    pcre2*.3 (lots more pages, all starting "pcre2")

  HTML documentation (share/doc/pcre2/html):
    index.html
    *.html (lots more pages, hyperlinked from index.html)

  Text file documentation (share/doc/pcre2):
    AUTHORS
    COPYING
    ChangeLog
    LICENCE
    NEWS
    README
    pcre2.txt         (a concatenation of the man(3) pages)
    pcre2test.txt     the pcre2test man page
    pcre2grep.txt     the pcre2grep man page
    pcre2-config.txt  the pcre2-config man page
Packit Service 02e2fd
Packit Service 02e2fd
If you want to remove PCRE2 from your system, you can run "make uninstall".
Packit Service 02e2fd
This removes all the files that "make install" installed. However, it does not
Packit Service 02e2fd
remove any directories, because these are often shared with other programs.
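
The configure, build, test and install sequence described in this section can
be sketched as a small shell script. This is a minimal sketch and not part of
the distribution: the run helper, the build_pcre2 function, and the DRY_RUN,
PREFIX and JOBS variables are illustrative names, and the default prefix and
job count are assumptions. With DRY_RUN set to 1 (the default here) the
commands are only printed, not executed.

```shell
# Illustrative settings; none of these names come from the PCRE2 build system.
DRY_RUN=${DRY_RUN:-1}          # 1 = print the commands instead of running them
PREFIX=${PREFIX:-/usr/local}   # assumed install prefix
JOBS=${JOBS:-2}                # assumed number of parallel make jobs

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Configure, build, test and install, stopping at the first failure.
build_pcre2() {
  run ./configure --prefix="$PREFIX" &&
  run make -j"$JOBS" &&
  run make -j"$JOBS" check &&
  run make install
}

build_pcre2
```

Setting DRY_RUN to 0 in the top-level directory of an unpacked distribution
would run the four steps for real; "make install" may need extra privileges,
depending on the prefix.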


Retrieving configuration information
------------------------------------

Running "make install" installs the command pcre2-config, which can be used to
recall information about the PCRE2 configuration and installation. For example:

  pcre2-config --version

prints the version number, and

  pcre2-config --libs8

outputs information about where the 8-bit library is installed. This command
can be included in makefiles for programs that use PCRE2, saving the programmer
from having to remember too many details. Run pcre2-config with no arguments to
obtain a list of possible arguments.

The pkg-config command is another system for saving and retrieving information
about installed libraries. Instead of separate commands for each library, a
single command is used. For example:

  pkg-config --libs libpcre2-16

The data is held in *.pc files that are installed in a directory called
<prefix>/lib/pkgconfig.
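
For use in a build script, the two lookup methods above can be combined. The
following is a minimal sketch, not something shipped with PCRE2: the function
name pcre2_link_flags and the PCRE2_CONFIG and PKG_CONFIG override variables
are illustrative assumptions. It tries pcre2-config first and falls back to
pkg-config.

```shell
# Print the linker flags for the PCRE2 library of the given code unit
# width (8, 16 or 32; the default is 8).
pcre2_link_flags() {
  width=${1:-8}
  cfg=${PCRE2_CONFIG:-pcre2-config}   # hypothetical override variable
  pkg=${PKG_CONFIG:-pkg-config}       # hypothetical override variable
  if command -v "$cfg" >/dev/null 2>&1; then
    "$cfg" "--libs$width"
  elif command -v "$pkg" >/dev/null 2>&1; then
    "$pkg" --libs "libpcre2-$width"
  else
    echo "neither $cfg nor $pkg was found" >&2
    return 1
  fi
}
```

A build script might then link a hypothetical program demo.c against the 8-bit
library with something like:

  cc demo.c $(pcre2_link_flags 8) -o demo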


Shared libraries
----------------

The default distribution builds PCRE2 as shared libraries and static libraries,
as long as the operating system supports shared libraries. Shared library
support relies on the "libtool" script which is built as part of the
"configure" process.

The libtool script is used to compile and link both shared and static
libraries. They are placed in a subdirectory called .libs when they are newly
built. The programs pcre2test and pcre2grep are built to use these uninstalled
libraries (by means of wrapper scripts in the case of shared libraries). When
you use "make install" to install shared libraries, pcre2grep and pcre2test are
automatically re-built to use the newly installed shared libraries before being
installed themselves. However, the versions left in the build directory still
use the uninstalled libraries.

To build PCRE2 using static libraries only, you must use --disable-shared when
configuring it. For example:

  ./configure --prefix=/usr/gnu --disable-shared

Then run "make" in the usual way. Similarly, you can use --disable-static to
build only shared libraries.


Cross-compiling using autotools
-------------------------------

You can specify CC and CFLAGS in the normal way to the "configure" command, in
order to cross-compile PCRE2 for some other host. However, you should NOT
specify --enable-rebuild-chartables, because if you do, the dftables.c source
file is compiled and run on the local host, in order to generate the inbuilt
character tables (the pcre2_chartables.c file). This will probably not work,
because dftables.c needs to be compiled with the local compiler, not the cross
compiler.

When --enable-rebuild-chartables is not specified, pcre2_chartables.c is
created by making a copy of pcre2_chartables.c.dist, which is a default set of
tables that assumes ASCII code. Cross-compiling with the default tables should
not be a problem.

If you need to modify the character tables when cross-compiling, you should
move pcre2_chartables.c.dist out of the way, then compile dftables.c by hand
and run it on the local host to make a new version of pcre2_chartables.c.dist.
Then when you cross-compile PCRE2 this new version of the tables will be used.


Making new tarballs
-------------------

The command "make dist" creates three PCRE2 tarballs, in tar.gz, tar.bz2, and
zip formats. The command "make distcheck" does the same, but then does a trial
build of the new distribution to ensure that it works.

If you have modified any of the man page sources in the doc directory, you
should first run the PrepareRelease script before making a distribution. This
script creates the .txt and HTML forms of the documentation from the man pages.


Testing PCRE2
-------------

To test the basic PCRE2 library on a Unix-like system, run the RunTest script.
There is another script called RunGrepTest that tests the pcre2grep command.
When JIT support is enabled, a third test program called pcre2_jit_test is
built. Both the scripts and all the program tests are run by "make check". For
other environments, see the instructions in NON-AUTOTOOLS-BUILD.

The RunTest script runs the pcre2test test program (which is documented in its
own man page) on each of the relevant testinput files in the testdata
directory, and compares the output with the contents of the corresponding
testoutput files. RunTest uses a file called testtry to hold the main output
from pcre2test. Other files whose names begin with "test" are used as working
files in some tests.

Some tests are relevant only when certain build-time options were selected. For
example, the tests for UTF-8/16/32 features are run only when Unicode support
is available. RunTest outputs a comment when it skips a test.

Many (but not all) of the tests that are not skipped are run twice if JIT
support is available. On the second run, JIT compilation is forced. This
testing can be suppressed by putting "nojit" on the RunTest command line.

The entire set of tests is run once for each of the 8-bit, 16-bit and 32-bit
libraries that are enabled. If you want to run just one set of tests, call
RunTest with either the -8, -16 or -32 option.

If valgrind is installed, you can run the tests under it by putting "valgrind"
on the RunTest command line. To run pcre2test on just one or more specific test
files, give their numbers as arguments to RunTest, for example:

  RunTest 2 7 11

You can also specify ranges of tests such as 3-6 or 3- (meaning 3 to the
end), or a number preceded by ~ to exclude a test. For example:

  RunTest 3-15 ~10

This runs tests 3 to 15, excluding test 10, and just ~13 runs all the tests
except test 13. Whatever order the arguments are in, the tests are always run
in numerical order.
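
The selection rules above (single numbers, ranges, open-ended ranges, and ~
exclusions, always run in numerical order) can be illustrated with a short
shell function. This is a sketch of the semantics as described here, not the
code that RunTest itself uses. The function name select_tests and its first
argument, the highest test number, are hypothetical.

```shell
# select_tests MAX SPEC...
# Print the selected test numbers (0..MAX) in numerical order.
select_tests() {
  max=$1
  shift
  include=" "
  exclude=" "
  for spec in "$@"; do
    case $spec in
      "~"*) exclude="$exclude${spec#?} "   # ~N excludes test N
            continue ;;
      *-)   lo=${spec%-}                   # N- means N to the end
            hi=$max ;;
      *-*)  lo=${spec%-*}                  # N-M is an inclusive range
            hi=${spec#*-} ;;
      *)    lo=$spec                       # a single test number
            hi=$spec ;;
    esac
    n=$lo
    while [ "$n" -le "$hi" ]; do
      include="$include$n "
      n=$((n + 1))
    done
  done
  # If only exclusions were given, start from the full set of tests.
  if [ "$include" = " " ]; then
    n=0
    while [ "$n" -le "$max" ]; do
      include="$include$n "
      n=$((n + 1))
    done
  fi
  # Walk the numbers in ascending order so argument order does not matter.
  out=""
  n=0
  while [ "$n" -le "$max" ]; do
    case $include in
      *" $n "*)
        case $exclude in
          *" $n "*) ;;
          *) out="$out${out:+ }$n" ;;
        esac ;;
    esac
    n=$((n + 1))
  done
  echo "$out"
}
```

Under these assumptions, "select_tests 25 3-15 ~10" prints the numbers 3 to 15
without 10, and "select_tests 25 11 2 7" prints "2 7 11".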

You can also call RunTest with the single argument "list" to cause it to output
a list of tests.

The test sequence starts with "test 0", which is a special test that has no
input file, and whose output is not checked. This is because it will be
different on different hardware and with different configurations. The test
exists in order to exercise some of pcre2test's code that would not otherwise
be run.

Tests 1 and 2 can always be run, as they expect only plain text strings (not
UTF) and make no use of Unicode properties. The first test file can be fed
directly into the perltest.sh script to check that Perl gives the same results.
The only difference you should see is in the first few lines, where the Perl
version is given instead of the PCRE2 version. The second set of tests checks
auxiliary functions, error detection, and run-time flags that are specific to
PCRE2. It also uses the debugging flags to check some of the internals of
pcre2_compile().

If you build PCRE2 with a locale setting that is not the standard C locale, the
character tables may be different (see next paragraph). In some cases, this may
cause failures in the second set of tests. For example, in a locale where the
isprint() function yields TRUE for characters in the range 128-255, the use of
[:isascii:] inside a character class defines a different set of characters, and
this shows up in this test as a difference in the compiled code, which is being
listed for checking. For example, where the comparison test output contains
[\x00-\x7f] the test might contain [\x00-\xff], and similarly in some other
cases. This is not a bug in PCRE2.

Test 3 checks pcre2_maketables(), the facility for building a set of character
tables for a specific locale and using them instead of the default tables. The
script uses the "locale" command to check for the availability of the "fr_FR",
"french", or "fr" locale, and uses the first one that it finds. If the "locale"
command fails, or if its output doesn't include "fr_FR", "french", or "fr" in
the list of available locales, the third test cannot be run, and a comment is
output to say why. If running this test produces an error like this:

  ** Failed to set locale "fr_FR"

it means that the given locale is not available on your system, despite being
listed by "locale". This does not mean that PCRE2 is broken. There are three
alternative output files for the third test, because three different versions
of the French locale have been encountered. The test passes if its output
matches any one of them.
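
The locale selection described for test 3 can be sketched as follows. This is
an illustration only and the real RunTest script may work differently. The
function name pick_french_locale is hypothetical, and the sketch assumes that
the list of available locales arrives on standard input, one name per line
(as "locale -a" prints it), and that only exact name matches count.

```shell
# Read a list of locale names on standard input and print the first of
# "fr_FR", "french" or "fr" that is present. Return 1 if none is listed.
pick_french_locale() {
  available=$(cat)
  for want in fr_FR french fr; do
    # -x matches whole lines only, so "fr_FR.utf8" does not count as "fr_FR".
    if printf '%s\n' "$available" | grep -qx "$want"; then
      echo "$want"
      return 0
    fi
  done
  return 1
}
```

To see which locale this sketch would pick on a given system, one could run:

  locale -a | pick_french_locale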

Tests 4 and 5 check UTF and Unicode property support, test 4 being compatible
with the perltest.sh script, and test 5 checking PCRE2-specific things.

Tests 6 and 7 check the pcre2_dfa_match() alternative matching function, in
non-UTF mode and UTF-mode with Unicode property support, respectively.

Test 8 checks some internal offsets and code size features, but it is run only
when Unicode support is enabled. The output is different in 8-bit, 16-bit, and
32-bit modes and for different link sizes, so there are different output files
for each mode and link size.

Tests 9 and 10 are run only in 8-bit mode, and tests 11 and 12 are run only in
16-bit and 32-bit modes. These are tests that generate different output in
8-bit mode. Each pair are for general cases and Unicode support, respectively.

Test 13 checks the handling of non-UTF characters greater than 255 by
pcre2_dfa_match() in 16-bit and 32-bit modes.

Test 14 contains some special UTF and UCP tests that give different output for
different code unit widths.

Test 15 contains a number of tests that must not be run with JIT. They check,
among other non-JIT things, the match-limiting features of the interpretive
matcher.

Test 16 is run only when JIT support is not available. It checks that an
attempt to use JIT has the expected behaviour.

Test 17 is run only when JIT support is available. It checks JIT complete and
partial modes, match-limiting under JIT, and other JIT-specific features.

Tests 18 and 19 are run only in 8-bit mode. They check the POSIX interface to
the 8-bit library, without and with Unicode support, respectively.

Test 20 checks the serialization functions by writing a set of compiled
patterns to a file, and then reloading and checking them.

Tests 21 and 22 test \C support when the use of \C is not locked out, without
and with UTF support, respectively. Test 23 tests \C when it is locked out.

Tests 24 and 25 test the experimental pattern conversion functions, without and
with UTF support, respectively.


Character tables
----------------

For speed, PCRE2 uses four tables for manipulating and identifying characters
whose code point values are less than 256. By default, a set of tables that is
built into the library is used. The pcre2_maketables() function can be called
by an application to create a new set of tables in the current locale. These are
passed to PCRE2 by calling pcre2_set_character_tables() to put a pointer into a
compile context.

The source file called pcre2_chartables.c contains the default set of tables.
By default, this is created as a copy of pcre2_chartables.c.dist, which
contains tables for ASCII coding. However, if --enable-rebuild-chartables is
specified for ./configure, a different version of pcre2_chartables.c is built
by the program dftables (compiled from dftables.c), which uses the ANSI C
character handling functions such as isalnum(), isalpha(), isupper(),
islower(), etc. to build the table sources. This means that the default C
locale that is set for your system will control the contents of these default
tables. You can change the default tables by editing pcre2_chartables.c and
then re-building PCRE2. If you do this, you should take care to ensure that the
file does not get automatically re-generated. The best way to do this is to
move pcre2_chartables.c.dist out of the way and replace it with your customized
tables.

When the dftables program is run as a result of --enable-rebuild-chartables,
it uses the default C locale that is set on your system. It does not pay
attention to the LC_xxx environment variables. In other words, it uses the
system's default locale rather than whatever the compiling user happens to have
set. If you really do want to build a source set of character tables in a
locale that is specified by the LC_xxx variables, you can run the dftables
program by hand with the -L option. For example:

  ./dftables -L pcre2_chartables.c.special

The first two 256-byte tables provide lower casing and case flipping functions,
respectively. The next table consists of three 32-byte bit maps which identify
digits, "word" characters, and white space, respectively. These are used when
building 32-byte bit maps that represent character classes for code points less
than 256. The final 256-byte table has bits indicating various character types,
as follows:

    1   white space character
    2   letter
    4   decimal digit
    8   hexadecimal digit
   16   alphanumeric or '_'
  128   regular expression metacharacter or binary zero

You should not alter the set of characters that contain the 128 bit, as that
will cause PCRE2 to malfunction.
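
To show how the bits in the final table combine, the following sketch decodes
a single table entry into the character types listed above. It is arithmetic
on the listed bit values only and does not read the tables from the library.
The function name ctype_names and the short labels it prints are illustrative
assumptions.

```shell
# Print the character types whose bits are set in one table entry.
ctype_names() {
  bits=$1
  names=""
  # Each pair is "bit value:label"; the labels are illustrative only.
  for pair in "1:space" "2:letter" "4:digit" "8:xdigit" "16:word" "128:meta"; do
    bit=${pair%%:*}
    name=${pair#*:}
    if [ $((bits & bit)) -ne 0 ]; then
      names="$names${names:+ }$name"
    fi
  done
  echo "$names"
}
```

For example, an entry with the value 26 (2 + 8 + 16) decodes as a letter that
is also a hexadecimal digit and a "word" character, so "ctype_names 26" prints
"letter xdigit word".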


File manifest
-------------

The distribution should contain the files listed below.

(A) Source files for the PCRE2 library functions and their headers are found in
    the src directory:

  src/dftables.c           auxiliary program for building pcre2_chartables.c
                           when --enable-rebuild-chartables is specified

  src/pcre2_chartables.c.dist  a default set of character tables that assume
                           ASCII coding; unless --enable-rebuild-chartables is
                           specified, used by copying to pcre2_chartables.c

  src/pcre2posix.c         )
  src/pcre2_auto_possess.c )
  src/pcre2_compile.c      )
  src/pcre2_config.c       )
  src/pcre2_context.c      )
  src/pcre2_convert.c      )
  src/pcre2_dfa_match.c    )
  src/pcre2_error.c        )
  src/pcre2_extuni.c       )
  src/pcre2_find_bracket.c )
  src/pcre2_jit_compile.c  )
  src/pcre2_jit_match.c    ) sources for the functions in the library,
  src/pcre2_jit_misc.c     )   and some internal functions that they use
  src/pcre2_maketables.c   )
  src/pcre2_match.c        )
  src/pcre2_match_data.c   )
  src/pcre2_newline.c      )
  src/pcre2_ord2utf.c      )
  src/pcre2_pattern_info.c )
  src/pcre2_serialize.c    )
  src/pcre2_string_utils.c )
  src/pcre2_study.c        )
  src/pcre2_substitute.c   )
  src/pcre2_substring.c    )
  src/pcre2_tables.c       )
  src/pcre2_ucd.c          )
  src/pcre2_valid_utf.c    )
  src/pcre2_xclass.c       )

  src/pcre2_printint.c     debugging function that is used by pcre2test
  src/pcre2_fuzzsupport.c  function for (optional) fuzzing support

  src/config.h.in          template for config.h, when built by "configure"
  src/pcre2.h.in           template for pcre2.h when built by "configure"
  src/pcre2posix.h         header for the external POSIX wrapper API
  src/pcre2_internal.h     header for internal use
  src/pcre2_intmodedep.h   a mode-specific internal header
  src/pcre2_ucp.h          header for Unicode property handling

  sljit/*                  source files for the JIT compiler

(B) Source files for programs that use PCRE2:

  src/pcre2demo.c          simple demonstration of coding calls to PCRE2
  src/pcre2grep.c          source of a grep utility that uses PCRE2
  src/pcre2test.c          comprehensive test program
  src/pcre2_jit_test.c     JIT test program

(C) Auxiliary files:

  132html                  script to turn "man" pages into HTML
  AUTHORS                  information about the author of PCRE2
  ChangeLog                log of changes to the code
  CleanTxt                 script to clean nroff output for txt man pages
  Detrail                  script to remove trailing spaces
  HACKING                  some notes about the internals of PCRE2
  INSTALL                  generic installation instructions
  LICENCE                  conditions for the use of PCRE2
  COPYING                  the same, using GNU's standard name
  Makefile.in              ) template for Unix Makefile, which is built by
                           )   "configure"
  Makefile.am              ) the automake input that was used to create
                           )   Makefile.in
  NEWS                     important changes in this release
  NON-AUTOTOOLS-BUILD      notes on building PCRE2 without using autotools
  PrepareRelease           script to make preparations for "make dist"
  README                   this file
  RunTest                  a Unix shell script for running tests
  RunGrepTest              a Unix shell script for pcre2grep tests
  aclocal.m4               m4 macros (generated by "aclocal")
  config.guess             ) files used by libtool,
  config.sub               )   used only when building a shared library
  configure                a configuring shell script (built by autoconf)
  configure.ac             ) the autoconf input that was used to build
                           )   "configure" and config.h
  depcomp                  ) script to find program dependencies, generated by
                           )   automake
  doc/*.3                  man page sources for PCRE2
  doc/*.1                  man page sources for pcre2grep and pcre2test
  doc/index.html.src       the base HTML page
  doc/html/*               HTML documentation
  doc/pcre2.txt            plain text version of the man pages
  doc/pcre2test.txt        plain text documentation of test program
  install-sh               a shell script for installing files
  libpcre2-8.pc.in         template for libpcre2-8.pc for pkg-config
  libpcre2-16.pc.in        template for libpcre2-16.pc for pkg-config
  libpcre2-32.pc.in        template for libpcre2-32.pc for pkg-config
  libpcre2-posix.pc.in     template for libpcre2-posix.pc for pkg-config
  ltmain.sh                file used to build a libtool script
  missing                  ) common stub for a few missing GNU programs while
                           )   installing, generated by automake
  mkinstalldirs            script for making install directories
  perltest.sh              script for running a Perl test program
  pcre2-config.in          source of script which retains PCRE2 information
  testdata/testinput*      test data for main library tests
  testdata/testoutput*     expected test results
  testdata/grep*           input and output for pcre2grep tests
  testdata/*               other supporting test files

(D) Auxiliary files for cmake support

  cmake/COPYING-CMAKE-SCRIPTS
  cmake/FindPackageHandleStandardArgs.cmake
  cmake/FindEditline.cmake
  cmake/FindReadline.cmake
  CMakeLists.txt
  config-cmake.h.in

(E) Auxiliary files for building PCRE2 "by hand"

  src/pcre2.h.generic     ) a version of the public PCRE2 header file
                          )   for use in non-"configure" environments
  src/config.h.generic    ) a version of config.h for use in non-"configure"
                          )   environments

Philip Hazel
Email local part: ph10
Email domain: cam.ac.uk
Last updated: 17 June 2018