'\" et
.TH REGCOMP "3P" 2013 "IEEE/The Open Group" "POSIX Programmer's Manual"
.SH PROLOG
This manual page is part of the POSIX Programmer's Manual.
The Linux implementation of this interface may differ (consult
the corresponding Linux manual page for details of Linux behavior),
or the interface may not be implemented on Linux.
.SH NAME
regcomp,
regerror,
regexec,
regfree
\(em regular expression matching
.SH SYNOPSIS
.LP
.nf
#include <regex.h>
.P
int regcomp(regex_t *restrict \fIpreg\fP, const char *restrict \fIpattern\fP,
    int \fIcflags\fP);
size_t regerror(int \fIerrcode\fP, const regex_t *restrict \fIpreg\fP,
    char *restrict \fIerrbuf\fP, size_t \fIerrbuf_size\fP);
int regexec(const regex_t *restrict \fIpreg\fP, const char *restrict \fIstring\fP,
    size_t \fInmatch\fP, regmatch_t \fIpmatch\fP[restrict], int \fIeflags\fP);
void regfree(regex_t *\fIpreg\fP);
.fi
.SH DESCRIPTION
These functions interpret
.IR basic
and
.IR extended
regular expressions as described in the Base Definitions volume of POSIX.1\(hy2008,
.IR "Chapter 9" ", " "Regular Expressions".
.P
The
.BR regex_t
structure is defined in
.IR <regex.h>
and contains at least the following member:
.TS
center box tab(!);
cB | cB | cB
lw(1.25i)B | lw(1.25i)I | lw(2.5i).
Member Type!Member Name!Description
_
size_t!re_nsub!T{
Number of parenthesized subexpressions.
T}
.TE
.P
The
.BR regmatch_t
structure is defined in
.IR <regex.h>
and contains at least the following members:
.TS
center box tab(!);
cB | cB | cB
lw(1.25i)B | lw(1.25i)I | lw(2.5i).
Member Type!Member Name!Description
_
regoff_t!rm_so!T{
Byte offset from start of \fIstring\fP to start of substring.
T}
regoff_t!rm_eo!T{
Byte offset from start of
.IR string
of the first character after the end of substring.
T}
.TE
.P
The
\fIregcomp\fR()
function shall compile the regular expression contained in the string
pointed to by the
.IR pattern
argument and place the results in the structure pointed to by
.IR preg .
The
.IR cflags
argument is the bitwise-inclusive OR of zero or more of the following
flags, which are defined in the
.IR <regex.h>
header:
.IP REG_EXTENDED 14
Use Extended Regular Expressions.
.IP REG_ICASE 14
Ignore case in match (see the Base Definitions volume of POSIX.1\(hy2008,
.IR "Chapter 9" ", " "Regular Expressions").
.IP REG_NOSUB 14
Report only success/fail in
\fIregexec\fR().
.IP REG_NEWLINE 14
Change the handling of
<newline>
characters, as described in the text.
.P
The default regular expression type for
.IR pattern
is a Basic Regular Expression. The application can specify Extended
Regular Expressions using the REG_EXTENDED
.IR cflags
flag.
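.P
As a minimal sketch of the difference between the two types, the
hypothetical helper re_matches() below (the name is not part of this
interface) compiles the same pattern with and without REG_EXTENDED.
In a Basic Regular Expression the plus sign is an ordinary character,
while in an Extended Regular Expression it is a repetition operator.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if string matches pattern compiled with
 * the given cflags, 0 if it does not, and -1 if the pattern does not compile.
 */
int
re_matches(const char *pattern, const char *string, int cflags)
{
    regex_t re;
    int status;

    /* REG_NOSUB: only success or failure is needed here. */
    if (regcomp(&re, pattern, cflags | REG_NOSUB) != 0)
        return -1;
    status = regexec(&re, string, 0, NULL, 0);
    regfree(&re);
    return status == 0 ? 1 : 0;
}
```

With this helper, the pattern "a+" compiled with cflags of 0 matches
only text that contains a literal plus sign after an "a"; compiled with
REG_EXTENDED it matches one or more "a" characters.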
.P
If the REG_NOSUB flag was not set in
.IR cflags ,
then
\fIregcomp\fR()
shall set
.IR re_nsub
to the number of parenthesized subexpressions (delimited by
.BR \(dq\e(\e)\(dq
in basic regular expressions or
.BR \(dq(\|)\(dq
in extended regular expressions) found in
.IR pattern .
.P
The
\fIregexec\fR()
function compares the null-terminated string specified by
.IR string
with the compiled regular expression
.IR preg
initialized by a previous call to
\fIregcomp\fR().
If it finds a match,
\fIregexec\fR()
shall return 0; otherwise, it shall return non-zero indicating either
no match or an error. The
.IR eflags
argument is the bitwise-inclusive OR of zero or more of the following
flags, which are defined in the
.IR <regex.h>
header:
.IP REG_NOTBOL 14
The first character of the string pointed to by
.IR string
is not the beginning of the line. Therefore, the
<circumflex>
character
(\c
.BR '^' ),
when taken as a special character, shall not match the beginning of
.IR string .
.IP REG_NOTEOL 14
The last character of the string pointed to by
.IR string
is not the end of the line. Therefore, the
<dollar-sign>
(\c
.BR '$' ),
when taken as a special character, shall not match the end of
.IR string .
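.P
The effect of the two flags can be sketched as follows. The helper
re_matches_eflags() is a hypothetical name introduced only for this
illustration; it compiles an Extended Regular Expression and passes the
given eflags through to regexec().

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: 1 on match, 0 on no match, -1 if the pattern
 * does not compile. eflags is passed unchanged to regexec().
 */
int
re_matches_eflags(const char *pattern, const char *string, int eflags)
{
    regex_t re;
    int status;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    status = regexec(&re, string, 0, NULL, eflags);
    regfree(&re);
    return status == 0 ? 1 : 0;
}
```

An anchored pattern such as "^abc" matches "abc" with eflags of 0 but
not with REG_NOTBOL; an unanchored pattern is unaffected by either flag.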
.P
If
.IR nmatch
is 0 or REG_NOSUB was set in the
.IR cflags
argument to
\fIregcomp\fR(),
then
\fIregexec\fR()
shall ignore the
.IR pmatch
argument. Otherwise, the application shall ensure that the
.IR pmatch
argument points to an array with at least
.IR nmatch
elements, and
\fIregexec\fR()
shall fill in the elements of that array with offsets of the substrings
of
.IR string
that correspond to the parenthesized subexpressions of
.IR pattern :
.IR pmatch [\c
.IR i ].\c
.IR rm_so
shall be the byte offset of the beginning and
.IR pmatch [\c
.IR i ].\c
.IR rm_eo
shall be one greater than the byte offset of the end of substring
.IR i .
(Subexpression
.IR i
begins at the
.IR i th
matched open parenthesis, counting from 1.) Offsets in
.IR pmatch [0]
identify the substring that corresponds to the entire regular
expression. Unused elements of
.IR pmatch
up to
.IR pmatch [\c
.IR nmatch \(mi1]
shall be filled with \(mi1. If there are more than
.IR nmatch
subexpressions in
.IR pattern
(\c
.IR pattern
itself counts as a subexpression), then
\fIregexec\fR()
shall still do the match, but shall record only the first
.IR nmatch
substrings.
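.P
A minimal sketch of reading these offsets follows. The helper
copy_group() and the limit MAX_GROUPS are assumptions made for the
example, not part of this interface. The helper copies the text of one
subexpression into a caller-supplied buffer, using rm_so and rm_eo as
byte offsets from the start of the string.

```c
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 10   /* arbitrary limit chosen for this sketch */

/*
 * Hypothetical helper: copy the text matched by subexpression `group`
 * of the ERE pattern into out (outsz bytes). Returns the substring
 * length, or -1 if the pattern is invalid, the string does not match,
 * the subexpression has no recorded match, or out is too small.
 */
int
copy_group(const char *pattern, const char *string, size_t group,
           char *out, size_t outsz)
{
    regex_t re;
    regmatch_t pm[MAX_GROUPS];
    int len = -1;

    if (group >= MAX_GROUPS)
        return -1;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;
    if (regexec(&re, string, MAX_GROUPS, pm, 0) == 0 &&
        pm[group].rm_so != -1) {
        /* rm_eo is one past the last byte, so the length is eo - so. */
        size_t n = (size_t)(pm[group].rm_eo - pm[group].rm_so);

        if (n < outsz) {
            memcpy(out, string + pm[group].rm_so, n);
            out[n] = '\0';
            len = (int)n;
        }
    }
    regfree(&re);
    return len;
}
```

Because unused elements are filled with -1, asking for a subexpression
number larger than the pattern contains simply yields -1 here.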
.P
When matching a basic or extended regular expression, any given
parenthesized subexpression of
.IR pattern
might participate in the match of several different substrings of
.IR string ,
or it might not match any substring even though the pattern as a whole
did match. The following rules shall be used to determine which
substrings to report in
.IR pmatch
when matching regular expressions:
.IP " 1." 4
If subexpression
.IR i
in a regular expression is not contained within another subexpression,
and it participated in the match several times, then the byte offsets
in
.IR pmatch [\c
.IR i ]
shall delimit the last such match.
.IP " 2." 4
If subexpression
.IR i
is not contained within another subexpression, and it did not
participate in an otherwise successful match, the byte offsets in
.IR pmatch [\c
.IR i ]
shall be \(mi1. A subexpression does not participate in the match when:
.sp
.RS
.BR '*'
or
.BR \(dq\e{\e}\(dq
appears immediately after the subexpression in a basic regular
expression, or
.BR '*' ,
.BR '?' ,
or
.BR \(dq{\|}\(dq
appears immediately after the subexpression in an extended regular
expression, and the subexpression did not match (matched 0 times)
.RE
.RS 4
.P
or:
.sp
.RS
.BR '|'
is used in an extended regular expression to select this subexpression
or another, and the other subexpression matched.
.RE
.RE
.IP " 3." 4
If subexpression
.IR i
is contained within another subexpression
.IR j ,
and
.IR i
is not contained within any other subexpression that is contained
within
.IR j ,
and a match of subexpression
.IR j
is reported in
.IR pmatch [\c
.IR j ],
then the match or non-match of subexpression
.IR i
reported in
.IR pmatch [\c
.IR i ]
shall be as described in 1. and 2. above, but within the substring
reported in
.IR pmatch [\c
.IR j ]
rather than the whole string. The offsets in
.IR pmatch [\c
.IR i ]
are still relative to the start of
.IR string .
.IP " 4." 4
If subexpression
.IR i
is contained in subexpression
.IR j ,
and the byte offsets in
.IR pmatch [\c
.IR j ]
are \(mi1, then the byte offsets in
.IR pmatch [\c
.IR i ]
shall also be \(mi1.
.IP " 5." 4
If subexpression
.IR i
matched a zero-length string, then both byte offsets in
.IR pmatch [\c
.IR i ]
shall be the byte offset of the character or null terminator
immediately following the zero-length string.
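.P
The rules above can be observed with a small sketch. The helper
group_offsets() is a hypothetical name used only here; it reports the
offsets that regexec() records for one subexpression of an Extended
Regular Expression. The array size of 4 is an arbitrary choice for the
example.

```c
#include <regex.h>

/*
 * Hypothetical helper: match string against the ERE pattern and store
 * the offsets recorded for subexpression `group` (0 to 3) in *so and
 * *eo. Returns 0 on a match and non-zero otherwise.
 */
int
group_offsets(const char *pattern, const char *string, size_t group,
              regoff_t *so, regoff_t *eo)
{
    regex_t re;
    regmatch_t pm[4];
    int status;

    if (group >= 4)
        return -1;
    status = regcomp(&re, pattern, REG_EXTENDED);
    if (status != 0)
        return status;
    status = regexec(&re, string, 4, pm, 0);
    if (status == 0) {
        *so = pm[group].rm_so;
        *eo = pm[group].rm_eo;
    }
    regfree(&re);
    return status;
}
```

For example, matching "(a|b)*c" against "abc" reports the last
iteration of the subexpression (offsets 1 and 2, rule 1), and matching
"(a)|(b)" against "b" reports \(mi1 for subexpression 1 (rule 2).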
.P
If, when
\fIregexec\fR()
is called, the locale is different from when the regular expression was
compiled, the result is undefined.
.P
If REG_NEWLINE is not set in
.IR cflags ,
then a
<newline>
in
.IR pattern
or
.IR string
shall be treated as an ordinary character. If REG_NEWLINE is set, then
<newline>
shall be treated as an ordinary character except as follows:
.IP " 1." 4
A
<newline>
in
.IR string
shall not be matched by a
<period>
outside a bracket expression or by any form of a non-matching list
(see the Base Definitions volume of POSIX.1\(hy2008,
.IR "Chapter 9" ", " "Regular Expressions").
.IP " 2." 4
A
<circumflex>
(\c
.BR '^' )
in
.IR pattern ,
when used to specify expression anchoring (see the Base Definitions volume of POSIX.1\(hy2008,
.IR "Section 9.3.8" ", " "BRE Expression Anchoring"),
shall match the zero-length string immediately after a
<newline>
in
.IR string ,
regardless of the setting of REG_NOTBOL.
.IP " 3." 4
A
<dollar-sign>
(\c
.BR '$' )
in
.IR pattern ,
when used to specify expression anchoring, shall match the zero-length
string immediately before a
<newline>
in
.IR string ,
regardless of the setting of REG_NOTEOL.
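.P
The three exceptions above can be sketched with a hypothetical helper,
re_matches_nl() (a name invented for this example), which compiles an
Extended Regular Expression with or without REG_NEWLINE.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: 1 on match, 0 on no match, -1 if the pattern
 * does not compile. REG_NEWLINE is added when use_newline is non-zero.
 */
int
re_matches_nl(const char *pattern, const char *string, int use_newline)
{
    regex_t re;
    int cflags = REG_EXTENDED | REG_NOSUB;
    int status;

    if (use_newline)
        cflags |= REG_NEWLINE;
    if (regcomp(&re, pattern, cflags) != 0)
        return -1;
    status = regexec(&re, string, 0, NULL, 0);
    regfree(&re);
    return status == 0 ? 1 : 0;
}
```

With REG_NEWLINE set, the anchors in "^b$" match around the middle line
of a three-line string, and neither a period nor a non-matching list
matches the newline itself.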
.P
The
\fIregfree\fR()
function frees any memory allocated by
\fIregcomp\fR()
associated with
.IR preg .
.P
The following constants are defined as the minimum set of error return
values, although other errors listed as implementation extensions in
.IR <regex.h>
are possible:
.IP REG_BADBR 14
Content of
.BR \(dq\e{\e}\(dq
invalid: not a number, number too large, more than two numbers, first
larger than second.
.IP REG_BADPAT 14
Invalid regular expression.
.IP REG_BADRPT 14
.BR '?' ,
.BR '*' ,
or
.BR '+'
not preceded by valid regular expression.
.IP REG_EBRACE 14
.BR \(dq\e{\e}\(dq
imbalance.
.IP REG_EBRACK 14
.BR \(dq[]\(dq
imbalance.
.IP REG_ECOLLATE 14
Invalid collating element referenced.
.IP REG_ECTYPE 14
Invalid character class type referenced.
.IP REG_EESCAPE 14
Trailing
<backslash>
character in pattern.
.IP REG_EPAREN 14
.BR \(dq\e(\e)\(dq
or
.BR \(dq()\(dq
imbalance.
.IP REG_ERANGE 14
Invalid endpoint in range expression.
.IP REG_ESPACE 14
Out of memory.
.IP REG_ESUBREG 14
Number in
.BR \(dq\edigit\(dq
invalid or in error.
.IP REG_NOMATCH 14
\fIregexec\fR()
failed to match.
.P
If more than one error occurs in processing a function call, any one
of the possible constants may be returned, as the order of detection is
unspecified.
.P
The
\fIregerror\fR()
function provides a mapping from error codes returned by
\fIregcomp\fR()
and
\fIregexec\fR()
to unspecified printable strings. It generates a string corresponding
to the value of the
.IR errcode
argument, which the application shall ensure is the last non-zero value
returned by
\fIregcomp\fR()
or
\fIregexec\fR()
with the given value of
.IR preg .
If
.IR errcode
is not such a value, the content of the generated string is unspecified.
.P
If
.IR preg
is a null pointer, but
.IR errcode
is a value returned by a previous call to
\fIregexec\fR()
or
\fIregcomp\fR(),
the
\fIregerror\fR()
function still generates an error string corresponding to the value of
.IR errcode ,
but it might not be as detailed under some implementations.
.P
If the
.IR errbuf_size
argument is not 0,
\fIregerror\fR()
shall place the generated string into the buffer of size
.IR errbuf_size
bytes pointed to by
.IR errbuf .
If the string (including the terminating null) cannot fit in the
buffer,
\fIregerror\fR()
shall truncate the string and null-terminate the result.
.P
If
.IR errbuf_size
is 0,
\fIregerror\fR()
shall ignore the
.IR errbuf
argument, and return the size of the buffer needed to hold the
generated string.
.P
If the
.IR preg
argument to
\fIregexec\fR()
or
\fIregfree\fR()
is not a compiled regular expression returned by
\fIregcomp\fR(),
the result is undefined. A
.IR preg
is no longer treated as a compiled regular expression after it is given
to
\fIregfree\fR().
.SH "RETURN VALUE"
Upon successful completion, the
\fIregcomp\fR()
function shall return 0. Otherwise, it shall return an integer value
indicating an error as described in
.IR <regex.h> ,
and the content of
.IR preg
is undefined. If a code is returned, the interpretation shall be as
given in
.IR <regex.h> .
.P
If
\fIregcomp\fR()
detects an invalid RE, it may return REG_BADPAT, or it may return one
of the error codes that more precisely describes the error.
.P
Upon successful completion, the
\fIregexec\fR()
function shall return 0. Otherwise, it shall return REG_NOMATCH to
indicate no match.
.P
Upon successful completion, the
\fIregerror\fR()
function shall return the number of bytes needed to hold the entire
generated string, including the null termination. If the return value
is greater than
.IR errbuf_size ,
the string returned in the buffer pointed to by
.IR errbuf
has been truncated.
.P
The
\fIregfree\fR()
function shall not return a value.
.SH ERRORS
No errors are defined.
.LP
.IR "The following sections are informative."
.SH "EXAMPLES"
.sp
.RS 4
.nf
\fB
#include <stddef.h>
#include <regex.h>
.P
/*
 * Match string against the extended regular expression in
 * pattern, treating errors as no match.
 *
 * Return 1 for match, 0 for no match.
 */
.P
int
match(const char *string, const char *pattern)
{
    int    status;
    regex_t    re;
.P
    if (regcomp(&re, pattern, REG_EXTENDED|REG_NOSUB) != 0) {
        return(0);      /* Compile error: treat as no match. */
    }
    status = regexec(&re, string, (size_t) 0, NULL, 0);
    regfree(&re);
    if (status != 0) {
        return(0);      /* No match (or an error). */
    }
    return(1);
}
.fi \fR
.P
.RE
.P
The following demonstrates how the REG_NOTBOL flag could be used with
\fIregexec\fR()
to find all substrings in a line that match a pattern supplied by a user.
(For simplicity of the example, very little error checking is done.)
.sp
.RS 4
.nf
\fB
(void) regcomp (&re, pattern, 0);
/* Offsets in pm are relative to p, the start of each search. */
p = buffer;
/* This call to regexec() finds the first match on the line. */
error = regexec (&re, p, 1, &pm, 0);
while (error == 0) {  /* While matches found. */
    /* Substring found between p + pm.rm_so and p + pm.rm_eo. */
    p += pm.rm_eo;
    if (pm.rm_eo == pm.rm_so && *p++ == '\e0')
        break;  /* Empty match at the end of the line. */
    /* This call to regexec() finds the next match. */
    error = regexec (&re, p, 1, &pm, REG_NOTBOL);
}
.fi \fR
.P
.RE
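.P
A self-contained variant of the same loop is sketched below. The
function name count_matches() is hypothetical, and counting matches is
just one possible use of the loop. The sketch assumes a Basic Regular
Expression and steps one byte past an empty match, which is a
simplification in locales with multi-byte characters.

```c
#include <regex.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the non-overlapping matches of the BRE
 * pattern in line. Returns -1 if the pattern does not compile.
 */
int
count_matches(const char *pattern, const char *line)
{
    regex_t re;
    regmatch_t pm;
    const char *p = line;   /* start of the current search */
    int eflags = 0;         /* first search starts at beginning of line */
    int count = 0;

    if (regcomp(&re, pattern, 0) != 0)
        return -1;
    while (regexec(&re, p, 1, &pm, eflags) == 0) {
        count++;
        /* Offsets are relative to p, so advance p, not line. */
        p += pm.rm_eo;
        if (pm.rm_eo == pm.rm_so) {
            /* Empty match: step one byte forward to guarantee progress. */
            if (*p == '\0')
                break;
            p++;
        }
        /* Later searches no longer start at the beginning of the line. */
        eflags = REG_NOTBOL;
    }
    regfree(&re);
    return count;
}
```

Because every search after the first passes REG_NOTBOL, a pattern
anchored with a circumflex can match at most once per line.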
.SH "APPLICATION USAGE"
An application could use:
.sp
.RS 4
.nf
\fB
regerror(code,preg,(char *)NULL,(size_t)0)
.fi \fR
.P
.RE
.P
to find out how big a buffer is needed for the generated string,
\fImalloc\fR()
a buffer to hold the string, and then call
\fIregerror\fR()
again to get the string. Alternatively, it could allocate a fixed,
static buffer that is big enough to hold most strings, and then use
\fImalloc\fR()
to allocate a larger buffer if it finds that this is too small.
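.P
The first approach described above might look like the following
sketch. The function name regex_error_string() is an assumption made
for the example; only regerror() and malloc() are real interfaces.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: return a malloc()ed copy of the message for
 * code, or NULL if memory cannot be allocated. The caller frees it.
 */
char *
regex_error_string(int code, const regex_t *preg)
{
    /* With a size of 0, regerror() only reports the space needed. */
    size_t need = regerror(code, preg, NULL, 0);
    char *msg = malloc(need);

    if (msg != NULL)
        (void) regerror(code, preg, msg, need);
    return msg;
}
```

The size returned by the first call includes the terminating null, so
the buffer can be allocated with exactly that many bytes.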
.P
To match a pattern as described in the Shell and Utilities volume of POSIX.1\(hy2008,
.IR "Section 2.13" ", " "Pattern Matching Notation",
use the
\fIfnmatch\fR()
function.
.SH RATIONALE
The
\fIregexec\fR()
function must fill in all
.IR nmatch
elements of
.IR pmatch ,
where
.IR nmatch
and
.IR pmatch
are supplied by the application, even if some elements of
.IR pmatch
do not correspond to subexpressions in
.IR pattern .
The application developer should note that there is probably no reason
for using a value of
.IR nmatch
that is larger than
.IR preg \(mi>\c
.IR re_nsub +1.
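.P
One way to follow this advice is to size the array from re_nsub at run
time, as in the sketch below. The function name match_groups() is
hypothetical, and the sketch assumes the expression was compiled
without REG_NOSUB so that re_nsub has been set.

```c
#include <regex.h>
#include <stdlib.h>

/*
 * Hypothetical helper: run regexec() with an array sized from re_nsub.
 * On a match, returns a malloc()ed array of *count elements that the
 * caller frees; returns NULL on no match or allocation failure.
 * preg must have been compiled without REG_NOSUB.
 */
regmatch_t *
match_groups(const regex_t *preg, const char *string, size_t *count)
{
    size_t n = preg->re_nsub + 1;   /* element 0 is the whole match */
    regmatch_t *pm = malloc(n * sizeof *pm);

    if (pm == NULL)
        return NULL;
    if (regexec(preg, string, n, pm, 0) != 0) {
        free(pm);
        return NULL;
    }
    *count = n;
    return pm;
}
```

For the expression "(a)(b)?" the array has three elements, and the
element for the optional second subexpression holds \(mi1 when it did
not participate in the match.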
.P
The REG_NEWLINE flag supports a use of RE matching that is needed in
some applications like text editors. In such applications, the user
supplies an RE asking the application to find a line that matches the
given expression. An anchor in such an RE anchors at the beginning or
end of any line. Such an application can pass a sequence of
<newline>-separated
lines to
\fIregexec\fR()
as a single long string and specify REG_NEWLINE to
\fIregcomp\fR()
to get the desired behavior. The application must ensure that there are
no explicit
<newline>
characters in
.IR pattern
if it wants to ensure that any match occurs entirely within a single
line.
.P
The REG_NEWLINE flag affects the behavior of
\fIregexec\fR(),
but it is in the
.IR cflags
parameter to
\fIregcomp\fR()
to allow flexibility of implementation. Some implementations will want
to generate the same compiled RE in
\fIregcomp\fR()
regardless of the setting of REG_NEWLINE and have
\fIregexec\fR()
handle anchors differently based on the setting of the flag. Other
implementations will generate different compiled REs based on the
REG_NEWLINE.
.P
The REG_ICASE flag supports the operations taken by the
.IR grep
.BR \(mii
option and the historical implementations of
.IR ex
and
.IR vi .
Including this flag will make it easier for application code to be
written that does the same thing as these utilities.
.P
The substrings reported in
.IR pmatch [\|]
are defined using offsets from the start of the string rather than
pointers. This allows type-safe access to both constant and non-constant
strings.
.P
The type
.BR regoff_t
is used for the elements of
.IR pmatch [\|]
to ensure that the application can represent large arrays in memory
(important for an application conforming to the Shell and Utilities volume of POSIX.1\(hy2008).
.P
The 1992 edition of this standard required
.BR regoff_t
to be at least as wide as
.BR off_t ,
to facilitate future extensions in which the string to be searched is
taken from a file. However, these future extensions have not appeared.
The requirement rules out popular implementations with 32-bit
.BR regoff_t
and 64-bit
.BR off_t ,
so it has been removed.
.P
The standard developers rejected the inclusion of a
\fIregsub\fR()
function that would be used to do substitutions for a matched RE. While
such a routine would be useful to some applications, its utility would
be much more limited than the matching function described here. Both RE
parsing and substitution are possible to implement without support
other than that required by the ISO\ C standard, but matching is much more
complex than substituting. The only difficult part of substitution,
given the information supplied by
\fIregexec\fR(),
is finding the next character in a string when there can be multi-byte
characters. That is a much larger issue, and one that needs a more
general solution.
.P
The
.IR errno
variable has not been used for error returns to avoid filling the
.IR errno
name space for this feature.
.P
The interface is defined so that the matched substrings
.IR rm_so
and
.IR rm_eo
are in a separate
.BR regmatch_t
structure instead of in
.BR regex_t .
This allows a single compiled RE to be used simultaneously in several
contexts; in
\fImain\fR()
and a signal handler, perhaps, or in multiple threads of lightweight
processes. (The
.IR preg
argument to
\fIregexec\fR()
is declared with type
.BR const ,
so the implementation is not permitted to use the structure to store
intermediate results.) It also allows an application to request an
arbitrary number of substrings from an RE. The number of
subexpressions in the RE is reported in
.IR re_nsub
in
.IR preg .
With this change to
\fIregexec\fR(),
consideration was given to dropping the REG_NOSUB flag since the user
can now specify this with a zero
.IR nmatch
argument to
\fIregexec\fR().
However, keeping REG_NOSUB allows an implementation to use a different
(perhaps more efficient) algorithm if it knows in
\fIregcomp\fR()
that no subexpressions need be reported. The implementation is only
required to fill in
.IR pmatch
if
.IR nmatch
is not zero and if REG_NOSUB is not specified. Note that the
.BR size_t
type, as defined in the ISO\ C standard, is unsigned, so the description of
\fIregexec\fR()
does not need to address negative values of
.IR nmatch .
.P
REG_NOTBOL was added to allow an application to do repeated searches
for the same pattern in a line. If the pattern contains a
<circumflex>
character that should match the beginning of a line, then the pattern
should only match when matched against the beginning of the line.
Without the REG_NOTBOL flag, the application could rewrite the
expression for subsequent matches, but in the general case this would
require parsing the expression. The need for REG_NOTEOL is not as
clear; it was added for symmetry.
.P
The addition of the
\fIregerror\fR()
function addresses the historical need for conforming application
programs to have access to error information more than ``Function
failed to compile/match your RE for unknown reasons''.
.P
This interface provides for two different methods of dealing with error
conditions. The specific error codes (REG_EBRACE, for example), defined
in
.IR <regex.h> ,
allow an application to recover from an error if it is so able. Many
applications, especially those that use patterns supplied by a user,
will not try to deal with specific error cases, but will just use
\fIregerror\fR()
to obtain a human-readable error message to present to the user.
.P
The
\fIregerror\fR()
function uses a scheme similar to
\fIconfstr\fR()
to deal with the problem of allocating memory to hold the generated
string. The scheme used by
\fIstrerror\fR()
in the ISO\ C standard was considered unacceptable since it creates difficulties
for multi-threaded applications.
.P
The
.IR preg
argument is provided to
\fIregerror\fR()
to allow an implementation to generate a more descriptive message than
would be possible with
.IR errcode
alone. An implementation might, for example, save the character offset
of the offending character of the pattern in a field of
.IR preg ,
and then include that in the generated message string. The
implementation may also ignore
.IR preg .
.P
A REG_FILENAME flag was considered, but omitted. This flag caused
\fIregexec\fR()
to match patterns as described in the Shell and Utilities volume of POSIX.1\(hy2008,
.IR "Section 2.13" ", " "Pattern Matching Notation"
instead of REs. This service is now provided by the
\fIfnmatch\fR()
function.
.P
Notice that there is a difference in philosophy between the ISO\ POSIX\(hy2:\|1993 standard and
POSIX.1\(hy2008 in how to handle a ``bad'' regular expression. The ISO\ POSIX\(hy2:\|1993 standard says
that many bad constructs ``produce undefined results'', or that
``the interpretation is undefined''. POSIX.1\(hy2008, however, says that the
interpretation of such REs is unspecified. The term ``undefined'' means
that the action by the application is an error, of similar severity
to passing a bad pointer to a function.
.P
The
\fIregcomp\fR()
and
\fIregexec\fR()
functions are required to accept any null-terminated string as the
.IR pattern
argument. If the meaning of the string is ``undefined'', the behavior
of the function is ``unspecified''. POSIX.1\(hy2008 does not specify how the
functions will interpret the pattern; they might return error codes, or
they might do pattern matching in some completely unexpected way, but
they should not do something like abort the process.
.SH "FUTURE DIRECTIONS"
None.
.SH "SEE ALSO"
.IR "\fIfnmatch\fR\^(\|)",
.IR "\fIglob\fR\^(\|)"
.P
The Base Definitions volume of POSIX.1\(hy2008,
.IR "Chapter 9" ", " "Regular Expressions",
.IR "\fB<regex.h>\fP",
.IR "\fB<sys/types.h>\fP"
.P
The Shell and Utilities volume of POSIX.1\(hy2008,
.IR "Section 2.13" ", " "Pattern Matching Notation"
.SH COPYRIGHT
Portions of this text are reprinted and reproduced in electronic form
from IEEE Std 1003.1, 2013 Edition, Standard for Information Technology
-- Portable Operating System Interface (POSIX), The Open Group Base
Specifications Issue 7, Copyright (C) 2013 by the Institute of
Electrical and Electronics Engineers, Inc and The Open Group.
(This is POSIX.1-2008 with the 2013 Technical Corrigendum 1 applied.) In the
event of any discrepancy between this version and the original IEEE and
The Open Group Standard, the original IEEE and The Open Group Standard
is the referee document. The original Standard can be obtained online at
http://www.unix.org/online.html .
.P
Any typographical or formatting errors that appear
in this page are most likely
to have been introduced during the conversion of the source files to
man page format. To report such errors, see
https://www.kernel.org/doc/man-pages/reporting_bugs.html .