Oniguruma
=========

https://github.com/kkos/oniguruma

Oniguruma is a modern and flexible regular expressions library. It
encompasses features from different regular expression implementations
that traditionally exist in different languages. It comes close to
being a complete superset of all regular expression features found
in other regular expression implementations.

Its features include:
* Character encoding can be specified per regular expression object.
* Several regular expression types are supported:
  * Oniguruma (native)
  * POSIX
  * Grep
  * GNU Regex
  * Perl
  * Java
  * Ruby
  * Emacs

Supported character encodings:

  ASCII, UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE,
  EUC-JP, EUC-TW, EUC-KR, EUC-CN,
  Shift_JIS, Big5, GB18030, KOI8-R, CP1251,
  ISO-8859-1, ISO-8859-2, ISO-8859-3, ISO-8859-4, ISO-8859-5,
  ISO-8859-6, ISO-8859-7, ISO-8859-8, ISO-8859-9, ISO-8859-10,
  ISO-8859-11, ISO-8859-13, ISO-8859-14, ISO-8859-15, ISO-8859-16

* GB18030: contributed by KUBO Takehiro
* CP1251:  contributed by Byte


New feature of version 6.8.2
--------------------------

* Fix: #80 UChar in header causes issue
* NEW API: onig_set_callout_user_data_of_match_param()  (* omission in 6.8.0)
* Add doc/CALLOUTS.API and doc/CALLOUTS.API.ja


New feature of version 6.8.1
--------------------------

* Update shared library version to 5.0.0 for API incompatible changes from 6.7.1


New feature of version 6.8.0
--------------------------

* Retry-limit-in-match function enabled by default
* NEW: configure option --enable-posix-api=no  (* enabled by default)
* NEW API: onig_search_with_param(), onig_match_with_param()
* NEW: Callouts of contents  (?{...contents...}) (?{...}\[tag]\[X<>]) (?{{...}})
* NEW: Callouts of name      (*name) (*name\[tag]{args...})
* NEW: Builtin callouts  (*FAIL) (*MISMATCH) (*ERROR{n}) (*COUNT) (*MAX{n}) etc.
* Examples of Callouts program: [callout.c](sample/callout.c), [count.c](sample/count.c), [echo.c](sample/echo.c)

(* The callout function API is experimental and not yet finalized. It is undocumented in this release.)


New feature of version 6.7.1
--------------------------

* NEW: Mechanism of retry-limit-in-match (* disabled by default)


New feature of version 6.7.0
--------------------------

* NEW: hexadecimal codepoint \uHHHH
* NEW: add ONIG_SYNTAX_ONIGURUMA (== ONIG_SYNTAX_DEFAULT)
* Disabled \N and \O on ONIG_SYNTAX_RUBY
* Reduced size of object file


New feature of version 6.6.0
--------------------------

* NEW: ASCII only mode options for character type/property (?WDSP)
* NEW: Extended Grapheme Cluster boundary \y, \Y (*original)
* NEW: Extended Grapheme Cluster \X
* Range-clear (Absent-clear) operator restores previous range in retractions.


New feature of version 6.5.0
--------------------------

* NEW: \K (keep)
* NEW: \R (general newline) \N (no newline)
* NEW: \O (true anychar)
* NEW: if-then-else   (?(...)...\|...)
* NEW: Backreference validity checker (?(xxx)) (*original)
* NEW: Absent repeater (?~absent)  \[is equal to (?\~\|absent|\O*)]
* NEW: Absent expression   (?~|absent|expr)  (*original)
* NEW: Absent stopper (?~|absent)     (*original)


New feature of version 6.4.0
--------------------------

* Fix fatal problem of endless repeat on Windows
* NEW: call zero (call the total regexp) \g<0>
* NEW: relative backref/call by positive number \k<+n>, \g<+n>


New feature of version 6.3.0
--------------------------

* NEW: octal codepoint \o{.....}
* Fixed CVE-2017-9224
* Fixed CVE-2017-9225
* Fixed CVE-2017-9226
* Fixed CVE-2017-9227
* Fixed CVE-2017-9228
* Fixed CVE-2017-9229


New feature of version 6.1.2
--------------------------

* Allow word bound, word begin and word end in look-behind.
* NEW option: ONIG_OPTION_CHECK_VALIDITY_OF_STRING

New feature of version 6.1
--------------------------

* Improved doc/RE
* NEW API: onig_scan()

New feature of version 6.0
--------------------------

* Update Unicode 8.0 Property/Case-folding
* NEW API: onig_unicode_define_user_property()


License
-------

  BSD license.


Install
-------

### Case 1: Unix and Cygwin platform

   1. autoreconf -vfi   (* only if the configure script is not found)

   2. ./configure
   3. make
   4. make install

   * uninstall

         make uninstall

   * configuration check

         onig-config --cflags
         onig-config --libs
         onig-config --prefix
         onig-config --exec-prefix



### Case 2: Windows 64/32bit platform (Visual Studio)

   Execute make_win64 or make_win32.

      onig_s.lib:  static link library
      onig.dll:    dynamic link library

   * test (ASCII/Shift_JIS)

      1. cd src
      2. copy ..\windows\testc.c .
      3. nmake -f Makefile.windows ctest

   (Checked with Visual Studio Community 2015)



Regular Expressions
-------------------

  See [doc/RE](doc/RE) or [doc/RE.ja](doc/RE.ja) for Japanese.


Usage
-----

  Include oniguruma.h in your program to use the Oniguruma API.
  See doc/API for the Oniguruma API.

  To disable the definition of the UChar type (== unsigned char)
  in oniguruma.h, define ONIG_ESCAPE_UCHAR_COLLISION before
  including oniguruma.h.

  To disable the definition of the regex_t type in oniguruma.h,
  define ONIG_ESCAPE_REGEX_T_COLLISION before including oniguruma.h.

  Example compile/link command line on Unix or Cygwin
  (prefix == /usr/local case):

    cc sample.c -L/usr/local/lib -lonig


  To use the static link library (onig_s.lib) on Win32,
  add the option -DONIG_EXTERN=extern to the C compiler command line.



Sample Programs
---------------

|File                  |Description                               |
|:---------------------|:-----------------------------------------|
|sample/simple.c       |example of the minimum (Oniguruma API)    |
|sample/names.c        |example of the named group callback.      |
|sample/encode.c       |example of some encodings.                |
|sample/listcap.c      |example of the capture history.           |
|sample/posix.c        |POSIX API sample.                         |
|sample/scan.c         |example of using onig_scan().             |
|sample/sql.c          |example of the variable meta characters.  |
|sample/user_property.c|example of user defined Unicode property. |
|sample/callout.c      |example of callouts.                      |


Test Programs

|File               |Description                            |
|:------------------|:--------------------------------------|
|sample/syntax.c    |Perl, Java and ASIS syntax test.       |
|sample/crnl.c      |--enable-crnl-as-line-terminator test  |



Source Files
------------

|File               |Description                                             |
|:------------------|:-------------------------------------------------------|
|oniguruma.h        |Oniguruma API header file (public)                      |
|onig-config.in     |configuration check program template                    |
|regenc.h           |character encodings framework header file               |
|regint.h           |internal definitions                                    |
|regparse.h         |internal definitions for regparse.c and regcomp.c       |
|regcomp.c          |compiling and optimization functions                    |
|regenc.c           |character encodings framework                           |
|regerror.c         |error message function                                  |
|regext.c           |extended API functions (deluxe version API)             |
|regexec.c          |search and match functions                              |
|regparse.c         |parsing functions.                                      |
|regsyntax.c        |pattern syntax functions and built-in syntax definitions|
|regtrav.c          |capture history tree data traverse functions            |
|regversion.c       |version info function                                   |
|st.h               |hash table functions header file                        |
|st.c               |hash table functions                                    |
|oniggnu.h          |GNU regex API header file (public)                      |
|reggnu.c           |GNU regex API functions                                 |
|onigposix.h        |POSIX API header file (public)                          |
|regposerr.c        |POSIX error message function                            |
|regposix.c         |POSIX API functions                                     |
|mktable.c          |character type table generator                          |
|ascii.c            |ASCII encoding                                          |
|euc_jp.c           |EUC-JP encoding                                         |
|euc_tw.c           |EUC-TW encoding                                         |
|euc_kr.c           |EUC-KR, EUC-CN encoding                                 |
|sjis.c             |Shift_JIS encoding                                      |
|big5.c             |Big5      encoding                                      |
|gb18030.c          |GB18030   encoding                                      |
|koi8.c             |KOI8      encoding                                      |
|koi8_r.c           |KOI8-R    encoding                                      |
|cp1251.c           |CP1251    encoding                                      |
|iso8859_1.c        |ISO-8859-1 (Latin-1)                                    |
|iso8859_2.c        |ISO-8859-2 (Latin-2)                                    |
|iso8859_3.c        |ISO-8859-3 (Latin-3)                                    |
|iso8859_4.c        |ISO-8859-4 (Latin-4)                                    |
|iso8859_5.c        |ISO-8859-5 (Cyrillic)                                   |
|iso8859_6.c        |ISO-8859-6 (Arabic)                                     |
|iso8859_7.c        |ISO-8859-7 (Greek)                                      |
|iso8859_8.c        |ISO-8859-8 (Hebrew)                                     |
|iso8859_9.c        |ISO-8859-9 (Latin-5 or Turkish)                         |
|iso8859_10.c       |ISO-8859-10 (Latin-6 or Nordic)                         |
|iso8859_11.c       |ISO-8859-11 (Thai)                                      |
|iso8859_13.c       |ISO-8859-13 (Latin-7 or Baltic Rim)                     |
|iso8859_14.c       |ISO-8859-14 (Latin-8 or Celtic)                         |
|iso8859_15.c       |ISO-8859-15 (Latin-9 or West European with Euro)        |
|iso8859_16.c       |ISO-8859-16 (Latin-10)                                  |
|utf8.c             |UTF-8    encoding                                       |
|utf16_be.c         |UTF-16BE encoding                                       |
|utf16_le.c         |UTF-16LE encoding                                       |
|utf32_be.c         |UTF-32BE encoding                                       |
|utf32_le.c         |UTF-32LE encoding                                       |
|unicode.c          |common codes of Unicode encoding                        |
|unicode_fold_data.c|Unicode folding data                                    |
|win32/Makefile     |Makefile for Win32 (VC++)                               |
|win32/config.h     |config.h for Win32                                      |