Blame os400/iconv/README.iconv

Packit 423ecb
IBM OS/400 implements iconv in an odd way:
- Type iconv_t is a structure: therefore objects of this type cannot be
  compared to (iconv_t) -1.
- Supported character set names are all of the form IBMCCSIDccsid..., where
  ccsid is a decimal 5-digit integer identifying an IBM coded character set.
  In addition, character set names have to be given in EBCDIC.
  Standard character set names like "UTF-8" are NOT recognized.
- The prototype of iconv_open() does not declare its parameters as const,
  although they are not altered.

Since libiconv does not support EBCDIC, using that package as a replacement
here is not a solution.

For these reasons, the code in this directory implements a wrapper around the
OS/400 iconv implementation. The wrapper performs the following transformations:
- Type iconv_t is a pointer. Although OS/400 pointers are odd, comparing
  with (iconv_t) -1 is OK.
- All IANA character set names are recognized in a coding- and case-insensitive
  way, provided an equivalent CCSID exists. See
  http://www.iana.org/assignments/character-sets/character-sets.xhtml
- All CCSIDs from the association file can be expressed as IBMCCSIDxxxxx, where
  xxxxx is the 5-digit CCSID; no null terminator is required. Alternate codes
  are of the form ibm-xxx (null-terminated), where xxx is the integer CCSID with
  leading zeroes stripped.
- If an IANA MIBenum is defined for a CCSID, the name iana-xxx can be used,
  where xxx is the integer MIBenum without leading zeroes.
- In addition, some aliases are taken from the association file. Examples
  are: ASCII, EBCDIC, UTF8.
- Prototype of iconv_open() has const parameters.
- Character code names can be given in any character encoding.
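
As an illustration of the numeric name forms listed above, here is a minimal
sketch of how IBMCCSIDxxxxx and ibm-xxx names could be reduced to a CCSID.
It is not the wrapper's actual code: match_prefix() and name_to_ccsid() are
hypothetical helpers, the input is assumed to be ASCII (the real wrapper
accepts names in any coding), and rejecting a leading zero after "ibm-" is
an assumption drawn from "leading zeroes stripped".

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: case-insensitive prefix match.
   Returns the prefix length on success, 0 on mismatch. */
static size_t match_prefix(const char *name, const char *prefix)
{
    size_t i;

    for (i = 0; prefix[i] != '\0'; i++) {
        if (tolower((unsigned char) name[i]) !=
            tolower((unsigned char) prefix[i]))
            return 0;
    }
    return i;
}

/* Hypothetical helper: map "IBMCCSIDxxxxx" or "ibm-xxx" to a CCSID.
   Returns -1 if the name has neither form. Assumes ASCII input. */
static long name_to_ccsid(const char *name)
{
    size_t n, i;
    long ccsid = 0;

    n = match_prefix(name, "IBMCCSID");
    if (n != 0) {
        /* Exactly five decimal digits; nothing beyond them is read,
           so no null terminator is needed after the digits. */
        for (i = 0; i < 5; i++) {
            if (!isdigit((unsigned char) name[n + i]))
                return -1;
            ccsid = ccsid * 10 + (name[n + i] - '0');
        }
        return ccsid;
    }

    n = match_prefix(name, "ibm-");
    if (n != 0) {
        /* One to five digits, no leading zero, null-terminated. */
        if (name[n] == '0')
            return -1;
        for (i = 0; isdigit((unsigned char) name[n + i]); i++) {
            if (i >= 5)
                return -1;
            ccsid = ccsid * 10 + (name[n + i] - '0');
        }
        if (i == 0 || name[n + i] != '\0')
            return -1;
        return ccsid;
    }

    return -1;
}
```

With this sketch, name_to_ccsid("IBMCCSID01208") and name_to_ccsid("ibm-1208")
both give 1208, while a standard name such as "UTF-8" gives -1 and would have
to be resolved through the name table instead.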

Character set names to CCSID conversion.
- http://www.iana.org/assignments/character-sets/character-sets.xhtml provides
  all IANA-registered character set names and aliases, each associated with a
  MIBenum, which is a unique character set identifier.
- A hand-maintained file ccsid_mibenum.xml associates IBM CCSIDs with
  IANA MIBenums.
- An OS/400 C program (in subdirectory bldcsndfa) reads the files mentioned
  above and generates a C file containing a deterministic finite automaton
  that recognizes all possible character set names and associates each of
  them with its corresponding CCSID. This program can only be run on OS/400
  since it uses the native iconv support for EBCDIC.
- Since these operations are tedious and the table generation needs
  bootstrapping with libxml2, the generated automaton is stored within the
  sources and need not be rebuilt at each compilation. However, the generator
  source is provided here to allow new table generation with conversion
  tables that were not available at the time of the original generation.
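
To make the automaton idea concrete, here is a minimal sketch of a
table-driven deterministic finite automaton that maps names to CCSIDs
case-insensitively. It is not the generated code: the table layout, the
dfa_build(), dfa_add() and dfa_lookup() names and the four sample entries
are assumptions for illustration. The sketch builds its table at run time
and only handles ASCII input, whereas the real automaton is generated ahead
of time on OS/400 and stored as a C file.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_STATES 64    /* enough for the sample names below */
#define ALPHABET   128   /* 7-bit ASCII input symbols (assumption) */

/* Transition table: 0 means "no transition"; state 0 is the start state. */
static short next_state[MAX_STATES][ALPHABET];
/* CCSID attached to each accepting state; 0 means "not accepting". */
static long accept_ccsid[MAX_STATES];
static int state_count = 1;

/* Add one name to the automaton. Returns 0 on success, -1 on failure. */
static int dfa_add(const char *name, long ccsid)
{
    int s = 0;

    for (; *name != '\0'; name++) {
        unsigned char c = (unsigned char) *name;

        if (c >= ALPHABET)
            return -1;
        c = (unsigned char) tolower(c);     /* fold case at build time */
        if (next_state[s][c] == 0) {
            if (state_count >= MAX_STATES)
                return -1;
            next_state[s][c] = (short) state_count++;
        }
        s = next_state[s][c];
    }
    accept_ccsid[s] = ccsid;
    return 0;
}

/* Load a few illustrative entries (not the real association file). */
static int dfa_build(void)
{
    static const struct { const char *name; long ccsid; } sample[] = {
        { "UTF-8", 1208 }, { "UTF8", 1208 },
        { "ISO-8859-1", 819 }, { "IBM037", 37 }
    };
    size_t i;

    for (i = 0; i < sizeof sample / sizeof sample[0]; i++)
        if (dfa_add(sample[i].name, sample[i].ccsid) != 0)
            return -1;
    return 0;
}

/* Walk the automaton: one table lookup per input character.
   Returns the CCSID, or -1 if the name is not recognized. */
static long dfa_lookup(const char *name)
{
    int s = 0;

    for (; *name != '\0'; name++) {
        unsigned char c = (unsigned char) *name;

        if (c >= ALPHABET)
            return -1;
        s = next_state[s][tolower(c)];
        if (s == 0)
            return -1;
    }
    return accept_ccsid[s] != 0 ? accept_ccsid[s] : -1;
}
```

After dfa_build() succeeds, dfa_lookup("utf-8") and dfa_lookup("UTF8") both
give 1208, and an unknown name gives -1. Lookup cost depends only on the
length of the name, not on how many names the table holds, which is one
plausible reason for precomputing an automaton rather than comparing strings.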