<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for Linux/x86 (vers 1 September 2005), see www.w3.org" />
<meta http-equiv="Content-Type" content=
"text/html; charset=us-ascii" />
<title>docbook2X: utf8trans</title>
<link rel="stylesheet" href="docbook2X.css" type="text/css" />
<link rev="made" href="mailto:stevecheng@users.sourceforge.net" />
<meta name="generator" content="DocBook XSL Stylesheets V1.68.1" />
<link rel="next" href="faq.html" title="docbook2X: FAQ" />
</head>
<body>
utf8trans

<a href="charsets.html">&lt;&lt; Previous</a> Character set conversion <a href="faq.html">Next &gt;&gt;</a>

Name

utf8trans - Transliterate UTF-8 characters according to a table

Synopsis

utf8trans charmap [file...]

Description

utf8trans transliterates characters in the specified files (or
standard input, if no files are specified) and writes the output to
standard output. All input and output are in the UTF-8 encoding.

This program is usually used to render characters in Unicode text
files as markup escapes or ASCII transliterations. (It is not
intended for general charset conversions.) It provides functionality
similar to the character maps in XSLT 2.0 (Extensible Stylesheet
Language Transformations, version 2.0).
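
To make the idea of a character map concrete, here is a minimal Python sketch that maps Unicode code points to replacement strings, one an ASCII transliteration and one a markup escape. It is not utf8trans itself: the entries in CHAR_MAP and the name apply_char_map are hypothetical, chosen only for illustration.

```python
# Hypothetical character map: Unicode code point -> replacement string.
CHAR_MAP = {
    0x00E9: "e",       # LATIN SMALL LETTER E WITH ACUTE -> ASCII transliteration
    0x2019: "'",       # RIGHT SINGLE QUOTATION MARK -> ASCII apostrophe
    0x00A0: "&nbsp;",  # NO-BREAK SPACE -> markup escape
}


def apply_char_map(text, char_map):
    """Replace every mapped character; unmapped characters pass through."""
    # str.translate accepts a dict keyed by code point, with string values.
    return text.translate(char_map)


print(apply_char_map("caf\u00e9\u00a0d\u2019art", CHAR_MAP))
# prints: cafe&nbsp;d'art
```

Characters without an entry are copied unchanged, which is why such a map suits targeted escaping rather than general charset conversion.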

Options

-m, --modify
    Modifies the given files in place, replacing their contents with
    the transliterated output instead of sending it to standard output.

    This option is useful for efficient transliteration of many files
    at once.

--help
    Show brief usage information and exit.

--version
    Show version and exit.
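
The effect of --modify described above can be sketched in Python as follows. This is only an illustration of rewriting several files in place with one character map; how utf8trans actually performs the rewrite is not stated here, and the name modify_in_place and the temporary-file strategy are assumptions of this sketch.

```python
import os
import tempfile


def modify_in_place(paths, char_map):
    """Rewrite each file with its transliterated text (sketch of --modify)."""
    for path in paths:
        # newline="" keeps line endings exactly as they are in the file.
        with open(path, encoding="utf-8", newline="") as src:
            text = src.read()
        # Assumption of this sketch: write to a temporary file in the same
        # directory, then swap it in, so a failure midway does not leave a
        # half-written file behind. File permissions are not preserved.
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp_path = tempfile.mkstemp(dir=directory)
        with os.fdopen(fd, "w", encoding="utf-8", newline="") as dst:
            dst.write(text.translate(char_map))
        os.replace(tmp_path, path)
```

The character map is built once and reused for every file, which is the sense in which a single run over many files is cheaper than one run per file.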

Usage

The translation is done according to the rules in the “character
map”, which is read from the file charmap. The file has the following
format:

  1. Each line represents a translation entry, except for blank lines
     and comment lines, which are ignored.

  2. Any amount of whitespace (space or tab) may precede the start of
     an entry.

  3. Comment lines begin with #. Everything on the same line is
     ignored.

  4. Each entry consists of the Unicode codepoint of the character to
     translate, in hexadecimal, followed by exactly one space or tab,
     followed by the translation string, up to the end of the line.

  5. The translation string is taken literally, including any leading
     and trailing spaces (except the delimiter between the codepoint
     and the translation string), and all types of characters. The
     newline at the end is not included.

The above format is intentionally restrictive, to keep utf8trans
simple. If an XML-based format is desired, the xmlcharmap2utf8trans
script that comes with the docbook2X distribution converts character
maps in XSLT 2.0 format to the utf8trans format.
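
The format rules above can be sketched as a small Python parser. This is an illustration under stated assumptions, not the parser utf8trans uses: the names parse_charmap and SAMPLE_CHARMAP are hypothetical, the codepoint is assumed to be written as bare hexadecimal digits, leading whitespace is assumed to be allowed before # as well as before entries, and an entry with no translation string is assumed to map to the empty string.

```python
# Hypothetical sample entries, written only to exercise the rules.
SAMPLE_CHARMAP = (
    "# comment line, ignored\n"
    "\n"
    "00A9 (c)\n"
    "\t2014\t--\n"
    "2026  ... \n"
)


def parse_charmap(text):
    """Parse character map text into {codepoint: translation string}."""
    table = {}
    for raw in text.split("\n"):
        # Rule 2: any amount of leading space or tab is skipped.
        line = raw.lstrip(" \t")
        # Rules 1 and 3: blank lines and comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        # Rule 4: hexadecimal codepoint, then exactly one space or tab.
        end = 0
        while end < len(line) and line[end] not in " \t":
            end += 1
        codepoint = int(line[:end], 16)
        # Rule 5: everything after the single delimiter is taken literally,
        # including further leading and trailing spaces.
        table[codepoint] = line[end + 1:]
    return table


table = parse_charmap(SAMPLE_CHARMAP)
print(table)
```

In the third entry only the first space is the delimiter, so the translation string keeps one leading and one trailing space. The resulting dictionary can be passed directly to str.translate.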

Limitations

  • utf8trans does not work with binary files, because malformed
    UTF-8 sequences in the input are substituted with U+FFFD
    characters. However, null characters in the input are handled
    correctly. This limitation may be removed in the future.

  • There is no way to include a newline or null in the translation
    string.
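
The first limitation can be seen with any decoder that replaces malformed sequences. The sketch below uses Python's own UTF-8 decoder with errors="replace" as a stand-in; it is an assumption that utf8trans substitutes at the same positions, and the name decode_with_replacement is hypothetical.

```python
def decode_with_replacement(data):
    """Decode bytes as UTF-8, turning malformed sequences into U+FFFD."""
    return data.decode("utf-8", errors="replace")


# A null byte, two bytes that can never occur in UTF-8, then valid UTF-8.
binary_input = b"ok\x00\xff\xfe caf\xc3\xa9"
decoded = decode_with_replacement(binary_input)

# The null character survives, but the invalid bytes are now U+FFFD,
# so encoding the text again does not give back the original bytes.
print(repr(decoded))
print(decoded.encode("utf-8") == binary_input)
```

Because the replacement is lossy, a file that is not valid UTF-8 cannot be passed through unchanged, even when no character map entry applies to it.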

<a href="http://docbook2x.sourceforge.net/" title=
"docbook2X: Home page">docbook2X home page</a>

</body>
</html>