<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for Linux/x86 (vers 1 September 2005), see www.w3.org" />
<meta http-equiv="Content-Type" content=
"text/html; charset=us-ascii" />
<title>docbook2X: utf8trans</title>
<link rel="stylesheet" href="docbook2X.css" type="text/css" />
<link rev="made" href="mailto:stevecheng@users.sourceforge.net" />
<meta name="generator" content="DocBook XSL Stylesheets V1.68.1" />
<link rel="start" href="docbook2X.html" title=
"docbook2X: Documentation Table of Contents" />
<link rel="up" href="charsets.html" title=
"docbook2X: Character set conversion" />
<link rel="prev" href="charsets.html" title=
"docbook2X: Character set conversion" />
<link rel="next" href="faq.html" title="docbook2X: FAQ" />
</head>
<body>
<div class="navheader">
<table width="100%" summary="Navigation header">
<tr>
<th colspan="3" align="center"><span><strong class=
"command">utf8trans</strong></span></th>
</tr>
<tr>
<td width="20%" align="left"><a accesskey="p" href=
"charsets.html">&lt;&lt; Previous</a>&nbsp;</td>
<th width="60%" align="center">Character set conversion</th>
<td width="20%" align="right">&nbsp;<a accesskey="n" href=
"faq.html">Next &gt;&gt;</a></td>
</tr>
</table>
<hr /></div>
<div class="refentry" lang="en" xml:lang="en"><a id="utf8trans"
name="utf8trans"></a>
<div class="titlepage"></div>
<a id="id2538852" class="indexterm" name="id2538852"></a><a id=
"id2538859" class="indexterm" name="id2538859"></a><a id=
"id2538866" class="indexterm" name="id2538866"></a><a id=
"id2538873" class="indexterm" name="id2538873"></a><a id=
"id2538883" class="indexterm" name="id2538883"></a><a id=
"id2538890" class="indexterm" name="id2538890"></a>
<div class="refnamediv">
<h2>Name</h2>
<p><span><strong class="command">utf8trans</strong></span> &mdash;
Transliterate UTF-8 characters according to a table</p>
</div>
<div class="refsynopsisdiv">
<h2>Synopsis</h2>
<div class="cmdsynopsis">
<p><code class="command">utf8trans</code> <em class=
"replaceable"><code>charmap</code></em> [<em class=
"replaceable"><code>file</code></em>...]</p>
</div>
</div>
<div class="refsect1" lang="en" xml:lang="en"><a id="id2538961"
name="id2538961"></a>
<h2>Description</h2>
<a id="id2538967" class="indexterm" name="id2538967"></a>
<p><span><strong class="command">utf8trans</strong></span>
transliterates characters in the specified files (or in standard
input, if no files are specified) and writes the result to standard
output. All input and output is in the UTF-8 encoding.</p>
<p>This program is usually used to render characters in Unicode
text files as markup escapes or ASCII transliterations. (It is
not intended for general charset conversions.) It provides
functionality similar to the character maps in XSLT 2.0 (XML
Stylesheet Language &ndash; Transformations, version 2.0).</p>
</div>
<div class="refsect1" lang="en" xml:lang="en"><a id="id2539001"
name="id2539001"></a>
<h2>Options</h2>
<div class="variablelist">
<dl>
<dt><span class="term"><code class="option">-m</code>,</span>
<span class="term"><code class="option">--modify</code></span></dt>
<dd>
<p>Modifies the given files in-place with their transliterated
output, instead of sending it to standard output.</p>
<p>This option is useful for efficient transliteration of many
files at once.</p>
</dd>
<dt><span class="term"><code class=
"option">--help</code></span></dt>
<dd>
<p>Show brief usage information and exit.</p>
</dd>
<dt><span class="term"><code class=
"option">--version</code></span></dt>
<dd>
<p>Show version and exit.</p>
</dd>
</dl>
</div>
</div>
<div class="refsect1" lang="en" xml:lang="en"><a id="id2539071"
name="id2539071"></a>
<h2>Usage</h2>
<p>The translation follows the rules in the
&ldquo;<span class="quote">character map</span>&rdquo;, which is
read from the file <em class="replaceable"><code>charmap</code></em>.
The character map has the following format:</p>
<div class="orderedlist">
<ol type="1">
<li>
<p>Each line represents a translation entry, except for blank lines
and comment lines, which are ignored.</p>
</li>
<li>
<p>Any amount of whitespace (space or tab) may precede the start of
an entry.</p>
</li>
<li>
<p>Comment lines begin with <code class="literal">#</code>.
Everything on the same line is ignored.</p>
</li>
<li>
<p>Each entry consists of the Unicode codepoint of the character to
translate, in hexadecimal, followed by <span class=
"emphasis"><em>one</em></span> space or tab, followed by the
translation string, up to the end of the line.</p>
</li>
<li>
<p>The translation string is taken literally, including any leading
and trailing spaces (except the delimiter between the codepoint and
the translation string), and all types of characters. The newline
at the end is not included.</p>
</li>
</ol>
</div>
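<p>The rules listed above can be sketched in Python as follows.
This is only an illustration of the file format, not the
implementation used by
<span><strong class="command">utf8trans</strong></span>. The names
<code class="literal">parse_charmap</code> and
<code class="literal">transliterate</code> are hypothetical, and two
details are assumptions: codepoints are taken to be bare hexadecimal
digits, and a line without a delimiter is rejected.</p>
<pre class="programlisting">
import re

# Hypothetical helper names; this mirrors the documented charmap rules only.
# Assumption: the codepoint is written as bare hexadecimal digits.
ENTRY = re.compile(r"([0-9A-Fa-f]+)[ \t](.*)", re.DOTALL)

def parse_charmap(lines):
    """Build a {character: translation string} table from charmap lines."""
    table = {}
    for raw in lines:
        # The newline at the end of a line is never part of an entry, and
        # any amount of leading space or tab may precede the entry.
        entry = raw.rstrip("\n").lstrip(" \t")
        if entry == "" or entry.startswith("#"):
            continue  # blank lines and comment lines are ignored
        match = ENTRY.fullmatch(entry)
        if match is None:
            # Assumption: entries without a delimiter are treated as errors.
            raise ValueError("malformed charmap entry: " + repr(raw))
        # Exactly one space or tab is the delimiter; everything after it,
        # including further spaces, is the literal translation string.
        table[chr(int(match.group(1), 16))] = match.group(2)
    return table

def transliterate(text, table):
    """Replace each mapped character; leave all other characters alone."""
    return "".join(table.get(ch, ch) for ch in text)

sample = [
    "# comment lines and blank lines are skipped\n",
    "\n",
    "A9 (c)\n",      # U+00A9 COPYRIGHT SIGN becomes (c)
    "  2014\t--\n",  # leading whitespace is allowed; tab as the delimiter
    "A0  \n",        # U+00A0 maps to a single space (the second space)
]
table = parse_charmap(sample)
print(transliterate("\u00a9 2005\u2014now", table))  # prints: (c) 2005--now
</pre>
<p>The last sample entry shows why only one delimiter character is
consumed: the second space on that line is the translation string
itself.</p>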
<p>This charmap format is intentionally restrictive, to keep
<span><strong class="command">utf8trans</strong></span> simple. If
an XML-based format is desired, the docbook2X distribution includes a
<code class="filename">xmlcharmap2utf8trans</code> script that
converts character maps in XSLT 2.0
format to the <span><strong class=
"command">utf8trans</strong></span> format.</p>
</div>
<div class="refsect1" lang="en" xml:lang="en"><a id="id2539164"
name="id2539164"></a>
<h2>Limitations</h2>
<div class="itemizedlist">
<ul>
<li>
<p><span><strong class="command">utf8trans</strong></span> does not
work with binary files, because malformed UTF-8 sequences in the
input are substituted with U+FFFD characters. However, null
characters in the input are handled correctly. This limitation may
be removed in the future.</p>
</li>
<li>
<p>There is no way to include a newline or null in the substitution
string.</p>
</li>
</ul>
</div>
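<p>The first limitation can be illustrated with Python's own UTF-8
decoder, used here purely as an analogy and not as a description of
how <span><strong class="command">utf8trans</strong></span> is
implemented. Once a malformed byte has been replaced by U+FFFD, the
original byte cannot be recovered, so binary data does not survive
the round trip, while a null character passes through unchanged.</p>
<pre class="programlisting">
# Analogy only: Python's decoder stands in for utf8trans here.
binary = b"\xff\x00A"  # 0xFF can never occur in well-formed UTF-8

# Malformed input is replaced by U+FFFD; the null character is kept.
text = binary.decode("utf-8", errors="replace")

# Encoding again yields the bytes of U+FFFD, not the original 0xFF.
roundtrip = text.encode("utf-8")

print(repr(text))
print(roundtrip == binary)  # prints: False
</pre>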
</div>
</div>
<div class="navfooter">
<hr />
<table width="100%" summary="Navigation footer">
<tr>
<td width="40%" align="left"><a accesskey="p" href=
"charsets.html">&lt;&lt; Previous</a>&nbsp;</td>
<td width="20%" align="center"><a accesskey="u" href=
"charsets.html">Up</a></td>
<td width="40%" align="right">&nbsp;<a accesskey="n" href=
"faq.html">Next &gt;&gt;</a></td>
</tr>
<tr>
<td width="40%" align="left" valign="top">Character set
conversion&nbsp;</td>
<td width="20%" align="center"><a accesskey="h" href=
"docbook2X.html">Table of Contents</a></td>
<td width="40%" align="right" valign="top">&nbsp;FAQ</td>
</tr>
</table>
</div>
<p class="footer-homepage"><a href=
"http://docbook2x.sourceforge.net/" title=
"docbook2X: Home page">docbook2X home page</a></p>
</body>
</html>