Lynx CHARTRANS

 Features (in addition to those which Lynx 2.7.1 already has):

 - Can (attempt to) translate from any document charset to any display
   character set, *IF* the document charset is known by a translation
   table (compiled in at installation).

 - New method to define character sets: the same definition is used for
   the input charset as well as for the display character set, with
   translation tables compiled in from separate files (one per charset).
   One table is designated as default and can be used for fallback
   translation to 7-bit replacements for display.

 - New method for specifying translations of SGML entities.

 - Unicode (UTF-8) support: can (attempt to) decode and translate UTF-8 to
   the display character set, or pass UTF-8 through to the display (if the
   terminal or console understands UTF-8).  [Raw display of UTF-8 has only
   been tested with Slang so far; it does not always position everything
   correctly on screen.]

 - Support for CHARSET attribute on A tag (and sometimes LINK), as in HTML
   i18n RFC 2070 and W3C HTML 4.0 drafts.  A link can suggest the target's
   charset in this way.

 - Support for ACCEPT-CHARSET attribute of FORM tags.

 - EXPERIMENTAL, currently enabled only for Linux console:
   can (attempt to) automatically switch terminal mode and load new
   code pages on change of display character set.

 - Some minor changes: invalid characters were sometimes displayed in a hex
   notation Uxxxx (which helps debugging, and which I regard as at least no
   worse than showing the wrong char without warning); they are now not
   displayed at all, to reduce garbage.
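
The decode-then-translate path described in the feature list can be
sketched roughly as follows.  This is only an illustration, not the actual
chartrans code: the struct names, the two tiny tables and the function
names (utf8_decode, translate_utf8) are all hypothetical.  It decodes
UTF-8, looks each code point up in a display table, falls back to a 7-bit
replacement from a "default" table, and displays nothing for characters
found in neither.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TABLE_LEN(t) (sizeof(t) / sizeof((t)[0]))

/* Hypothetical table entries; the real tables are generated from .tab files. */
struct uni_map  { unsigned long ucs; unsigned char byte; };
struct uni_repl { unsigned long ucs; const char *ascii; };

/* Illustrative display table (a two-entry slice of an ISO-8859-1-like set). */
static const struct uni_map display_tab[] = {
    { 0x00E9UL, 0xE9 },   /* e with acute */
    { 0x00FCUL, 0xFC },   /* u with diaeresis */
};

/* Illustrative default table with 7-bit replacements. */
static const struct uni_repl default_tab[] = {
    { 0x2014UL, "--" },   /* em dash */
    { 0x0152UL, "OE" },   /* OE ligature */
};

/* Decode one UTF-8 sequence; return bytes consumed, or 0 if malformed. */
static size_t utf8_decode(const unsigned char *s, size_t n, unsigned long *out)
{
    unsigned long cp, min = 0;
    size_t len, i;

    if (n == 0) return 0;
    if (s[0] < 0x80) { *out = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; min = 0x10000; }
    else return 0;                       /* stray continuation or bad lead */
    if (n < len) return 0;               /* truncated sequence */
    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    if (cp < min) return 0;              /* overlong encoding */
    if (cp > 0x10FFFFUL || (cp >= 0xD800UL && cp <= 0xDFFFUL)) return 0;
    *out = cp;
    return len;
}

/* Translate NUL-terminated UTF-8 into display bytes; returns bytes written. */
static size_t translate_utf8(const char *src, char *dst, size_t cap)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = strlen(src), used = 0;

    assert(cap > 0);
    while (n > 0) {
        unsigned long cp = 0;
        char one[2] = { 0, 0 };
        const char *rep = "";            /* default: display nothing */
        size_t i, rlen, len = utf8_decode(s, n, &cp);

        if (len == 0) {
            len = 1;                     /* malformed input: skip one byte */
        } else if (cp < 0x80) {
            one[0] = (char)cp;           /* US-ASCII passes through */
            rep = one;
        } else {
            for (i = 0; i < TABLE_LEN(display_tab); i++)
                if (display_tab[i].ucs == cp) {
                    one[0] = (char)display_tab[i].byte;
                    rep = one;
                }
            /* Not in the display set: try the 7-bit fallback table. */
            for (i = 0; rep[0] == '\0' && i < TABLE_LEN(default_tab); i++)
                if (default_tab[i].ucs == cp)
                    rep = default_tab[i].ascii;
        }
        rlen = strlen(rep);
        if (used + rlen >= cap)
            break;                       /* output buffer full */
        memcpy(dst + used, rep, rlen);
        used += rlen;
        s += len;
        n -= len;
    }
    dst[used] = '\0';
    return used;
}
```

With these example tables, translate_utf8("a\xE2\x80\x94" "b", buf,
sizeof buf) leaves "a--b" in buf, because U+2014 is missing from the
display table but has a 7-bit replacement in the default table.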

Additions/changes to user interface:

 - Many new Display Character Sets are available on the O)ptions screen.
   (One can use arrow keys, HOME, END etc. for cycling through the list
   or use selection from popup box, as for other options.)

 - New command line flags:
   -assume_charset=...        assume this as charset for documents that
                              don't specify a charset parameter in HTTP
                              headers
   -assume_local_charset=...  assume this as charset of local files
   -assume_unrec_charset=...  assume this as charset in case a charset
                              parameter is not recognized
   These are also available, with documentation, as ASSUME_CHARSET etc.
   in lynx.cfg.
   In "Advanced User" mode, ASSUME_CHARSET can be changed during a session
   from the Options Screen.

 - The "Raw" toggle (from -raw flag, '@' key, or Options screen)
   o  toggles the assumption "Default remote charset is same as Display
      Character Set" on or off.
      Toggling of the assumed charset is between Display Character Set and
      the specified ASSUME_CHARSET or, if they are the same, between the
      specified ASSUME_CHARSET and ISO-8859-1.
   o  The default for raw mode now depends on the Display Character Set as
      well as on the specified ASSUME_CHARSET value.
   o  Should work as before for CJK charsets (turning CJK-mode on or off).
   o  If the effective ASSUME_CHARSET and the Display Character Set are
      unchanged from the ISO-8859-1 default, toggling "Raw" may have some
      additional effect for characters that can't be translated.
   (Try the "Transparent" Display Character Set for more "rawness".)
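
The interaction between the -assume_* settings and the "Raw" toggle can be
summarized in a small sketch.  This is one reading of the rules listed
above, not Lynx's actual code: struct assume and the functions
pick_charset and assumed_remote_charset are hypothetical names.  In
particular, mapping "raw on" to "use the Display Character Set" when the
two settings are the same is an assumption.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical holder for the three -assume_* settings. */
struct assume {
    const char *charset;        /* -assume_charset */
    const char *local_charset;  /* -assume_local_charset */
    const char *unrec_charset;  /* -assume_unrec_charset */
};

/* Case-insensitive comparison of two charset names. */
static int same_charset(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;
}

/*
 * Which charset to use for a document.  `declared` is the charset
 * parameter from the HTTP headers (NULL if absent); `recognized` says
 * whether a translation table is known for it.
 */
static const char *pick_charset(const char *declared, int recognized,
                                int is_local, const struct assume *a)
{
    if (declared == NULL)
        return is_local ? a->local_charset : a->charset;
    return recognized ? declared : a->unrec_charset;
}

/*
 * Assumed charset for remote documents, depending on the "Raw" state.
 * Raw on: same as the Display Character Set.  Raw off: ASSUME_CHARSET,
 * or ISO-8859-1 when ASSUME_CHARSET equals the Display Character Set.
 */
static const char *assumed_remote_charset(const char *display_cs,
                                          const char *assume_cs, int raw)
{
    if (!same_charset(display_cs, assume_cs))
        return raw ? display_cs : assume_cs;
    return raw ? assume_cs : "iso-8859-1";
}
```

Note that when both settings are left at ISO-8859-1 the function returns
the same charset for either state, which matches the remark above that
"Raw" then only affects characters that cannot be translated.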


Requirements:  same as for Lynx in general :)

The chartrans code is now merged with Wayne Buttle's changes for
32-bit MS Windows and DOS/DJGPP, with Thomas Dickey's and Jim Spath's
emerging auto-configure mechanism, and with BUGFIXES from Foteos
Macrides.  See the accompanying file CHANGES for the current
status.


A warning:
In some cases undisplayable bytes may still get sent to the terminal,
where they are then interpreted as control chars.  There is no protection
against this if strange things are defined in the table files.


HOW TO INSTALL:

(4) Before compiling:

    Check top level makefile or Makefile and userdefs.h as usual.

    NOTE that there is a new "#define" in userdefs.h for MAX_CHARSETS
    near the end (in "Section 3.").

(5) Building Lynx:

    Compiling the chartrans code is now integrated into the normal
    installation procedures for UNIX (configure script) and other
    platforms.

    What's supposed to happen (in addition to the usual things when
    building Lynx): in the new subdirectory src/chrtrans, make should
    first compile the auxiliary program `makeuctb', then invoke that
    program to create xxxxx_yyy.h files from the provided xxxxx_yyy.tab
    translation table files.  (See README.* files in src/chrtrans for
    more info.)

    If all goes well, just invoking make from the top-level Lynx dir
    as usual should do everything automatically.  If not, the makefiles
    may need some tweaking... or:

(6) Some things to look at if compilation fails:

    In src/chrtrans/UCkd.h there is a typedef for an unsigned 16-bit
    numeric type which may need to be changed for your system.
    See the comment near the top of that file.

    For recompiling Lynx, `make clean' should not be necessary if only
    files in src/chrtrans have been changed.  On the other hand, a
    top-level `make clean' may not propagate to the src/chrtrans
    directory (depending on how things are going with auto-config), so
    you may have to cd to that directory and run `make clean' there to
    really clean up.
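
If the 16-bit typedef mentioned in (6) has to be adjusted, a compile-time
check can confirm that the chosen type really is 16 bits wide.  The sketch
below is only an illustration: the names my_u16, my_u16_must_be_16_bits
and u16_wrap are hypothetical and are not the ones used in
src/chrtrans/UCkd.h.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for the typedef in UCkd.h; change the base type
 * here if `unsigned short' is not 16 bits wide on your system. */
typedef unsigned short my_u16;

/* Compile-time check: the array size is -1 (a compile error) unless
 * my_u16 is exactly 16 bits wide. */
typedef char my_u16_must_be_16_bits[(sizeof(my_u16) * CHAR_BIT == 16) ? 1 : -1];

/* Store a value in the 16-bit type; values above 0xFFFF wrap modulo 65536. */
static my_u16 u16_wrap(unsigned long v)
{
    return (my_u16)v;
}
```

On a system where the check fails, the compiler stops at the second
typedef instead of producing tables with the wrong element size.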

(7) To customize (add/change translation tables etc.):

     See README.* files in src/chrtrans.
     Make the necessary changes there, then recompile.
     (A general `make clean' should not be necessary, but make sure
     the ...uni.h file in src/chrtrans gets regenerated.)

     Note that definitions of new character entities (if, e.g., you want
     Lynx to recognize &Zcaron;) are not covered by these table files;
     they have to be listed in entities.h.

     _If you are on a Linux system_ and using Lynx on the console (i.e.
     not xterm, not a dialup *into* the Linux box), you can compile
     with -DEXP_CHARTRANS_AUTOSWITCH.  This is very useful for testing
     the various Display Character Sets; Lynx will try to automatically
     change the console state.  You need to have the Linux kbd package
     installed, with a working `setfont' command executable by the user,
     and the right font files - check the source in src/UCAuto.c for
     the files used and/or to change them!
     NOTE that with this enabled,
     - Lynx currently will not clean up the console state at exit;
       it will probably be left as set for the last Display Character
       Set you used.
     - Loading a font is global across _all_ virtual text consoles, so
       using Lynx (compiled with this flag) may change the appearance of
       text on other consoles (if that text contains characters
       beyond US-ASCII).

(8) Some suggested Web pages for testing:

    <URL:  http://www.tezcat.com/~kweide/lynx-chartrans/test/>

    <URL:  http://www.isoc.org:8080/>,
      especially
    <URL:  http://www.isoc.org:8080/liste_ml.htm>.

    <URL:  http://www.accentsoft.com/un/un-all.htm>

(9) Please report bugs, unexpected behavior, etc.
    to <lynx-dev@nongnu.org>.

    Suggestions for improvement would be welcome, as well as
    contributed translation tables (for stuff that is not available
    at ftp://dkuug.dk or ftp://ftp.unicode.org).

KW  1997-11-06