#============================================================================
# Enca v1.19 (2016-09-05) guess and convert encoding of text files
# Copyright (C) 2000-2003 David Necas (Yeti) <yeti@physics.muni.cz>
# Copyright (C) 2009-2016 Michal Cihar <michal@cihar.com>
#============================================================================
List of user-visible changes in Enca
A more detailed log can be obtained from older changelogs or git log.
Legend: + new feature
* change of behaviour (including disappearing of a feature)
- bugfix
enca-1.19-dev
- fix possible memory leak
- make utf-8 detection work even on one character
enca-1.18 2016-01-07
- fix installation of devhelp documentation
enca-1.17 2016-01-04
- Fixed conversion of GB2312 encoding with iconv
- Fixed iconv conversion on OSX
- Documentation improvements
- Fixed execution of external converters with ACLs
- Improved test coverage to 80%
enca-1.16 2014-10-20
- Fixed typo in Belarusian language name
- Added aliases for Chinese and Yugoslavian languages
enca-1.15 2013-09-30
- Documentation improvement
- Development moved to GitHub
- Do not use deprecated autoconf macros
enca-1.14 2012-09-11
- Allow standard names for Belarusian and Slovenian languages, thanks
to Branislav Geržo for the suggestion.
- Reset strictness when the check buffer is smaller than the file size,
thanks to Sam Liao.
- Fixed typos in man page, thanks to A. Costa.
enca-1.13 2010-02-09
- Reverse usage of temp file while converting using recode to prevent
file truncation (bug #1135).
enca-1.12 2009-10-29
- Fixes some minor memory leaks.
- Fixes little problems in autoconf scripts.
enca-1.11 2009-09-25
- Dropped scanf configure test which is not used at all.
- Fixes some wrong format strings.
enca-1.10 2009-08-25
+ Enca is back alive or at least in maintenance mode.
* Enca now lives in git repository, see <http://gitorious.org/enca>.
- Add missing charset koi8u to Belarusian language.
- Fixed some typos in program and documentation.
enca-1.9 2005-12-18
+ support for HZ encoding
* Big5 and GBK detection improved
- enca.spec no longer installs docs to world-unreadable directory
enca-1.8 2005-11-24
+ Chinese (Big5 and GBK) support (thanks to Zuxy)
* deb/ subdirectory is gone as there is finally an Enca package in Debian
(thanks to Michal Cihar)
- manual page clean-up (thanks to Michal Cihar)
enca-1.7 2005-02-27
+ new name type: preferred MIME name (option -m)
- broken iconv detection on some systems was fixed
enca-1.6 2004-09-01
* English language names (--list=languages, enca_language_english_name())
were changed to lowercase to match common locale aliases
- Win32, i.e. MinGW and Cygwin, build problems were fixed
enca-1.5 2004-05-30
- crash on impossible recovery after iconv failure in pipe was fixed
- rpm building problems on Mandrake Linux were fixed
enca-1.4 2004-05-12
- dependency of guessing API on locales (via ctype functions) was fixed
- --help text generation failure on some systems was fixed
enca-1.3 2003-12-24
+ [libenca] it's possible to get analyser option values, not just set them
* a good BOM (byte order mark) increases the chance of being recognized for
UCS-4 and UTF-8 too
* external converter wrappers were moved from bin to libexec and the b-
prefix was removed (though it still works)
* external converters are no longer searched for in PATH, nonstandard ones
have to be specified with full path
enca-1.2 2003-11-26
- fixed segfault in language detection for some locale setups
enca-1.1 2003-11-17
- fixed losing data at the end of file when using external converters in a
pipe (and maybe in other situations)
- [libenca] enca_analyser_free() not freeing analyser completely was fixed
enca-1.0 2003-11-06
* deprecated options -T, -R, -S, -u, -U, -m, and -M were finally removed
* default HTML API docs installation path changed to the new gtk-doc style
(DATADIR/gtk-doc/html/enca)
* debian/ subdir moved to deb/ to allow official deb creation w/o too much
hassle
enca-0.99.4 2003-07-15
- several race conditions in librecode and iconv interfaces were fixed
- temporary file names are much less predictable now
enca-0.99.3 2003-06-30
* Debian package is back from death
* failure to find external converter is now fatal
- fixed build problems on FreeBSD (and probably other Unices)
- libiconv is not used for `conversion to ASCII' since it never does the
Right Thing, whatever it is
- when conversion with libiconv fails, the file should now survive intact
- fixed build problems on systems w/o libiconv (hopefully)
- fixed distclean and uninstall targets to really clean and uninstall
everything
- fixed builds with separate source (read-only) and build directories
- fixed builds with --without-libiconv and --without-librecode on GNU/Linux
- external converter is not checked when it's not going to be used
enca-0.99.2 2003-06-25
+ EOL type is used to decide ambiguous cases, e.g. CP1250 is reported
instead of ISO-8859-2/CRLF
* --list languages by default prints English names, instead of ISO-639a
codes, use -e or -r to get the old listing
* if LC_CTYPE is something like en_US, more locale categories are examined
to detect the language
* cork charset was modified to contain \n, \r and \t in the same places as
ASCII
* some heuristics tuning
enca-0.99.1 2003-06-22
+ libenca pkg-config support
* all libenca tuning parameters (-T, -R, -S, -u, -U, -m, and -M) were
marked deprecated and are noop, Enca should DWIM
* ambiguity is now always OK when the sample has the same meaning in all the
charsets
* deprecated `built-in-encodings' and `encodings' lists were removed
* PAGER feature was removed
- exchanged `latvian' and `lithuanian' language names were fixed (`lv' and
`lt' were always OK)
- missing tests for the new languages were added to the test suite
enca-0.99.0 2003-06-14
+ added some support for: Bulgarian, Croatian, Estonian, Hungarian, Latvian,
Lithuanian, Slovene
+ a new algorithm for 8bit-dense languages (cyrillics), the old one is used
as a fallback
* removed support for non-transitive iconv (such a thing should not exist)
* auxiliary tools in data are no longer built in regular builds,
use --enable-maintainer-mode to rebuild them, create dists, etc.
- fixed iconv interface surface check pickier than iconv itself inhibiting
some otherwise possible conversions
- fixed u+x permissions on temporary files (from 0.10.7)
- fixed not deleting temporary files in iconv interface
- fixed broken iconv interface behaviour in pipes
- fixed iconvcap misdetecting Latin5 as ISO-8859-5
- fixed casual `make distclean' failures
enca-0.10.7 2003-01-28
- fixed interchanged iconv and cstocs encoding names
- corrected(?) librecode surface interaction
- fixed a temporary file creation race condition
* added tex and utf8 to cstocs (names and b-cstocs)
enca-0.10.6 2002-10-22
+ enconv uses DEFAULT_CHARSET variable, exactly as recode
- ENCAOPT works everywhere, albeit imperfectly
- options -P and -p no longer imply -M too
- ambiguous mode (-M) works again
- pager is run so that help text doesn't disappear
- standard input is printed as STDIN with -d, not as null
- make check works again
- it compiles without recode again
enca-0.10.5 2002-10-13
+ UTF-8 recognition in binary and otherwise messy files
+ detection of double-encoding from some 8bit charset to UTF-8
+ Cork encoding conversion
* librecode interaction was (hopefully) improved
- fixed some build-time problems
enca-0.10.4 2002-10-10
+ added Cork encoding support for Czech, Slovak and Polish
- empty files are now considered convertible to any encoding
- removed the so-called faster (in fact slower) I/O
- fixed some more compile-time search path issues
enca-0.10.3 2002-09-22
* added support for perl umap as external converter
- fixed external converter wrappers to work with standard sh
- fixed some compile-time library search path issues
enca-0.10.2 2002-09-15
+ target charset is automatically obtained from locales when called as
enconv, new options --guess, --auto-convert
+ English language names can be used instead of ISO-639 codes everywhere
- cs_SK and ru_UA locales are properly recognised as Slovak and Ukrainian
enca-0.10.1 2002-08-29
+ faster I/O
* external converters can be disabled at build time
- `-' is accepted for standard input
- fixed broken built-in converter
- fixed crashing on an unknown language
- trivial (identity) conversions are not performed any more
- help is now printed when input is a terminal and no argument specified
- changed braindamaged <STDIN>, <STDOUT> to STDIN, STDOUT in messages
- various small fixes and build-time improvements
enca-0.10.0 2002-08-26
+ added support for Ukrainian (CP1251, IBM855, ISO-8859-5, KOI8-U, maccyr,
CP1125), Belarusian (CP1251, IBM866, ISO-8859-5, KOI8-UNI, maccyr,
IBM855) and Polish (ISO-8859-2, ISO-8859-13, ISO-8859-16, Baltic, macce,
IBM852, CP1250)
+ Enca library introduced
* dropped native Debian package
* --details no longer prints guessing details (now is mostly like --human)
* --list=encodings, --list=built-in-encodings corrected to --list=charsets,
--list=built-in-charsets (old names supported with a warning)
* improved Czech and Slovak charsets detection
enca-0.9.4: 2002-03-03
- built-in converter didn't convert more than first 64kB of a file
enca-0.9.3: 2001-07-16
+ a native Debian package
- fixed random reporting of nonsense results
- fixed self-contradictory --details output when file was quoted-printable
encoded
- fixed poor performance on non-GNU/Linux
- made pager less intrusive (instead of intrusive `less' ;-)
- --list=encodings prints only `known' encodings
- fixed several compile-time/portability problems
enca-0.9.2: 2001-07-13
* --help and --license are displayed through pager (when possible)
- fixed broken language hooks--they were never activated (from 0.9.1)
- fixed reporting ASCII when a 7bit encoding was detected
- fixed boundary-case behaviour when recovering from librecode failures
enca-0.9.1: 2001-06-25
+ support for Macintosh Cyrillic, including conversion
+ support for unusual UCS-4 byte orders (3412 and 2143)
+ new option --license printing full enca license
* exit codes now make sense (0, 1, 2; where 2 means serious troubles)
- temporary files are no longer world-readable
enca-0.9.0: 2001-03-26
Serious incompatibilities:
* -E and -C option letters exchanged (much better mnemonics)
* converter wrappers renamed to b-cstocs and b-recode
* finding only 7bit ASCII is no longer considered failure
* need to use --language to set language (sometimes)
* dull converter behaviour no longer supported, -x syntax changed
* option -g removed (try --name=aliases)
* option -c changed to --list=converters, listing format changed
* option -l changed to --list=encodings, listing format changed
* converter names are no longer case insensitive
* no longer uses cstocs names as canonical
* external converters are called with Enca's names, not cstocs's
Other changes:
+ support for slovak and russian (and `none') language
+ support for CP1251, IBM866, ISO-8859-5 and KOI8-R, including conversion
+ UCS-2, UCS-4, UTF-8, UTF-7 and LaTeX encoding recognition
+ many more encoding aliases accepted
+ long `GNU style' command line options
+ new output types: --enca-name, --iconv-name
+ output type --name=WORD allowing selection of output type by name
+ ENCAOPT environment variable
+ language detection from locales
+ support for surfaces (experimental)
+ new option --list printing various listings
+ new converter wrapper b-map (for perl `map')
+ new option -m to reset -M back
+ new language filters
+ new options -u and -U to control multibyte encoding checks
+ included [generated] enca.spec into the tarball to allow `rpm -tb'
* -d output improved
* read limit changed to 16MB
* librecode is now run with flags diacritics_only and ascii_graphics
- fixed broken -P options
- fixed several build problems on non-GNU/Linux systems
- fixed some missing and wrong characters in Unicode data
- temporary copy of damaged original file is not deleted when rescue fails
enca-0.8.x: Since features planned for 0.8 and 0.9 happened to be developed
simultaneously, this version number has been skipped.
enca-0.7.7: 2001-01-01
+ ability to use UNIX98 iconv conversion functions
+ the word `none' can be used as -E parameter causing clearing of converter
list
- fixed disarranged help text, misspelled word `European' in macce long
name, obsolete statements in manual page and other stuff of this kind
enca-0.7.6: 2000-11-20
+ any converter combination/order can be now specified with -E, old -E
meaning is no longer valid
+ new option -c (list all valid converter names)
* cork encoding not supported anymore
* better verbosity
* `/' is added to recode recoding requests thus partially solving the
surface problem---surface never changes
* some errors like specifying invalid value of threshold are no longer fatal,
the bad values are ignored instead
* handling of some exotic characters in built-in converter slightly changed
- fixed several fatal bugs regarding stdin to stdout conversion
- stdin is copied to stdout in case of failure whenever possible/applicable
enca-0.7.5: 2000-10-25
* license changed to GNU GPL Version 2 (i.e. license version is explicitly
specified)
* prints error message when conversion is impossible
* binary data filter improved/changed
- falls back to external converter when GNU recode library cannot convert
due to erroneous request
- '' no longer causes enca to read from stdin
- tries to restore files damaged by GNU recode library
enca-0.7.4: 2000-10-12
+ box-drawing characters are (carefully) filtered out when guessing
- fixed intermixed behaviour in SMS/nonSMS modes
enca-0.7.3: 2000-10-09
+ blocks of probably binary data are filtered out when guessing
* standard input is copied to standard output when its encoding is unknown
- fixed reading only 4096 bytes from pipe (from 0.7.1)
enca-0.7.2: has never been released
+ GNU recode recoding chains made possible by starting -x (convert) parameter
with `..'
+ second best guess is marked with `-' in -d (print details) output
enca-0.7.1: 2000-10-02
* in case of nonfatal i/o failure enca continues processing remaining files
enca-0.7.0: 2000-09-26
+ standard input to standard output conversion
+ short message mode -M
+ ability to use GNU recode library
+ new output type -r (encoding name after RFC1345)
+ ability to convert cork internally
+ new external converter brecode (recode wrapper)
+ new output type -g (list of aliases)
+ new option -V (verbose)
* -x (convert) parameters syntax changed to in_enc..out_enc (old syntax still
supported, will be removed in 0.8.x)
* option -e (disable external) no longer supported, empty string as -C
(external converter) parameter can be used instead
* encoding names specified as -x (convert) parameters are case insensitive
* ascii is not considered unknown encoding (i.e. failure) so enca returns 0
* -d (print details) output improved/changed/updated
* -p (prefix result with file name) no longer prints conversion details
* by default result is prefixed by file name when enca is run on more than
one file
enca-0.6.2: 2000-08-17
+ help texts (-h and -v) made usable (thanx to Halef)
enca-0.6.1: 2000-08-15
- tarball bugfix
enca-0.6.0: 2000-07-20
+ built-in converter
+ -x (convert) can now take form -x in_enc,out_enc causing enca to behave
like a dull converter
+ new options -e and -E (disable internal/external converter)
+ new option -l (print internally-convertible encodings)
enca-0.5.0: 2000-07-17
* -p (prefix result with file name) causes enca to print what is converted
and how
* iso8859-2/cp1250 recognition improved
- doesn't spawn external converters as fast as is possible, but waits for
them to return
- fixed `Unrecognized encoding' when winner is 1250 (from 0.4.3)
- corrected -d (print details) table alignment
enca-0.4.3: 2000-07-14
* -d (print details) prints encodings alphabetically sorted
- corrected short encoding name t1 -> cork
- division-by-zero bugfixes
enca-0.4.2: has never been released
* options -m/-M ([don't] use iso8859-2/cp1250 hack) no longer supported
- fixed showing standard input as empty string (<STDIN> is printed now)
enca-0.4.1: 2000-07-12
* default of 60 significant characters changed to 10
enca-0.4.0: 2000-07-10
+ first public release