# This is ChangeLog for changes before enca became a library.
# The new ChangeLog starts after that.
# Unfortunately, there's no ChangeLog for the transition.
# Note: EVERYTHING as changed file means global change, i.e. `every file where
# this change was applicable was changed'.

2002-06-23 Yeti

	* src/texts.c, src/common.h, src/options.c: define e_isatty() as portable isatty in common.h, print help instead of waiting for input when stdin is a tty

v0.9.4

2002-03-04 Yeti

	* src/printresult.c: fixed print_detailed_report() not to truncate charset names in the table
	* src/convert.c: put a no-op file_seek() after file_write() to make subsequent file_read() work correctly (an ISO C ,feature`, not Enca's)

2002-03-03 Yeti

	* src/convert.c: fixed convert_builtin() to convert whole files, not just first 64kB

2001-08-16 Yeti

	* src/options.c: removed strange `:' from start of short_options

2001-08-01 Yeti

	* src/common.h: defined e_free(x) to set x to NULL after freeing it
	* src/EVERYTHING: formal changes

v0.9.3

2001-07-16 Yeti

	* configure.in, m4/librecode.m4, m4/iconv.m4, src/Makefile.am, src/epress.c: conversion libraries are put into CONVERTER_LIBS, so epress doesn't link with them (they are included in enca_LDADD), LIBS is not modified by librecode and iconv tests
	* iconvcap.c: added program_name definition for the case it would link with librecode (shouldn't happen)
	* configure.in: added AC_AIX
	* Makefile.am: added forgotten m4/crash-me to EXTRA_DIST
	* src/encnames.c: --list=encodings prints only `known' encodings
	* Makefile.am: added topline.sh (needed by update target) to dist
	* updated docs

2001-07-15 Yeti

	* src/lang.c: fixed giving random results due to usage of uninitialized memory in get_charsets()
	* src/printresult.c: fixed using double instead of int for ambiguous and multibyte flags in T_EncDetails structure
	* src/guess.c, src/printresult.c, src/printresult.h: added GUESS_QP_RESOLVED flag to be able to print logically consistent -d output in case of quoted printables
	* src/fileio.c, configure.in: include sys/stat.h unconditionally, we cannot compile w/o it anyway, warning added to configure
	* src/common.h: include both string[s].h and memory.h instead of just one of them
	* src/common.h: define PACKAGE, VERSION, DEFAULT_EXTERNAL_CONVERTER and DEFAULT_CONVERTER_LIST when we don't have config.h, to compile w/o it
	* src/options.c: fixed too many \n's in feature list
	* configure.in, src/texts.c: added isatty() and ttyname() test, defined stdout_isatty(), pager is run only when we are able to positively say stdout is a tty

2001-07-14 Yeti

	* src/options.c, src/common.h, src/texts.c: made USE_PAGER a feature +pager
	* m4/recode-bugs.m4: added TeX/..ISO-8859-2 recode crash test
	* m4/iconv.m4, src/convert_iconv.c: added ICONV_ARG2_CONST test
	* m4/typevar.m4, src/common.h, configure.in: removed the no longer needed long long int test, mere long int is used for mathint
	* configure.in: prepend "-Wall -pedantic" before CFLAGS when compiler is GCC
	* Makefile.am: add a new debian changelog entry when version or release changes, otherwise update time of the current entry

2001-07-13 Yeti

	* src/fileio.c, src/convert_recode.c: removed the no-buffering tricks making it slightly faster on modern GNU/Linux, but much slower everywhere else
	* src/EVERYTHING: put back the #ifdef HAVE_CONFIG_H stuff
	* src/texts.c: fixed typo expand_char() -> fputc() in poor man's compress
	* src/common.c: added const to stpcpy() *p declaration to keep qualifiers
	* m4/pager.m4, Makefile.am, configure.in, src/texts.c: test whether less accepts -F, possibility to disable pager at all

v0.9.2

2001-07-13 Yeti

	* src/guess.c: fixed reporting `7bit ASCII characters' after a successful detection of a 7bit encoding (TeX, UTF-7)
	* src/convert_recode.c: fixed typo HAVE_RECODEEXT_H -> HAVE_RECODEXT_H

2001-07-12 Yeti

	* src/filters.c: fixed hdata->eid's comparison before their initialization in lang_hook_2cs() (so the hook was never run)
	* src/convert_recode.c: more correct handling of the
	situation when we realise we cannot seek in temporary file during recode failure recovery
	* debian/Makefile.am, m4/Makefile.am, Makefile.am, configure.in: removed the first two, debian/ generated by toplevel Makefile, m4/ is just distributed
	* Makefile.am, src/Makefile.am: replaced $< with literal file names, some `make's don't always substitute first dependency name
	* src/epress.c: added program_name definition to placate librecode
	* src/common.h: stdlib.h and unistd.h are included unconditionally (we depend on them anyway)
	* src/texts.c: replaced the `is pager less?' test with a better one
	* configure.in, acconfig.h, m4/EVERYTHING: put descriptions directly to AC_DEFINE[_UNQUOTED]'s, almost got rid of acconfig.h
	* src/EVERYTHING: don't trust make passing `-DHAVE_CONFIG_H -I..' to compiler, #include "../config.h" unconditionally (we depend on it anyway)

2001-07-10 Yeti

	* debian/, Makefile.am: created debian/ and modified the Makefile.am to include it

2001-07-09 Yeti

	* configure.in, enca.spec.in: added MAINTAINER variable which can be used in rpm spec and debian/ files
	* src/texts.c: fixed wrong prototype putchar() -> expand_char() when no compressor is available
	* src/texts.c, acconfig.h, configure.in: compressed texts are displayed through a pager (if available)
	* autogen.sh, README.devel: added and made a note about it

v0.9.1

2001-06-25 Yeti

	* src/guess.c: fixed possible overflow in UCS-4 test (slightly changing what gets recognized as UCS-4)
	* src/guess.c: implemented unusual byteorders (3412 and 2143) tests using a single little-endian ucs-4 test what_if_it_was_ucs4() and shuffling bytes around shuffle_byte_order(), the same for UCS-4 EOL type tests
	* src/common.h: defined PVAR(f, v) [for debugging]
	* updated docs
	* packaged

2001-06-24 Yeti

	* src/encnames.h, src/encnames.c, src/guess.c: renamed SURF_PER_12[34] to more logical SURF_PER_[43]21

2001-06-02 Yeti

	* src/efilter.c, src/Makefile.am, src/texts.c, src/common.h: created efilter.c (filter making text files
	to compress better) and added reverse filter to texts.c
	* m4/tolower.m4, configure.in, acconfig.h, src/common.h: define our own implementation of tolower and similar unconditionally
	* improved various docs
	* BUGS, Makefile.am: generated from manual page section of the same name
	* src/efilter.c, src/Makefile.am, src/texts.c, src/common.h: efilter was a funny experiment but not very useful, removed again

2001-06-01 Yeti

	* src/getopt.c, src/getopt1.c, src/getopt_long.c, src/Makefile.am: removed the first two and made getopt_long.c non-generated, removed some unneeded stuff
	* src/common.c: put broken-{m,c,re}alloc fixes into conditionals (but no autoconf tests---I don't believe such broken systems really exist)

2001-05-28 Yeti

	* src/options.c: fixed prepend_env() segfault, improved diagnostics
	* revised temporary files usage, removed `/tmp bug' from docs
	* convert.c, convert_iconv.c, convert_recode.c: renamed tmpfile variable (synonymous to ISO function name) to tempfile, better not to tempt fate
	* script/b-map, script/b-cstocs, script/b-recode: added umask 077

2001-05-27 Yeti

	* src/encnames.c, src/encnames.h, man/enca.1: checks that all characters in charset/surface names are from some set of allowed characters
	* src/common.c: added hack for systems that fail on malloc(0) and similar

2001-05-20 Yeti

	* src/epress.c: forgotten [ISO C99] int16_t changed to int
	* src/filters.c, src/filters.h: added universal decide-between-2-cs hook
	* src/lang_cs.c, src/lang_sk.c: modified to use the universal hook, some formal changes
	* src/lang_ru.c: added maccyr/cp1251 hook (via the universal)
	* src/common.h, src/enca.c: sensible exit code (0, 1 or 2) returned
	* src/convert.c: convert() returns ERR_* error codes instead of just 0, 1
	* src/common.c, src/convert.c, src/convert_iconv.c, src/encnames.c, src/fileio.c, src/lang.c, src/locale_detect.c, src/options.c, src/texts.c: returned 2 on troubles
	* updated docs

2001-05-18 Yeti

	* src/license.c, src/license.h, src/texts.c,
	src/texts.h: renamed the former to the latter
	* src/epress.c, src/texts.c: implemented bzip2 and gz interface
	* m4/compress.m4, configure.in: added tests for libbz2 and libz, the best one found is used
	* configure.in, src/Makefile.am, src/options.c: COPYING.c and HELP.c are generated by epress and just linked with enca
	* src/Makefile.am: put getopt_long.c into BUILT_SOURCES
	* src/lang_ru.c, src/unicodemap.c, src/encnames.c: added maccyr charset
	* Makefile.am: added hook to delete BUILT_SOURCES before making dist

2001-05-17 Yeti

	* src/fileio.c: temporary files are created with umask 077

2001-05-06 Yeti

	* src/epress.c, src/COPYING.h, src/Makefile.am: created license compressor
	* src/license.c, src/options.c: license decompression and printing

2001-05-01 Yeti

	* m4/librecode.m4, m4/recode-bugs.m4, m4/long-text.l2: put test for bugs to recode-bugs.m4 (four bugs are checked, any will launch the warning)

2001-04-12 Yeti

	* m4/librecode.m4, configure.in: added a test for broken recode (no workaround [known], just prints a big warning message)

v0.9.0

2001-03-26 Yeti

	* just packaged

2001-03-25 Yeti

	* src/convert.c: fixed reading only first file block in copy_and_convert()
	* configure.in, src/EVERYTHING: memory.h test added, string/strings/memory header file inclusion put into common.h
	* src/fileio.c: fixed file_setvbuf() and file_open() (not enough magic)
	* src/convert_recode.c: fixed not opening original file
	* configure.in, src/convert_recode.c: added recodext.h test, is used when available for setting diacritics_only and ascii_graphics flags
	* src/convert_iconv.c: fixed not opening original file
	* src/convert_iconv.c: fixed writing no output in iconv_one_step()
	* src/convert.c: fixed intermixed child and parent process in convert_external()
	* src/fileio.c: added file_getline() requiring much less system calls than GNU libc's fgets()
	* src/locale_detect.c: fixed segfault when language detection failed
	* m4/tools.m4, configure.in, acconfig.h, src/options.c: consolidated
	autoconf macros regarding external converters
	* man/enca.1: minor corrections

2001-03-24 Yeti

	* src/encnames.c: added quoted printable surface
	* src/guess.c: integrated quoted printable into the guessing process
	* src/encnames.c, src/options.c: added print_public_surfaces() wrapper to allow printing --human-readable list of surfaces
	* src/lang_cs.c, src/lang_ru.c, src/lang_sk.c: replaced slovak and russian statistical data with some new (hope better), some hardcoded array sizes now computed from sizeof()
	* iconvcap.c: cosmetic changes

2001-03-20 Yeti

	* src/fileio.c, src/fileio.h, src/common.c, src/common.h: T_Buffer type moved from fileio to common
	* src/lang.c, src/lang.h, src/lang_cs.c, src/lang_ru.c, src/lang_sk.c: use the new T_Buffer type
	* configure.in: removed now-unused tests
	* src/EVERYTHING: removed redundant header inclusion, ensured satisfying header dependencies, try to use strings.h when string.h is not available
	* src/guess.c: fixed not initializing memory buffer
	* src/options.c: exchanged -E and -C option letters (incompatibility!)
	* src/guess.c: implemented quoted printable recognition (unused now)

2001-03-19 Yeti

	* src/fileio.c, src/fileio.h: some more magic employed, only the initiated can use them now
	* src/convert.c, src/convert_iconv.c, src/convert_recode.c, src/lang.c: changed to use the new file i/o interface
	* src/convert_iconv.c: fixed a memory leak

2001-03-18 Yeti

	* src/fileio.c, src/fileio.h: finished implementation
	* src/common.c, src/common.h: removed stuff belonging to fileio
	* src/locale_detect.c, src/guess.c, src/guess.h, src/lang.h, src/enca.c, src/convert.h: changed to use the new file i/o interface
	* src/guess.c: UCS tests check remainder of file length instead of buffer position (which is still used as a fallback in case of stdin)

2001-03-17 Yeti

	* configure.in, script/b-cstocs, script/b-cstocs.in, script/b-map, script/b-map.in, script/b-recode, script/b-recode.in, script/Makefile.am: (I'm stupid) reverted the last change
	* src/fileio.c, src/fileio.h: created (unique file i/o interface, at last)

2001-03-12 Yeti

	* m4/tools.m4: AC_MSG_WARN is used to print warnings
	* m4/iconv.m4: iconvenc.h is always created (the dark side is that iconv usability is no longer cached)
	* m4/tools.m4, configure.in, script/b-cstocs, script/b-cstocs.in, script/b-map, script/b-map.in, script/b-recode, script/b-recode.in, script/Makefile.am: external converters are located including path and it's then used for configure-time substitution in the scripts

v0.9.0pre5

2001-03-12 Yeti

	* src/common.h, src/common.c, src/encnames.h, src/guess.c, src/options.c, src/printresult.c, src/convert.c: eliminated rest of Settings, replaced by module methods, except read_limit which is still global
	* enca.spec.in: various improvements (see the spec file changelog)

2001-03-11 Yeti

	* src/EVERYTHING: Settings.ProgName and Settings.Verbose made globals: program_name and verbosity_level
	* src/common.h, src/options.c, src/convert.c: Settings.ExtConverter made convert.c module global extern_converter, set by
	set_external_converter()
	* src/common.h, src/options.c, src/printresult.c: Settings.PreFName made printresult.c module global extern_converter, set by print_set_prefix_filename()

2001-03-10 Yeti

	* enca.spec.in: use global cache file

2001-03-09 Yeti

	* Makefile.am: (generated!) enca.spec included to distribution (rpm -tb)

2001-03-02 Yeti

	* man/enca.1: spell-checked
	* src/options.c: corrected available lists in --list= help

2001-02-28 Yeti

	* src/lang.c: language is printed in lang_init() when verbose
	* src/guess.c: all Latin, Cyrillic and Greek letters have equal weight in UCS-2 and UCS-4 tests
	* src/guess.c: fixed bad default endianness and initial makes-sense-check in UCS-4 test
	* src/printresult.c: UCS-* rating is printed, too

v0.9.0pre4

2001-02-26 Yeti

	* just packaged

2001-02-25 Yeti

	* src/options.c, src/encnames.c, src/printresult.c: conversion from output type to what-name is done in encnames.c, introduced new output types and what-names ALIASES and NONE, --name=aliases prints list of aliases, removed --list=aliases, --list=encodings is sensitive to value of --name= instead, --list=names lists all valid --name= values, default output type is now NONE (initialized after option processing when needed)
	* updated rest of docs
	* configure.in, m4/libm.m4: fixed `make install' fail due to dependence on libtool: got rid of libtool dependence, created our own math library test

2001-02-24 Yeti

	* configure.in, acinclude.m4, m4/: removed acinclude.m4, created m4 dir and put the definitions there

v0.9.0pre3

2001-02-23 Yeti

	* src/options.c: added --list=lists listing
	* NEWS: updated
	* man/enca.1: partly updated

2001-02-22 Yeti

	* configure.in: added feature and failure lists
	* acinclude.m4: cleanup, defined some *_ok variables for feature list, fixed some `! test ... = ...' to `test ... != ...'
	* script/b-cstocs, script/b-map, script/b-recode: `#!/bin/bash' changed to `#!
	/bin/sh' (even if I'm not sure whether the scripts are 100% sh-compatible)
	* iconvcap.c: initial ASCII test changed back to ISO-8859-1 since the former is not supported on some systems (SunOS, IRIX, ...) having otherwise-usable iconv
	* src/getopt_long.c: generated from getopt.c and getopt1.c, but distributed (mainly because automake is unable to understand something we have in LIBOBJS can be generated...)

v0.9.0pre2

2001-02-22 Yeti

	* configure.in, src/Makefile.am: merged src/getopt.c and src/getopt1.c to src/getopt_long.c to allow simple AC_REPLACE_FUNC() be used for getopt_long optional build
	* acinclude.m4: simplified iconv test and added check for libiconv

2001-02-21 Yeti

	* Makefile.am: corrected iconvconf.h to iconvenc.h
	* src/options.c: fixed undefined and misused ENCA_ENV_VAR when we don't have wordexp() to parse it
	* acinclude.m4: iconvenc.h is always created since I'm not able to explain automake we don't need it when we don't have iconv

v0.9.0pre1

2001-02-20 Yeti

	* src/convert.c: trivial conversions from ASCII are carried out in built-in converter
	* src/EVERYTHING: cleaned terminology encoding vs.
	charset; encoding means (charset,surface) pair
	* src/options.c: converter names are case sensitive again
	* config.h is included only when HAVE_CONFIG_H
	* iconvenc.nul.h, acinclude.c, src/encnames.c: iconvenc.nul.h merged into encnames.c (needed only there when !HAVE_ICONV)
	* configure.in, src/getopt.c, src/getopt.h, src/getopt1.c, src/options.c: added GNU getopt so long options can be used everywhere (needs testing)
	* src/options.c: -m as complement to -M resets affected options to defaults

2001-02-19 Yeti

	* src/common.h, src/options.c, src/printresult.c: added iconv name output type, `-i' (only when HAVE_ICONV)
	* src/options.c: `-n' changed to `-e'
	* src/options.c: output types (except `-x') can be specified as -n NAME
	* iconvcap.c: updated for the new encodings and changed ICONV_ prefix to ICONV_NAME_, also changed the suffixes to pure alphanumeric, ISO-8859-1 test changed to ASCII
	* src/convert_iconv.c: updated to the new encoding/surface model
	* src/guess.c: try to detect swapped UCS-2 from the first byte pair, too
	* src/options.c: pointer voodoo to comply with ISO C in print_some_list(): one more level of indirection in abbreviation table data
	* src/options.c, src/encnames.c: iconv name output type defined always, but when iconv names are not available error message is printed instead
	* iconvconf.h, iconvenc.h: the former renamed to the latter
	* iconvenc.h.nul: created. contains all iconv names defined to NULL
	* acinclude.m4, src/encnames.c, src/convert_iconv.c: iconvenc.h is either successfully generated by iconvcap or copied from iconvenc.h.nul so it always exists and is always #included
	* configure.in, src/options.c: getopt.h test (defines getopt_long(), etc.)
	* src/common.c: expand_abbreviation() returns pointer to whole abbreviation structure so it's possible to fetch the expanded name, too; all callers changed
	* script/bcstocs, script/brecode, script/b-cstocs, script/b-recode: the first two renamed to the second two
	* script/b-map: created.
	perl `map' wrapper

2001-02-18 Yeti

	* src/common.c: e_tmpfd() now returns empty string as file name when it fails
	* src/options.c: max read limit increased to 16MB
	* src/options.c: default converter list changed, now is
	    "built-in,librecode" for +librecode-interface, [+-]iconv-interface
	    "built-in,iconv" for -librecode-interface, +iconv-interface
	    "built-in" for -librecode-interface, -iconv-interface
	* src/convert_recode.c: cleaned-up to use the new encoding/surface model
	* src/convert.c, src/convert_common.h, src/convert_recode.c: request formatting (either for printing or for recode requests) function format_request_string()
	* src/convert_recode.c: temporary copy of original file is not deleted when rescue of damaged original fails (what a paranoia!)
	* src/common.c: e_fopen() and e_fclose(), fopen() and fclose() wrappers, used in various places
	* src/EVERYTHING: perror(NULL) changed to perror("") since some C libraries don't understand the former to print empty prefix string
	* src/encnames.c, src/options.c: added `surfaces' to --list option
	* src/convert.c: valid converter names are printed one per line
	* src/encnames.c: added iconv name to T_EncInfo, all defined to NULL when iconv interface is not built
	* src/common.c: implemented abbreviation searching expand_abbreviation()
	* src/options.c: print_some_list() uses the new abbreviation engine
	* src/convert.c: add_converter() uses the new abbreviation engine (needed some data types shake-up)
	* src/lang.c: got rid of `initialization discards qualifier...'
	warnings by shaking up with the consts (no functionality changed)

2001-02-17 Yeti

	* src/common.h: surface type formalized to surfint (instead of uint16)
	* src/encnames.c: implemented surface <-> name conversion
	* src/encnames.c, src/guess.c: changed UCS surfaces to recode style
	* src/options.c, src/enca.c: updated parse_x_arg(), removed dull-converter-like behaviour
	* src/locale_detect.c: got rid of str[c]spn() (they are ISO C, but autoscan tries to convince me I should check for them), flag-table approach is more efficient anyway
	* src/common.c: commented out strstrcount() (not needed)
	* src/encnames.c, src/encnames.h: empty surface SURF_REMOVE (to get only "/" as surface name) (dirty)

2001-02-14 Yeti

	* src/options.c: option letter space polluted by various listing options cleaned by introducing --list=WORD (incompatible with previous -l usage)
	* src/encnames.c: print_encoding_aliases() prints list of accepted encoding aliases (`recode -l' style)
	* src/encnames.c: encodings sorted alphabetically (by canonical name)
	* src/lang.c, src/lang_cs.c, src/lang_sk.c, src/lang_ru.c: ability to print what regular encodings belong to this particular language

2001-02-12 Yeti

	* src/lang.c: fixed not initializing language filter report for language `none'
	* src/encnames.h, src/encnames.c: introduced ENCF_MULTI flag marking multibyte encodings
	* src/printresults: heavily improved details (but still misses surfaces)
	* src/options.c: file name is printed with details except when user specifically asks not to print it
	* src/options.c: help texts updated
	* src/encnames.c, src/printresults.c: cstocs names of encodings not known to it changed to ???, but still printed the same in details
	* src/defaults.h: removed.
	all definitions moved to appropriate C files (or common.h)
	* src/encnames.c, src/options.c, src/printresult.c: removed option `-g'
	* src/encnames.c: presqueezed aliases replaced by normal, we now squeeze names when needed (with hashing it's OK), also changes canonical names
	* config.h.top, configure.in: removed the former and add -D_GNU_SOURCE directly to CFLAGS
	* configure.in: removed warnings
	* configure.in: integer type sizes tested only when system doesn't provide stdint.h (and int's default changed to ISO C minimum)
	* configure.in, acinclude.h: moved my tests to the latter (newly created) to make the former more readable
	* guess.c: TEX_* and UTF_* defines moved to appropriate functions (as const)
	* guess.c: fixed missing print_flags_or() for language hooks (from ????)

2001-02-11 Yeti

	* src/encnames.c: hashing is used to find encoding names
	* src/options.c: fixed not recognizing output encoding (from 2000-02-02)
	* src/unicodemap.c: `-l' prints each group on separate line

2001-02-10 Yeti

	* src/guess.c: handle correctly case when a language has only one encoding
	* src/guess.c: multibyte tests use count table for fast rejecting
	* src/guess.c: surface detection, eol_surface()

2001-02-08 Yeti

	* src/lang_ru.c, src/encnames.c, src/unicodemap.c: added IBM 866 charset
	* src/filters.h: made ibm866 filter alias to keybcs2 filter (they're identical)
	* src/guess.c: saved stdin is not restored and `up' is not recomputed when nothing was filtered out

2001-02-06 Yeti

	* src/detect_lang.c, src/locale_detect.c: renamed the former to the latter
	* src/filters.c, src/filters.h, src/lang_cs.c: first two created (language filter repository) and all box-drawing filters moved there
	* src/filters.c: added box-drawing filters for more encodings
	* src/common.h, src/options.c, src/printresult.c: implemented -n output type (prints `canonical' encoding name used internally in enca)
	* src/lang_sk.c, src/lang_sk.h, src/lang_ru.c, src/lang_ru.h, src/lang.c, src/encnames.c, src/unicodemap.c: added
	Slovak and Russian languages and appropriate encodings
	* src/unicodemap.c: introduced the idea of compatible encodings; LATIN2 and CYRILLIC groups defined
	* all headers: revised header file dependencies
	* src/lang.c, src/locale_detect.c: implemented the notion of no language---when user sets language to `none' no regular encodings are processed (so when you say `-UL none' only pure ascii gets recognized)
	* src/options.c: fixed broken `-P' (an old bug)

2001-02-05 Yeti

	* src/lang.c, src/lang.h, src/lang_cs.c, src/lang_cs.h: created the first two and moved all regular encoding routines to them
	* src/options.c: implemented language settings (added Language to Settings)

2001-02-04 Yeti

	* src/unicodemap.c: is_subset_consistent() now uses translation table for checking
	* src/guess.c: incorporated multibyte encodings into the guessing process
	* src/guess.c: implemented an absolute likelihood test applied to the relative winner

2001-02-03 Yeti

	* src/common.c, src/common.h: introduced type flagint for tables of flags (defined as short int), all callers changed
	* src/options.c: fixed accepting invalid option values, ReadLimit must be a multiple of 4 (to make UCS tests more reliable)
	* src/guess.c, src/printresult.c: added number of 8bit's (up) to details
	* src/guess.c: finished is_valid_utf8(), is_valid_utf7(), looks_like_tex(), looks_like_ucs2() and looks_like_ucs4() tests (except surfaces)

2001-02-02 Yeti

	* src/unicodemap.c: added many missing characters to maps
	* src/unicodemap.c: shortened maps by starting them from first character that doesn't map to itself
	* src/options.c: finally removed old dull-conversion syntax
	* src/options.c, configure.in: long `GNU style' options---when getopt_long() function is available, configure test added
	* src/options.c, configure.in: program_invocation_short_name is used when offered by system, otherwise strip_path() is used to make it
	* src/options.c, configure.in: value of environment variable ENCAOPT is prepended before command
	line options---when wordexp() function is available, configure test added

2001-02-01 Yeti

	* src/EVERYTHING: encodings are no longer identified by name, but integer eid is used; T_Encoding contains---beside eid---surface, but it's not used for anything yet
	* implemented name squeezing and alias recognition---almost any sensible (and a whole bunch of stupid) encoding identifier is recognized
	* introduced new, less stupid, `canonical' names---passed to external converter and looked-up faster than the others
	* src/guess.c, src/enca.c, src/printresult.c: result is passed as T_Encoding type
	* src/guess.c, src/printresults.c, src/lang_cs.c: print implemented as `object', guess details are fed by different functions when needed (no globals and circular dependencies anymore)
	* src/printresults.c: mapping from output type to encoding name is used instead of ugly switch ()
	* src/unicodemap.c: character 0xa4 from koi8cs2 converted to tilde (0x7f)
	* src/options.c: refuse to serve as a dull converter when input encoding is not known to us
	* src/common.c, configure.in: implemented stpcpy() when not provided by system (configure test added)
	* src/guess.c, src/enca.c: 7bit ascii is no longer discriminated and no longer causes enca to return nonzero error code

2001-01-30 Yeti

	* src/langdata_cs.c, src/langhook_cs.c, src/lang_cs.c: the first two merged to the third containing [almost?] all language specific stuff

2001-01-29 Yeti

	* src/encnames.c: created.
	(language independent encoding name handling)

2001-01-28 Yeti

	* src/common.h, configure.in: tests for stdint.h and integer type sizes used to define uint16 and uint32
	* src/unicodemap.c: changed map storage type to uint16 thus saving several kilobytes, removed need for an empty last table
	* src/guess.c: added looks_like_ucs2(), looks_like_ucs4() and partially looks_like_utf7()

2001-01-27 Yeti

	* src/guess.c: reimplemented look_like_TeX() in a more efficient way
	* src/EVERYTHING: static file globals used only in one function moved into the appropriate function
	* man/enca.1: made more human readable, different macros are used

2001-01-26 Yeti

	* src/detect_lang.c, src/detect_lang.h: detect_lang() accepts string parameter now
	* src/guess.c: implemented utf-8 parse test is_valid_utf8() and (La)TeX-encoded accents test look_like_TeX()

2001-01-25 Yeti

	* src/detect_lang.c, src/detect_lang.h, src/Makefile.am, src/defaults.h: guessing user's preferred language from locale (not used yet)
	* configure.in, acconfig.h: implemented check for locale.alias

v0.7.7

2001-01-01 Yeti

	* e_read4() now consistently sets number of bytes in buffer to zero even when reading of zero bytes is requested (seems to break nothing, but...)
	* documentation synchronized

2000-12-31 Yeti

	* `none' is accepted as converter name and causes clearing the converter list
	* finished iconv interface
	* changed some messages, hope no one parses them
	* ssize_t availability checked by configure

2000-11-29 Yeti

	* long long int availability is now explicitly checked by configure
	* unsigned char -> byte
	* implemented e_write4() and e_read4() that allow to specify buffer address and size (e_read_with_limit replaced by e_read4())
	* fixed files closed twice in copy_and_convert()
	* fixed bad return value tests for e_read() and e_write() in copy_and_convert()
	* implemented iconv_one_step() and enc_trans() for iconv, so it is almost usable on iconv-transitive (GNU) systems now

2000-11-28 Yeti

	* iconv transitivity is now explicitly checked by iconvcap instead of checking for gconv

2000-11-26 Yeti

	* misspelled `Europian' corrected to `European' in macce long name

2000-11-21 Yeti

	* fixed some nonsubstantial stuff, redundant #includes, docs and comments forgotten in too hastily released v0.7.6
	* convert_iconv.c separated from convert.c

v0.7.6

2000-11-20 Yeti

	* unicode mapping data made NULL terminated instead of fixed-length
	* discovered bug in gcc :-(
	* all upper-half-of-ascii structures promoted to 256 characters (too many changes to record here)
	* support for cork encoding ceased
	* help text updated and divided to sections (thus fixing warning about too long, ISO C violating, string)
	* get_in_enc_list() and get_converters() changed to print_* printing the lists, called directly from process_opt()
	* convert_recode.c separated from convert.c to make the amount of #if's bearable
	* encoding list made NULL terminated instead of fixed-length
	* reversed order of generating recoding table, so character with lowest 8bit code is always output for synonyms (instead of highest)
	* implemented converter flags CONV_EXTERN, we do not try to recover after external converter failure, since it's impossible by definition
	* xlat table cache
	bubble sorting on use
	* man page and other documentation synchronized

2000-11-19 Yeti

	* a simple test suite introduced in test/ (not distributed)
	* fixed stupid || -> && bug in e_close() causing failure in pipes
	* copy_and_convert() now processes what is saved in io_buffer always when called on stdin (not only when recoding is done too), this fixes bug in conversion to the same encoding in pipe
	* fixed comparing return value of e_tmpfd() with zero (instead of -1) causing not calling external converter in redirection
	* stdin is always copied to stdout when we are not able to perform conversion, regardless of reason
	* some error messages improved

2000-11-18 Yeti

	* it's up to every converter to translate cstocs encoding names to its native names---implemented recode's enc_trans(), others use cstocs
	* rearranged converters to work in any order as expected, implemented ERR_* (internal) error codes in convert.c
	* implemented -c printing list of all valid converter names (get_converters())
	* implemented e_tolower that cannot fail (and appropriate autoconf test)
	* split process_opt() to make it less of a monster
	* error messages begin with enca file name without path (strip_path())
	* some preliminary iconv support, the converter doesn't actually exist yet
	* erroneous values of most command-line parameters no longer cause enca to abort
	* request cache bubble sorting on use

2000-11-16 Yeti

	* fixed destroying outer and request after every use
	* implemented librecode request cache optimization
	* implemented e_unlink()
	* any converter combination/order can be specified on command line with -E option (meaning changed!)
	* external converter failure made non-fatal

2000-11-14 Yeti

	* bcstocs and brecode return exit status and message is printed when it fails

2000-11-05 Yeti

	* made iconv configure test, all configure results are cached

2000-10-31 Yeti

	* config.h is always included first (even before system headers)

2000-10-30 Yeti

	* fixed freeing not allocated request in convert_recode()
	* implemented some more verbosity in converter
	* strstr() alternative moved to common.c
	* implemented strstrcount() counting occurrences of needle
	* '/' is added after both in_enc and out_enc when creating recode request string, partially helps with touchy librecode, but sometimes it fails anyway
	* monstrous convert_recode() split to several functions, nevertheless remains monstrous

2000-10-25 Yeti

	* more autoconf madness

v0.7.5

2000-10-25 Yeti

	* strstr() defined in options.c when not provided by system library

2000-10-22 Yeti

	* copying policy changed to GNU/GPL version 2, explicitly, instead of version 2 or any later version
	* librecode autoabort feature finally disabled
	* fixed request cache initialization bug
	* fixed O_RDONLY zero/nonzero portability problem
	* convert_recode() restores file from temporary copy when librecode converter fails
	* encoding names are not duplicated but assigned in get_encodings()
	* heavy use of const modifier (too many changes to record here)
	* solved cyclic dependency between langdata and langhook headers by making hookdata plain pointer (i.e.
void*) and moving T_HookData definition to .c file

2000-10-18 Yeti
* removed ffname_[rw]() filename wrapper in cases we cannot get stdin/stdout (like e_lseek())
* stdin/stdout is internally passed as NULL (instead of empty string) so one can no longer call enca '' '' to make it wait for several stdin's
* return values of recode_new_outer() and recode_new_request() are checked
* implemented librecode request caching, outer is now global and new request_cache too, nevertheless caching strategy is poor
* implemented e_strdup() (strdup() is not ISO/POSIX), now heavily used instead of strlen/malloc/strcpy sequences

2000-10-15 Yeti
* fixed closing stderr in e_close() (though calling it with file descriptor 2 would be a bug anyway)
* cleaned types of variables library functions are called with (unsigned long int -> size_t, etc.)
* unsigned long long int is used when available for weight/occurrence computations, mathint type introduced
* cleaned names possibly conflicting with POSIX reserved names and libc header reserved names
* recode_scan_request() return value is now checked for success
* falls back to external converter when librecode cannot convert due to erroneous request (as one would understand from man page)
* prints error message when conversion is impossible
* merged all file-copying code in convert.c to copy_and_convert(), convert_internal_stdin() optimized out of existence
* return codes in convert.c changed, -1 is now returned as error code only by low level i/o from common.c
* get/generate_xtable() are no longer able to generate identity (not needed)

2000-10-14 Yeti
* cleaned some T_GResult.boxout residua
* addresses of language filter/hook reports printing functions passed in T_GResult
* some #include <...> corrected to "..."
for local files
* corrected filter reports in SMS mode

2000-10-13 Yeti
* binary filter is more drastic, requires BIN_MIN_TEXT_CHAR letters (instead of non-binary characters) to switch back to text mode

v0.7.4
2000-10-11 Yeti
* fixed behaviour in SMS mode (mismatched if([!]Settings.SMSMode); might even cause coredumps?)
* implemented box-drawing character filter filter_boxdraw_out() and put it into langdata_cz.c, again with interface lang_filter(); not run in SMS mode (printresults.c updated accordingly)

v0.7.3
2000-10-08/09 Yeti
* fixed reading only 4096 bytes from pipe (in a crude way, even though trying to lower the number of system calls as much as possible) from 0.7.1
* regenerated data (hopefully for the last time) and realized pair/... based guessing is nonsense
* implemented filter that filters out blocks of probably binary data (stdin is saved when conversion is required), filter_binary_out(); number of filtered characters is printed by -d
* stdin is copied to stdout when its encoding is unknown (much more logical, but can break existing scripts)

v0.7.2
2000-10-05 Yeti
* corrected F_EMPTY message (F_EMPTY now really means file is empty)
* second best is marked in -d output by `-' (added p_esec to T_GResult)

2000-10-04 Yeti
* *_cs filenames corrected to *_cz (Czechoslovakian -> Czech)
* lang_hook() divided to interface lang_hook() and lang_hook_stz() doing the real work (preparation for more langhooks)
* information about active language hooks saved in active_hooks (for printing)
* perhaps solved language hook info printing dependency (when -d): new function print_lang_hook_data() is defined directly in langhook_cz and called in print_results

2000-10-02/03 Yeti
* recoding chains made possible by starting -x parameter with `..'
* BSD [s]random() changed to ISO [s]rand()
* fixed all remaining warnings except 1.4 kB long string containing help text (compile tried with -Werror -Wall -pedantic -ansi -Wstrict-prototypes -Wmissing-prototypes -Wmissing-declarations -Wnested-externs)

v0.7.1
2000-10-01 Yeti
* moved to low level i/o: e_fopen() changed to e_open(), etc., e_write(), etc. introduced, convert_die() abandoned
* return codes slightly changed to distinguish between guessing failure and i/o failure
* get_xtable() and generate_xtable() changed to be able to generate identity (that is then used in convert_internal_stdin())
* not all i/o failures cause aborting
* documentation improvements and corrections, as usual

2000-09-30 Yeti
* temporary file creation is tried 3x and existing temporary files are not overwritten

v0.7.0
2000-09-25/26 Yeti
* removed option `disable external converter' (-e), empty -C parameter can be used instead
* e_tmpf() divided to e_tmpf() creating the file and e_tmpfname() generating temporary file name
* fixed memory leak (not freeing tmpfname) in convert_recode() and convert_external()
* man page updated

2000-09-24/25 Yeti
* fixed io_pos == 0 bug when enca converts stdin as dull converter
* unicode.* renamed to unicodemap.*
* -x encoding separator changed to .. to be compatible with recode and comply with RFC 1345 (allowing comma in encoding names)
* multiple ..
allowed in -x argument to make it possible to specify recode chains (but only recode understands them)
* ffname() divided to ffname_r() and ffname_w(), one returns stdin name, the other stdout name
* implemented convert_recode() converting via recode library
* conditionally added program_name and other stuff required by librecode
* == changed to = in bcstocs and brecode to make them work in older bash
* conversion to the same encoding no longer causes warnings
* conversion to the same encoding works correctly even for stdin
* ascii no longer considered unknown encoding
* introduced verbose option (-V), converter now prints what it is doing on -V, not on -p
* librecode interface _disabled_ by default in configure.in

2000-09-23/24 Yeti
* two underscores removed from beginning (and end) of #defines to comply with ISO C
* guess read buffer and convert read/write buffer merged (the same applies to BUFFER_SIZE and Settings.ReadLimit) and this io_buffer made persistent (created by new functions enca_init(), destroyed by enca_done()) so information from stdin does not get lost
* user specified read limit is rounded up to nearest multiple of 16
* convert_internal() split to convert_internal_file() and convert_internal_stdin(), which converts stdin to stdout
* introduced e_tmpf() creating temporary files since none of glibc functions does The Right Thing (except maybe non-POSIX tempnam())
* convert_external() creates temporary file and puts stdin there if stdin is to be converted, and passes `-' as fourth parameter to converter instructing it to send output to stdout
* bcstocs and brecode rewritten to recognize the fourth parameter and put under GNU GPL
* a new output type OT_ALIAS (option -g) introduced that lists all known aliases, and -f does approximately the same as in 0.6 series again
* resolved strange dependencies between langdata and langhook
* fixed e_tmpf() passing-by-value bug
* encoding names specified as -x parameters are converted to lowercase
* fixed terrible typo || -> && in bcstocs and
brecode

2000-09-22/23 Yeti
* introduced recode wrapper script, brecode
* introduced is_subset_consistent() checking if characters have the same meaning in two encodings (unfortunately makes guess.c dependent on unicode.h)
* implemented -M (by function sms_hook())
* activation and usefulness of sms_hook() reported by -d
* added Cork encoding to unicode.c to make -M useful (but conversion from/to Cork is still quite bad, and should be done by cstocs/librecode only)
* implemented e_malloc(), e_calloc() and e_realloc() aborting on failure, declared free() in common.h, defined NEW() allocator [this one is really braindead]

2000-09-21/22 Yeti
* fixed not-initializing-encoding-table coredump
* added new output type OT_RFC1345 (option -r), some OT's renamed
* added -r and -M options (-M accepted, but not implemented yet)
* prefixing with filename is now on by default when run on more than one file
* added RFC 1345 names and in consequence renamed name* members of S_EncRaw and S_EncStat to keep namespace consistent
* much more (in fact all known) aliases listed in full names -- maybe too many
* in detailed output the file name is now printed at the top, and cstocs and RFC 1345 names are printed separately in (bottom) result part
* hook name is not fixed, but part of langdata_cs.c (still not very clean, though)

2000-09-17/18 Yeti
* moved to autoconf/automake (not respected by the C sources yet)
* reorganized to deep package
* fixed `cd .
&& pwd` bug in automake :-(
* old config.h renamed to defaults.h
* rawdata.c and rawuni.c merged with encdata.c and encuni.c mainly to avoid problems with automake
* bcstocs now prints error message when it fails, to make things clear
* encopt* renamed to options*, encuni* to unicode* and encdata* to langdata*
* enca.c split to convert.c, guess.c, printresult.c and langhook.c, header files introduced, some functions renamed and their parameters changed (too many changes to list here) (greetings to Halef)
* process_file() now only controls what actions should be taken and no longer does anything itself
* print_results() no longer calls converter, this is done in process_file()
* results of guessing passed encapsulated in struct S_GResult
* language specific files (lang*) got _cs extension
* HELP_TEXT and VERSION_TEXT defined as strings instead of macros
* common stuff moved to common.c/common.h
[enca now compiles but doesn't work yet]

v0.6.2
2000-08-17 Yeti
* help texts (-h and -v) improved, thanks to Halef
* some other minor changes in docs

v0.6.1
2000-08-15 Yeti
* tarball repacked with files encuni.c, rawuni.c, rawuni.h missing in 0.6.0 tarball
* bcstocs magic line changed to #!/bin/bash
* TODO updated to reflect current work on 0.7

v0.6.0
2000-07-20 Yeti
* formal changes in rawdata.c
* internal converter implemented (rawuni.c, encuni.c, enca.c)
* -x can now take form -x in_enc,out_enc
* introduced options -e and -E
* introduced option -l
* READLIMIT_MAX changed back to 1MB
* man page improved(?)
v0.5.0
2000-07-17 Yeti
* waits for converter to return (much slower conversion, but doesn't produce `cannot fork()'s)
* -p makes -x print what it is doing
* fixed `Unrecognized encoding' when winner is 1250 (from 0.4.3)
* _exit() in converter caller corrected to exit() since we don't use vfork()
* use EXIT_SUCCESS and EXIT_FAILURE instead of 0 and 1
* aborts when it cannot close open file (since it means something very bad is happening)
* added forgotten z-with-check to il2/1250 hack
* il2/1250 hack put to separate function
* (minimalistic) config.h introduced
* READLIMIT_MAX put to config.h and changed to 4MB
* recomputed statistical data (probably not the last time)
* also other defaults put to config.h
* corrected -d table alignment
* fixed not initializing significancy table
* added -f to cp and rm in Makefile
* some other code clean-ups

v0.4.3
2000-07-14 Yeti
* corrected short encoding name t1 -> cork
* corrected some dividing by zero resulting in inf/nan rating
* il2/1250 hack made more logical
* -d prints encodings alphabetically sorted
* significancy table is computed only once

v0.4.2
2000-07-13 Yeti
* removed options -m/-M since it's nonsense to maintain something marked `do not use' in man page
* -p now shows stdin as , instead of empty string
* ascii and ??? made `encodings'
* bcstocs doesn't depend on mktemp
* some other code clean-ups

v0.4.1
2000-07-12 Yeti
* silly default of 60 significant characters changed to 10
* minor bugfixes and corrections

v0.4.0
2000-07-10 Yeti
* rewritten from scratch in ISO C