#============================================================================
# Enca v1.19 (2016-09-05)  guess and convert encoding of text files
# Copyright (C) 2000-2003 David Necas (Yeti) <yeti@physics.muni.cz>
# Copyright (C) 2009-2016 Michal Cihar <michal@cihar.com>
#============================================================================

TO THE NEXT RELEASE:
(this list must be empty at the time of release)

IN FUTURE:
(should be done, but maybe not right now)

* LCUC check for Cyrillic charsets.
* Backups -- like cp, mv, etc.  This will be hard to get right with all the
  silly converters.
* More tests
* Structured documentation (the manual page is ugly)
  - keep a reasonably brief manual page
  - put all the boring doc stuff somewhere else, there are possibilities:
    info: searchable, has links, partly portable, has console viewers
    HTML: poorly searchable, has links, most portable, has console viewers
    TeX (ps): not searchable, no links, portable, most pleasant to read,
          no console viewers
    => use SGML (or info itself?) and generate the others


MAYBE SOMEDAY:
(when I am in the mood for it; items are freely moved here and removed again)

* Detect all-caps texts correctly.
  After several experiments it seems we have to
  - use pair occurrences, at least, with specifically computed
    difference-maximising weights
  - guess in two steps:
    - first with uncapitalization and pair weights, and check whether the
      sample looks like natural text (garbageness test, but better)
    - if the first approach fails, do it as we do it now
* design better levels of verbosity/warnings (or: remove the --verbose option,
  keep important messages and remove all others?)
  0: only messages followed by exit(EXIT_FAILURE) (or abort()) are printed
     plus `cannot convert...'
  1: all nonfatal errors/warnings
  2: what converters are tried, what language gets detected (do not duplicate
     --details)
  >2: debug
* _real_ paranoid behaviour ensuring that nothing gets lost and that
  conversion output is either correctly converted text or untouched original
  (requires major redesign of all the conversion stuff)
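
The verbosity levels proposed in the list above could be gated roughly as
follows.  This is only a sketch under assumptions: MsgClass, the MSG_* names,
msg_enabled() and report() are hypothetical and do not exist in enca; only the
mapping of message kinds to levels 0, 1, 2 and >2 is taken from the list.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message classes.  The value of each class is the lowest
 * verbosity level at which it is printed. */
typedef enum {
  MSG_FATAL   = 0,  /* followed by exit(EXIT_FAILURE) or abort(),
                       plus `cannot convert...' */
  MSG_WARNING = 1,  /* all nonfatal errors/warnings */
  MSG_INFO    = 2,  /* converters tried, language detected */
  MSG_DEBUG   = 3   /* anything above level 2 */
} MsgClass;

/* Return nonzero when a message of class `cls' should be printed at the
 * given verbosity level. */
int
msg_enabled(int verbosity, MsgClass cls)
{
  assert(verbosity >= 0);
  return verbosity >= (int)cls;
}

/* printf-like reporting to stderr, filtered by msg_enabled().
 * Returns the vfprintf() result, or 0 when the message is suppressed. */
int
report(int verbosity, MsgClass cls, const char *fmt, ...)
{
  va_list ap;
  int n;

  if (!msg_enabled(verbosity, cls))
    return 0;
  va_start(ap, fmt);
  n = vfprintf(stderr, fmt, ap);
  va_end(ap);
  return n;
}
```

With a single ordered scale like this, each message carries its class at the
call site and a repeated --verbose only has to increment one integer.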


NEVER:
(you can do anything GNU GPL v2 allows, but I'll refrain)

* features that nobody needs (mm, well, ... ok, let it be)
* duplicate other tools' functionality more than necessary; use them instead
* dependency on anything that is not ISO C and/or POSIX (moreover do not use
  braindead features of both); important functionality must be present
  everywhere; nevertheless, enca can be smaller, faster or cleverer on some
  (GNU) systems
* localization; please correct my English instead ;->
* converter calling generalization (would require including the whole wordexp
  thing in enca, and: launching external converter is Bad Thing(TM) anyway)
* data in run-time files (needs parser (could live with) and disallows hooks
  (can't live without))
* loadable module support (it's not very portable)
-------------


KNOWN ISO C CONFLICTS:
(perhaps to be solved someday)

All constants and typedefs.  They start with ENCA_ and Enca, but:

  Names beginning with a capital `E' followed by a digit or uppercase
  letter may be used for additional error code names.               [errno.h]

And additionally inside libenca (i.e. not so serious):
* libenca.h: #define EPSILON                                        [errno.h]
* filters.c: isvbox[]                                               [ctype.h]
* guess.c: #define isbinary                                         [ctype.h]
* guess.c: #define istext                                           [ctype.h]
* multibyte.c: is_valid_utf7()                                      [ctype.h]
* multibyte.c: is_valid_utf8()                                      [ctype.h]

Some probably can't conflict.
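
To see which of the names listed above actually match a reserved pattern, a
small check like the following can help.  It is a sketch, not part of enca:
name_is_reserved() is a hypothetical helper, and it only tests the two
patterns relevant here.  The errno.h rule is the one quoted above; the
ctype.h rule (names beginning with `is' or `to' followed by a lowercase
letter) comes from the ISO C future library directions and is not quoted in
this file.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: return nonzero when `name' matches one of
 *   errno.h: `E' followed by a digit or an uppercase letter
 *   ctype.h: `is' or `to' followed by a lowercase letter
 * Other reserved patterns of ISO C are deliberately not checked. */
int
name_is_reserved(const char *name)
{
  assert(name != NULL);
  if (name[0] == 'E'
      && (isdigit((unsigned char)name[1])
          || isupper((unsigned char)name[1])))
    return 1;
  if ((strncmp(name, "is", 2) == 0 || strncmp(name, "to", 2) == 0)
      && islower((unsigned char)name[2]))
    return 1;
  return 0;
}
```

Under this check EPSILON, anything starting with ENCA_, isvbox, isbinary and
istext match, while names starting with Enca, is_valid_utf7 and is_valid_utf8
do not, because `n' is lowercase and `_' is not a lowercase letter.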