README-file for the distribution of the Norwegian dictionaries for ISPELL.

DESCRIPTION

This distribution contains a big collection of Norwegian words (both
bokmål and nynorsk) and support files to make useful things from it.

The main file norsk.source contains 747500 words from the Norwegian
language.  Each word has a commonness indicator, and it is hyphenated
at compound points.

There is also a Makefile to assist in building dictionaries for Ispell
and other word processors, using a sensible subset of the available
words.  There is also a Makefile in the patterns directory which makes
hyphenation patterns for TeX based on the dictionary and a simple set
of hyphenation patterns that works on non-compound words.

The latest version is available at

http://spell-norwegian.alioth.debian.org/

Comments, suggestions and bug-reports to i18n-no@lister.ping.uio.no.

There is also a SourceForge project with a similar goal.  We should try to
join forces with them.  <URL:http://sourceforge.net/projects/spell-no>


BUILDING A NORWEGIAN ISPELL DICTIONARY

* Get the ispell sources and unpack them.

  cd /source
  tar -zxvf ispell-3.1.20.tar.gz

  You can also unpack the sources for the Norwegian dictionary now:

  cd ispell-3.1/languages
  tar -zxvf ispell-norsk-2.0.tar.gz

* Patch Ispell

  I have made a patch for ispell based mainly on other patches found
  on the net.  If you think you have found a bug in ispell, please
  make sure that it has nothing to do with this patch before
  reporting it to the ispell manager!

  The following things are done:

  1. An attempt is made to fix the backslash bug.  The patch for this
     was found at Ken Stevens' ispell.el site.

  2. Ispell can now parse HTML files thanks to a patch by Gerry
     Tierney.  Basically this means that a patched copy of ispell will
     ignore any mark-up tags or HTML entities in an HTML document when
     spell checking that document.  Any text inside an 'alt' attribute
     will however be checked.

     Examples: ispell index.html     # html tags will be ignored
               ispell -h README      # html tags will be ignored
               ispell -n index.html  # html tags will be spell-checked

     I have not been able to make the html mode work well when using
     ispell from emacs.  That doesn't matter too much, since ispell.el
     has its own skipping mechanism.

  3. Buildhash now accepts all characters between A and z as flags,
     not only the alphanumeric ones when MASKBITS=64.  This is needed
     by the Norwegian affix file.

  4. The AMS and breqn math environments are now skipped by ispell.

  5. Ispell gets the ability to suggest "- as a separation character
     in addition to - and space.  This only happens if such support is
     compiled in, i.e. the COMPOUNDBABEL flag must be defined, and it
     only happens in TeX mode and if the language is norsk.  It is
     useful to mark compound points in words to ensure good
     hyphenation when using LaTeX with Babel.  The Norwegian
     hyphenation patterns distributed in this package hyphenate almost
     every word in the Ispell dictionary correctly, but no guarantee is
     offered for other compound words.

  6. Added an -r switch, which is almost like the -a switch, but the
     suggestions are printed even if the word is found in the
     dictionary.  This is useful for hyphenating words and for
     eliminating rare words close to very common words.  There has to
     be some German out there wanting to make TeX hyphenate only
     compound words.

  7. Added a patch from the Red Hat RPM to avoid a compilation error in
     ijoin.c.

  So if you are feeling a little brave:

  cd ispell-3.1
  patch < languages/norsk/ispell-3.1.20.no.patch

  Additional patches might be needed on various systems.  The Red Hat
  source RPM is a good place to look if something fails.

* CONFIGURE ISPELL

  The file Config.X in the ispell-3.1 distribution contains
  configuration information for ispell (no ./configure yet).  The
  definitions are overridden by those in the file local.h, for which
  there is a local.h.samp.  The following local.h works for me on my
  Red Hat 6.0 system.  You have to adapt the file to the languages
  you have dictionaries for.

-----------------------------------------------------------------------
#define MINIMENU        /* Display a mini-menu at the bottom of the screen */
#define USG             /* Define this on System V */

#define BINDIR  "/usr/bin"
#define LIBDIR  "/usr/lib"
#define MAN1DIR "/usr/man/man1"
#define MAN4DIR "/usr/man/man4"

#define LANGUAGES "{american,MASTERDICTS=american.med+,HASHFILES=americanmed+.hash,EXTRADICT=/usr/dict/words} {norsk}"
#define MASKBITS 64
#define LOOK     "look"
#define CFLAGS   "-O3"  /* Mostly to speed up my batch operations */
#define LDFLAGS  "-s"
#define COMPOUNDBABEL
-----------------------------------------------------------------------

  It might be wise to try to build ispell only for English, to test that
  everything works, and add new languages afterwards.

  cd ispell-3.1
  make all

  This takes some time, but almost nothing compared to building the
  Norwegian dictionary.

* ADD LANGUAGES

  Get dictionaries for the languages you want to install from the
  ispell home page.  Unpack them in the appropriate directories.
  Update the LANGUAGES variable in local.h and remake.

  Make sure that there is enough free space to build the dictionary.
  If there isn't, the build process will fail miserably.  About 120 MB
  is needed!

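A quick check of the free space before starting the build could look
like the sketch below.  It is only an illustration and not part of the
distribution: the function names enough_space and free_kb are made up,
and the 120 MB threshold is simply the figure quoted in this README.
df -Pk is used because POSIX specifies its output format.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the ispell or norsk distribution.

# Succeeds if AVAIL_KB (free kilobytes) covers NEED_MB megabytes.
enough_space() {
    avail_kb=$1
    need_mb=$2
    [ "$avail_kb" -ge $((need_mb * 1024)) ]
}

# Free kilobytes on the file system holding the current directory.
# With -P, df prints one header line and one data line; the fourth
# field of the data line is the available space.
free_kb() {
    df -Pk . | awk 'NR == 2 { print $4 }'
}

if enough_space "$(free_kb)" 120; then
    echo "enough space to build the dictionary"
else
    echo "less than 120 MB free; the build will probably fail"
fi
```

Run it in the directory where the dictionary will be built, e.g.
languages/norsk.
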
  The Norwegian dictionary can be configured.  You can choose which
  categories of words to include, and how common a word has to be to
  be included.  This is documented in the Makefile in languages/norsk.
  This flexibility has its price: it takes a very long time and a lot
  of disk space, up to 120 MB, to build the dictionary.

  You can also customize the affix file to remove or add some forms of
  words.  For example you could choose to allow or disallow the
  spelling `komitéen'.  To do this you can make the file norsk.aff,
  edit it according to your needs, and make norsk.hash afterwards.
  Look for the word `valgfritt' in the file.  Bear in mind that
  norsk.aff depends on norsk.aff.in, so if you touch norsk.aff.in
  your edited norsk.aff will be overwritten.  Changing norsk.aff.in
  itself will not work as expected.

* INSTALL

  Before you install, you might want to test if ispell works.

  cd languages/norsk
  echo vurderingskriterier | ../../ispell -a -d norsk.hash

  should find vurderingskriterium.  Then

  make install


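If you want to script the test, the result lines printed by ispell -a
can be classified by their first character.  The sketch below is an
illustration only: the function name classify is made up, and the two
sample lines are hand-written stand-ins for real ispell output.  The
result codes themselves (*, +, -, &, ?, #) are the ones described in
the ispell manual page.

```shell
#!/bin/sh
# Hypothetical helper: classify one result line from "ispell -a".
classify() {
    case $1 in
        '*'*)           echo "found" ;;
        '+'*)           echo "found via root ${1#+ }" ;;
        '-'*)           echo "accepted as compound" ;;
        '&'*|'?'*|'#'*) echo "not found" ;;
        *)              echo "unrecognized" ;;
    esac
}

# Hand-written sample lines of the kind ispell -a prints.
classify '+ VURDERINGSKRITERIUM'
classify '# xyzzyord 0'

# With the freshly built dictionary one could feed it real output
# (the first line printed by ispell -a is a version banner):
#   echo vurderingskriterier | ../../ispell -a -d norsk.hash | sed 1d
```
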
USING THE DICTIONARY

CHARACTER SETS

By default ispell assumes you use latin-1 encoding in your Norwegian
files.  To spell-check such a file you just say

ispell -d norsk mythesis.tex

In TeX you can use `{\aa}', `{\ae}', `{\o}', `\'e', `\'o' and `\^o' to
represent the special Norwegian characters.  If you do this, you have
to say

ispell -T plaintex -d norsk mythesis.tex

to spell-check a file.  The characters æøåéòô will not be recognized
then, so unfortunately you have to choose one standard.  If you use
`\aa{}' etc. instead, you should change the affix file or add a
similar entry in the affix file.

In a plain ASCII file `æ ø å' are sometimes represented as `ae oe aa'.
Use

ispell -T ascii -d norsk mythesis.tex

to spell-check such a file.

The iso246 encoding puts æøå after z in the collating sequence.
If you use this encoding, say

ispell -T iso246 -d norsk mythesis.tex

Does anybody use this??


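The four cases above can be collected in a small wrapper.  This is a
sketch, not something shipped with the dictionary: the function name
norsk_command and the keyword latin1 are invented for the example, and
the function only prints the command so that you can inspect it before
running it.

```shell
#!/bin/sh
# Hypothetical wrapper: print the ispell command line for a given
# character convention (latin1, plaintex, ascii or iso246) and file.
norsk_command() {
    case $1 in
        latin1)
            echo "ispell -d norsk $2" ;;
        plaintex|ascii|iso246)
            echo "ispell -T $1 -d norsk $2" ;;
        *)
            echo "unknown convention: $1" >&2
            return 1 ;;
    esac
}

norsk_command plaintex mythesis.tex
```
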
COMPOUND WORDS

The use of compound words is what makes it both fun and difficult to
produce a good and secure ispell dictionary and to make hyphenation
patterns for TeX.

Ispell has two very important switches, -B and -C, controlling whether
ispell accepts words formed by a root and another word as correct.  If
the -C flag is given, ispell will accept words such as
`avdelingsbestyrerstilling', which is right, but also words such as
`premierene' (premie-rene), which is wrong.  It is *not recommended*
to use the -C option with the Norwegian dictionary, since far too many
incorrect spellings will be accepted.

If you don't give the -B or -C flag, ispell will accept compound words
formed by a small subset of the words in the dictionary. The subset
depends on the configuration variables in the Makefile. This is called
controlled compoundwords mode.  It is even safer to give the -B
option, such that only words in the dictionary are regarded as
correct.  I would do that if I had written something important.

The hyphenation patterns for TeX are only tested on words in the
dictionary, so these patterns might fail on compound words accepted in
controlled compoundwords mode.  If you want to be absolutely certain
that there will be no bad hyphens in your document, you have to use
the -B switch.  See `HYPHENATION IN TEX' below.


FIGHTING `ORD DELINGS SYNDROMET'

Most spell checkers, including ispell, suggest splitting compound words
that they don't find in their dictionaries.  If people follow these
suggestions blindly, the result is disaster: they get spelling errors
in the actual document and, even worse, they think they have learned
the correct spelling! (arkitekt tegnet hus i Holmenkoll åsen...)

I have done two things to fight this.  Ispell suggests `"-' in
addition to `-' and ` ' for compound words, which tells TeX that here
is a compound point and makes the spell-check skip the word next time.

The second thing is more important.  The script inorsk-maybecompound
searches a document (or standard input) for two and three words
following each other that can be written in one word, hyphenates them
using TeX and prints the compound words to standard output.  By
hyphenating one avoids words like sommer (som mer), forlenge (for
lenge) etc.  Use it!


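The first step of such a search can be illustrated with a few lines of
shell.  This is not the inorsk-maybecompound script from the
distribution, only a sketch under the assumption that words are
separated by white space; the function name joined_pairs is made up.
It prints every pair of neighbouring words written as one word.  The
candidates could then be checked against the dictionary, for instance
with ispell -l -B -d norsk, which lists the ones that are not
dictionary words.

```shell
#!/bin/sh
# Sketch only; joined_pairs is a hypothetical name.
# Reads text on standard input and prints each pair of adjacent words
# (within one line) joined into a single candidate compound word.
joined_pairs() {
    awk '{ for (i = 1; i < NF; i++) print $i $(i + 1) }'
}

echo "som mer tid" | joined_pairs
```

For the input above the candidates are sommer and mertid.  The first
one is exactly the kind of false alarm mentioned in the text: som mer
is correctly written as two words even though sommer is a word.
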
EMACS

The version of `ispell.el' distributed with emacs-19.34 does not
support Norwegian.  I suggest you get the latest ispell.el from
ftp://kdstevens.com/pub/stevens/ispell.el.gz.  Good versions are also
found in emacs-20.[4567].

So make sure that your version of ispell.el uses the variable
ispell-local-dictionary-alist, and put a suitable subset of the
following in your .emacs file:

(setq
 ispell-local-dictionary-alist
 '(("norsk"                             ; 8 bit Norwegian mode
    "[A-Za-z\305\306\307\310\311\322\323\324\330\345\346\347\350\351\362\363\364\370]"
    "[^A-Za-z\305\306\307\310\311\322\323\324\330\345\346\347\350\351\362\363\364\370]"
    "[\".,;:]" t ("-B" "-S" "-d" "norsk") "~list" iso-8859-1)
   ("norsk7-tex"                        ; 7 bit Norwegian TeX mode
    "[A-Za-z{}\\'^`@]" "[^A-Za-z{}\\'^`@]"
    "[\".,;:]" t ("-B" "-S" "-d" "norsk" "-T" "plaintex") "~plaintex" nil)
   ("norsk7-html"                       ; 7 bit Norwegian html mode
    "[A-Za-z\&;]" "[^A-Za-z\&;]"        ; Don't use ispell's html-parser
    "[.,:]" t ("-B" "-S" "-n" "-d" "norsk") "~html" iso-8859-1)
   ("norsk7-ascii"                      ; 7 bit Norwegian (aa, ae, oe)
    "[A-Za-z]" "[^A-Za-z]"
    "[\".,;:]" t ("-B" "-S" "-d" "norsk") "~ascii" iso-8859-1)
   ("norsk7-iso246" "[][A-Za-z{}|\\]" "[^][A-Za-z{}|\\]"
    "[\".,;:]" nil ("-B" "-S" "-d" "norsk")  "~iso246" iso-8859-1)
   ("norsk-comp"                        ; 8 bit Norwegian mode
    "[A-Za-z\305\306\307\310\311\322\323\324\330\345\346\347\350\351\362\363\364\370]"
    "[^A-Za-z\305\306\307\310\311\322\323\324\330\345\346\347\350\351\362\363\364\370]"
    "[\".,;:]" t ("-S" "-d" "norsk") "~list" iso-8859-1)
   ("norsk7-tex-comp"                   ; 7 bit Norwegian TeX mode
    "[A-Za-z{}\\'^`@]" "[^A-Za-z{}\\'^`@]"
    "[\".,;:]" t ("-S" "-d" "norsk" "-T" "plaintex") "~plaintex" nil)
   ("norsk7-html-comp"                  ; 7 bit Norwegian html mode
    "[A-Za-z\&;]" "[^A-Za-z\&;]"        ; Don't use ispell's html-parser
    "[.,:]" t ("-S" "-n" "-d" "norsk") "~html" iso-8859-1)
   ("norsk7-ascii-comp"                 ; 7 bit Norwegian (aa, ae, oe)
    "[A-Za-z]" "[^A-Za-z]"
    "[\".,;:]" t ("-S" "-d" "norsk") "~ascii" iso-8859-1)
   ("norsk7-iso246-comp" "[][A-Za-z{}|\\]" "[^][A-Za-z{}|\\]"
    "[\".,;:]" nil ("-S" "-d" "norsk")  "~iso246" iso-8859-1)
   ("nynorsk"                           ; 8 bit Norwegian mode
    "[A-Za-z\305\306\307\310\311\322\323\324\330\345\346\347\350\351\362\363\364\370]"
    "[^A-Za-z\305\306\307\310\311\322\323\324\330\345\346\347\350\351\362\363\364\370]"
    "[\".,;:]" t ("-B" "-S" "-d" "nynorsk") "~list" iso-8859-1)
   ("nynorsk7-tex"                      ; 7 bit Norwegian TeX mode
    "[A-Za-z{}\\'^`@]" "[^A-Za-z{}\\'^`@]"
    "[\".,;:]" t ("-B" "-S" "-d" "nynorsk" "-T" "plaintex") "~plaintex" nil)
   ("nynorsk7-html"                     ; 7 bit Norwegian html mode
    "[A-Za-z\&;]" "[^A-Za-z\&;]"        ; Don't use ispell's html-parser
    "[.,:]" t ("-B" "-S" "-n" "-d" "nynorsk") "~html" iso-8859-1)
   ("nynorsk7-ascii"                    ; 7 bit Norwegian (aa, ae, oe)
    "[A-Za-z]" "[^A-Za-z]"
    "[\".,;:]" t ("-B" "-S" "-d" "nynorsk") "~ascii" iso-8859-1)
   ("nynorsk7-iso246" "[][A-Za-z{}|\\]" "[^][A-Za-z{}|\\]"
    "[\".,;:]" nil ("-B" "-S" "-d" "nynorsk")  "~iso246" iso-8859-1)
   ("nynorsk-comp"                      ; 8 bit Norwegian mode
    "[A-Za-z\305\306\307\310\311\322\323\324\330\345\346\347\350\351\362\363\364\370]"
    "[^A-Za-z\305\306\307\310\311\322\323\324\330\345\346\347\350\351\362\363\364\370]"
    "[\".,;:]" t ("-S" "-d" "nynorsk") "~list" iso-8859-1)
   ("nynorsk7-tex-comp"                 ; 7 bit Norwegian TeX mode
    "[A-Za-z{}\\'^`@]" "[^A-Za-z{}\\'^`@]"
    "[\".,;:]" t ("-S" "-d" "nynorsk" "-T" "plaintex") "~plaintex" nil)
   ("nynorsk7-html-comp"                ; 7 bit Norwegian html mode
    "[A-Za-z\&;]" "[^A-Za-z\&;]"        ; Don't use ispell's html-parser
    "[.,:]" t ("-S" "-n" "-d" "nynorsk") "~html" iso-8859-1)
   ("nynorsk7-ascii-comp"               ; 7 bit Norwegian (aa, ae, oe)
    "[A-Za-z]" "[^A-Za-z]"
    "[\".,;:]" t ("-S" "-d" "nynorsk") "~ascii" iso-8859-1)
   ("nynorsk7-iso246-comp" "[][A-Za-z{}|\\]" "[^][A-Za-z{}|\\]"
    "[\".,;:]" nil ("-S" "-d" "nynorsk")  "~iso246" iso-8859-1)
   ))

(load-library "ispell")

The above is very unpretty indeed.  It is basically four copies of the
same list.  If you come up with something better, please let me know.
I am a terrible lisp programmer!

As you see there are a lot of entries.  The -comp entries put ispell
in controlled compoundwords mode, which is nice for a quick
spell-check.  I recommend deleting the entries you don't plan to use.
I like to use the -S switch, i.e. not to sort the suggestions made by
ispell.  Then it is more likely that the correct suggestion will be
early in the list.

In the future I hope that ispell will be able to sort the suggestions
it makes by commonness, at least for the most common words.  That
should not be too difficult to implement.  Just load the most common
words and their frequency indicator into memory, and do the necessary
lookups.  Or use the external look program.  Suggestions and
implementations are most welcome!

There is also a file flyspell.el around.  This also offers
spell-checking on the fly, and the interface is more like m$-word.
Flyspell-mode highlights incorrect words, and you can even click on
them to get suggestions for correct spelling.  Being able to sort on
commonness would make flyspell's auto-correction mode much more
useful!


USING ISPELL IN BATCH MODE

I find ispell's batch mode very useful.  The command

cat myfile.tex | ispell -l -d norsk | sort | uniq -c | sort -n -r -s

prints all words in myfile.tex that are not in the Norwegian
dictionary, with the most common words first.  This is nice for
spotting errors, or as a starting point for a local dictionary.


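To see what the counting part of that pipeline does without running
ispell, you can feed it a hand-made word list.  The function name
rank_words and the sample words are invented for this illustration;
the words stand in for what ispell -l would print.  Only the standard
sort and uniq programs are used.

```shell
#!/bin/sh
# The tail of the batch pipeline: count duplicate lines and list the
# most frequent ones first.  The input normally comes from
# "ispell -l -d norsk".
rank_words() {
    sort | uniq -c | sort -n -r -s
}

# Invented sample input standing in for the output of ispell -l.
printf '%s\n' konfliktakse foo konfliktakse bar konfliktakse foo | rank_words
```

Each output line holds a count followed by the word, so konfliktakse
(three occurrences) is listed before foo (two) and bar (one).
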
HYPHENATION IN TEX

Two sets of hyphenation patterns for the Norwegian language are
provided.  The file norskb.tex hyphenates almost as TeX used to, and
the file nohyphbc.tex only splits compound words.

It is fairly easy to install the nohyphb.tex file.  Just put it where
TeX can find it, edit the file language.dat to point to the correct
file, and remake the formats.  If you use teTeX you just say texconfig
init.

If you want to install both sets of patterns, you have a TeX capacity
problem.  The variable ssup_trie_size needs to be bigger than 65535
and trie_op_size bigger than 1501.  I use 262143 and 3501.  So you
need to change tex.ch (and omega.ch) and recompile TeX.  If you are
using teTeX that should be quite easy.  Here is a patch:

*** tex.ch~     Fri Jan 21 23:13:24 2000
--- tex.ch      Mon Jul 10 18:46:15 2000
***************
*** 196 ****
! @d ssup_trie_size == 65535
--- 196 ----
! @d ssup_trie_size == 262143
***************
*** 215 ****
! @!trie_op_size=1501; {space for ``opcodes'' in the hyphenation patterns;
--- 215 ----
! @!trie_op_size=3501; {space for ``opcodes'' in the hyphenation patterns;
***************
*** 217 ****
! @!neg_trie_op_size=-1501; {for lower |trie_op_hash| array bound;
--- 217 ----
! @!neg_trie_op_size=-3501; {for lower |trie_op_hash| array bound;

*** omega.ch~  Thu Jul 13 11:37:08 2000
--- omega.ch   Sun Jul 23 20:38:03 2000
***************
*** 125,127 ****
  @d ssup_trie_opcode == 65535
! @d ssup_trie_size == 100000
  
--- 125,127 ----
  @d ssup_trie_opcode == 65535
! @d ssup_trie_size == 262143
  
***************
*** 139,143 ****
    {Use |hash_offset=0| for compilers which cannot decrement pointers.}
! @!trie_op_size=1501; {space for ``opcodes'' in the hyphenation patterns;
    best if relatively prime to 313, 361, and 1009.}
! @!neg_trie_op_size=-1501; {for lower |trie_op_hash| array bound;
    must be equal to |-trie_op_size|.}
--- 139,143 ----
    {Use |hash_offset=0| for compilers which cannot decrement pointers.}
! @!trie_op_size=3501; {space for ``opcodes'' in the hyphenation patterns;
    best if relatively prime to 313, 361, and 1009.}
! @!neg_trie_op_size=-3501; {for lower |trie_op_hash| array bound;
    must be equal to |-trie_op_size|.}


The easiest way to use the norskbc patterns is to define the macros

\def\goodhyphens{\lefthyphenmin2\righthyphenmin2\language=\l@norskc}
\def\allhyphens{\lefthyphenmin1\righthyphenmin2\language=\l@norsk}

and switch whenever you want to.  A better solution might be to define
norskc as another language in the Babel system and use the Babel
language switching system.


MAKING IT PERFECT

So you have installed these great new patterns.  But TeX still might
fail on Norwegian words not in the dictionary, so if you don't feel
particularly lucky you will have to do something about that too.

There are two strategies.  I tend to prefer the second one.

1. Mark the compound point in the compound word with "-, e.g.
   administrasjons"-sjef"-stillings"-søker.  If you have patched
   ispell, you can do this during spell-checking most of the time.

2. Use the script inorsk-hyphenmaybe to print every word in your
   document not in the dictionary (nynorsk and bokmål) hyphenated by
   TeX.  Then you can easily browse through this list and put the
   badly hyphenated words in a \hyphenation command.  The next time
   you run the script it should produce correct hyphenation.

   For example if inorsk-hyphenmaybe outputs `kon-flik-t-akse' and
   `kon-flik-t-ak-sen' you have to say
   \hyphenation{kon-flikt-akse kon-flikt-ak-sen} in your TeX document.

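Once the list has been corrected by hand, wrapping it in a
\hyphenation command is mechanical.  A small sketch, assuming the
corrected words arrive one per line on standard input; the function
name hyphenation_command is made up and nothing like it is shipped
with the package.

```shell
#!/bin/sh
# Hypothetical helper: read hyphenated words (one per line) on
# standard input and print a single \hyphenation{...} command for TeX.
# Empty lines are skipped.
hyphenation_command() {
    awk 'BEGIN { printf "\\hyphenation{" }
         NF    { printf "%s%s", sep, $1; sep = " " }
         END   { print "}" }'
}

printf '%s\n' kon-flikt-akse kon-flikt-ak-sen | hyphenation_command
```

The printed line can be pasted into the preamble of the TeX document.
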
But we are not done with hyphenation yet.  Have you ever considered
the problem of hyphenating the word `villede' in TeX?  Of course you
have.  The hyphenation should be `vill-lede', thus an extra `l' should
be added.

Most languages which have such hyphenation (in particular German, with
ss) support this in Babel.  The convention is that you code villede as
vi"llede.  Of course the Norwegian dictionary supports this. Babel-3.7
will also support this for Norwegian.  Until then you can use the file
norsk.cfg to get this functionality (and some special hyphen points in
addition).  The file itself offers more information.


THE FUTURE OF HYPHENATION IN TEX

In standard TeX today it is not possible to say that one hyphen point
is better than another, e.g. I like barnehage-assistent better than
barne-hageassistent.  In the future TeX will be able to handle
multiple classes of hyphens and different penalties can be assigned to
each class.  Mathias Clasen has implemented this as a change file,
but it has not made it into the standard distributions yet.  The stuff
at the end of the patterns/Makefile is about generating hyphenation
patterns for such a TeX.


LET'S MAKE THE DICTIONARY EVEN BETTER!

In the future I would like to add more word categories to the
dictionary.  If you have a lot of text from within one field of
knowledge, and would like to help, you can start by saying

cat allmytextfiles | inorsk-hyphenmaybe -e -p norskbc > mywords

You should install the hyphenation patterns norskbc for Norwegian to
get hyphenation only at compound points, and of course the full
dictionary with no words filtered out.

You will probably spot some new words, some of your own spelling
errors and some hyphenation errors.  Fix that file, add flags defined
in the affix file etc.

Next you have to learn to use the munchlist program.  Suppose the
file mywords contains the words

gjennom-strømnings-mekanisme
gjennom-strømnings-mekanismen
gjennom-strømnings-mekanismens
gjennom-strømnings-mekanismer
gjennom-strømnings-mekanismene

If you run

cat mywords \
  | tr '-' 'Î' \
  | munchlist -v -l norsk.aff.munch \
  | tr 'Î' '-'

the output should be

gjennom-strømnings-mekanisme/AEG

which represents these five words.  (Of course this only works if
ispell and munchlist are correctly installed.)

Here is some elisp stuff I have used (provided as is, probably very badly coded):

;; Expand the affix flags of the words in the region.
(defun ispell-expand-affixes ()
  (interactive)
  (shell-command-on-region
   (mark) (point)
   "sed -e \"s/[-0-9 \t:]//g\" | ispell -e -d norsk"))

;; Collect the words in the region into roots with affix flags.
(defun ispell-collect-affixes ()
  (interactive)
  (shell-command
   (concat
    "echo \"" (buffer-substring-no-properties (mark) (point))
    "\" | sed -e \"s/-/î/g\" -e \"s/[0-9 \t:]//g\" | "
    "munchlist -l norsk.aff.munch | sed -e \"s/î/-/g\" &")))

;; Expand the affix flags of the word on the current line.
(defun ispell-expand-line ()
  (interactive)
  (save-excursion
    (beginning-of-line)
    (let ((beg (point)))
      (end-of-line)
      (shell-command-on-region
       beg (point)
       "sed -e \"s/[-0-9 \t:]//g\" | ispell -d norsk -e"))))

; We have to quote the `' characters to protect them from shell
; expansion.

;; Return the text before the first space on the current line, with
;; quote characters escaped and digits, blanks, colons, dots and stars
;; removed.
(defun current-line ()
  (save-excursion
    (beginning-of-line)
    (let ((beg (point)))
      (end-of-line)
      (let ((myvar (buffer-substring-no-properties beg (point))))
        (while (string-match " .*" myvar)
          (setq myvar (replace-match "" nil nil myvar)))
        (while (string-match "\\([^\\]\\)\\([`'\"]\\|\\\\$\\)" myvar)
          (setq myvar (replace-match "\\1\\\\\\2" nil nil myvar)))
        (while (string-match "[0-9 \t:.*]" myvar)
          (setq myvar (replace-match "" nil nil myvar)))
        myvar))))

;; Return the region with quote characters escaped and digits and
;; blanks removed.
(defun current-region ()
  (let ((myvar (buffer-substring-no-properties (mark) (point))))
    (while (string-match "\\([^\\]\\)\\([`'\"]\\|\\\\$\\)" myvar)
      (setq myvar (replace-match "\\1\\\\\\2" nil nil myvar)))
    (while (string-match "[0-9 \t]" myvar)
      (setq myvar (replace-match "" nil nil myvar)))
    myvar))