Suomi-malaga - Voikko edition
=============================

General information
-------------------

Suomi-malaga is a description of Finnish morphology written in Malaga
(http://home.arcor.de/bjoern-beutel/malaga/). You should use malaga
version 7.8 or later.

Currently Suomi-malaga is used in two different applications: text
indexer Sukija and spellchecker/hyphenator Voikko. Version 1.0 and
later will work with both applications. This release creates
Voikko morphology with version 2 dictionary format.

All of the documentation about Finnish morphology is in Finnish (see
README.fi and subdirectory doc). This README contains only build
and usage instructions for distribution packagers.


Build and installation
----------------------

Building Suomi-malaga from this package requires malaga, python
and make. No configuration is required: to build the code for Voikko,
you only need to run
  make voikko
Installation can be done by running
  make voikko-install DESTDIR=/usr/lib/voikko
(Replace /usr/lib/voikko with the directory you want to install the
files to. Installing to ~/.voikko will cause libvoikko to use this
version of Suomi-malaga only for the user who does the installation.)
Building the code for Sukija can be done by running
  make sukija

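Before running make it can help to confirm that the required tools are
on PATH. The following Python sketch is only an illustration and is not
part of this package. The helper missing_tools is hypothetical, and the
executable names malaga, python and make are assumptions (a distribution
may install them under other names, for example python3). The sketch
does not check the malaga version.

```python
import shutil


def missing_tools(tools, which=shutil.which):
    """Return the names in `tools` that `which` cannot find on PATH."""
    return [name for name in tools if which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them for your distribution.
    required = ["malaga", "python", "make"]
    missing = missing_tools(required)
    if missing:
        print("Missing build tools: " + ", ".join(missing))
    else:
        print("All build tools found.")
```

The lookup function is a parameter so that the check can be exercised
without touching the real PATH.
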
Supported Make targets
----------------------

- sukija
    Builds the binary files needed by text indexer Sukija.
- voikko
    Builds the binary files needed by libvoikko.
- voikko-sukija
    Builds the binary files needed by libvoikko with a big
    dictionary that can be used by Sukija.
- voikko-install DESTDIR=/usr/lib/voikko
    Installs the binary files needed by libvoikko to the directory
    specified by DESTDIR. DESTDIR is optional and defaults to
    /usr/lib/voikko.
- dist-gzip
    Builds the full source package.
- clean
    Removes all files generated by other targets.
- update-vocabulary
    Updates the XML vocabulary from the nightly snapshot at
    joukahainen.puimula.org. This target requires wget to
    be available.
- TAGS
    Builds an Emacs tag table file from the vocabulary database.


Variables for tuning the build process
--------------------------------------

- make voikko:
  * VOIKKO_BUILDDIR=path/to/directory
    Specifies the directory where build files are written while building
    for Voikko.
    Default: voikko (build within source directory)
  * GENLEX_OPTS="--option1=xxx --option2=yyy ..."
    Sets the options string for generate_lex.py.
    Available options for generate_lex.py are
    + --min-frequency=n
      Limits the words to be included in the .lex files to the
      specified or higher frequency class. Default is 9.
    + --extra-usage=usage1,usage2,...
      If a word has usage flags (it belongs to a special vocabulary), it is
      included in the vocabulary only if at least one of the usage flags is
      listed here. Available usage flags are listed in file
      vocabulary/flags.txt.
      Listing "sukija" here causes application-specific exclusions to be
      ignored (words marked with not_voikko will also be included).
      By default, no special vocabularies are included.
    + --style=style1,style2,...
      If a word has style flags (such as old, foreign or dialect), it is
      included in the vocabulary only if all of the style flags are listed
      here. Available style flags are listed in file vocabulary/flags.txt.
      Default: old,international,inappropriate
    + --sourceid
      Inserts word identifiers from Joukahainen into the lexicon and returns
      them during morphological analysis. This option has no effect unless
      VOIKKO_DEBUG=yes is set. By default source ids are not preserved.
  * EXTRA_LEX="path/to/file1.lex path/to/file2.lex ..."
    Adds extra malaga lexicon files to the vocabulary. By default, no extra
    lexicons are added.
  * VANHAHKOT_MUODOT=yes|no
    See voikko/doc/liput.txt. Default: yes
  * VANHAT_MUODOT=yes|no
    See voikko/doc/liput.txt. Default: no
  * SUKIJAN_MUODOT=yes|no
    Include words that exist just for Sukija. Default: no
  * VOIKKO_DEBUG=yes|no
    Include information that is not needed by libvoikko but may be needed
    for debugging or by external applications (full morphological analysis).
    Default: no
  * VOIKKO_VARIANT=variant
    Set the short name for the language variant of this vocabulary. The
    name should match the regular expression [a-z][a-z0-9_]*
    Default: standard
  * VOIKKO_DESCRIPTION="Description of the vocabulary"
    Set the long description for the language variant of this vocabulary.
  * SM_PATCHINFO="Information about applied patches"
    If you have modified the source code or are distributing prerelease
    versions, describe any modifications made to the released version here.
    It may be best to change this directly in the Makefile.

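The --extra-usage and --style options differ in one detail that is easy
to miss: a word with usage flags needs at least one of them listed,
while a word with style flags needs all of them listed. The Python
sketch below restates that rule as described in this README. It is not
the code of generate_lex.py. The function is_included and the usage flag
name example_usage are hypothetical, and the sketch ignores both
--min-frequency and the special handling of "sukija".

```python
DEFAULT_STYLES = frozenset({"old", "international", "inappropriate"})


def is_included(usage_flags, style_flags,
                extra_usage=frozenset(), styles=DEFAULT_STYLES):
    """Apply the --extra-usage and --style rules as worded in the README."""
    usage_flags = set(usage_flags)
    style_flags = set(style_flags)
    # Usage flags: at least one of them must be listed in --extra-usage.
    if usage_flags and not (usage_flags & set(extra_usage)):
        return False
    # Style flags: every one of them must be listed in --style.
    if not (style_flags <= set(styles)):
        return False
    return True


# A word flagged both "old" and "dialect" is dropped under the default
# --style value and kept once "dialect" is listed as well.
print(is_included([], ["old", "dialect"]))
print(is_included([], ["old", "dialect"],
                  styles=DEFAULT_STYLES | {"dialect"}))
```

Words that carry no flags at all pass both checks, which matches the
wording "if a word has usage flags" and "if a word has style flags".
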
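Packaging scripts that build several vocabulary variants may want to
assemble the make command line programmatically. The following Python
sketch shows one way to do that with the variables documented above. The
function voikko_make_command and the variant name dialects are
hypothetical, and the sketch assumes that the VOIKKO_VARIANT regular
expression is meant to match the whole name.

```python
import re

# Pattern given for VOIKKO_VARIANT; applied to the whole name (assumption).
VARIANT_RE = re.compile(r"[a-z][a-z0-9_]*")


def voikko_make_command(variant="standard", description=None,
                        builddir=None, genlex_opts=None):
    """Return an argv list for `make voikko` with the given variables."""
    if not VARIANT_RE.fullmatch(variant):
        raise ValueError("invalid VOIKKO_VARIANT: %r" % variant)
    argv = ["make", "voikko", "VOIKKO_VARIANT=" + variant]
    if description is not None:
        argv.append("VOIKKO_DESCRIPTION=" + description)
    if builddir is not None:
        argv.append("VOIKKO_BUILDDIR=" + builddir)
    if genlex_opts:
        # generate_lex.py options are passed as one space-separated string.
        argv.append("GENLEX_OPTS=" + " ".join(genlex_opts))
    return argv


print(voikko_make_command(
    "dialects",
    genlex_opts=["--style=old,international,inappropriate,dialect"]))
```

The result is a list, so it can be handed to subprocess.run(argv,
check=True) from the source directory without shell quoting of the
GENLEX_OPTS value.
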
Copyright and license information
---------------------------------

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2, or (at your option)
any later version. See file COPYING for details.

Copyright (©) 2006 - 2015 Hannu Väisänen (Email: Hannu.Vaisanen@uef.fi)
and 2006 - 2015 Harri Pitkänen (hatapitk@iki.fi). Contributors listed
in file CONTRIBUTORS hold copyrights to the vocabulary data.