README Swahili Myspell Dictionary
Release 1.1 2005-08-17

1. Intro

Myspell Swahili Dictionary - Compiled by Alberto Escudero-Pascual aep@it46.se
http://www.it46.se


2. Word list Sources

The wordlists have been compiled based on the following
resources:

- Dr. Jason M. Githeko (githeko at egerton.ac.ke)
Egerton University, Njoro, Kenya
http://www.egerton.ac.ke/ict/kiswa.php
(48340 words)

- Prof. D.P.B. Massamba, Prof. A.M. Khamisi et al.
TUKI English-Swahili Dictionary
(18327 words)

- Dr. Martin Benjamin et al. (swahili at yale.edu)
The Kamusi Project,
http://www.yale.edu/swahili/
(15418 words)

- Dr. Kevin P. Scannell (scannell at slu.edu)
Corpus building for minority languages
http://borel.slu.edu/crubadan/
(+8008 words)

Total words: 67901

In addition, the programming skills of the following people
contributed to the Jambo Spellchecker:
Dwayne Bailey, Louise Berthilson, Iñaki Cívico Campos, Alberto
Escudero-Pascual and Fredrik Lilieblad.

3. Licence

The Jambo Spellchecker is released as free software (LGPL).

4. Final Notes

- Kamusi Project wordlist:

The Kamusi Project is an ongoing work of collaborative
scholarship that is developing a free online dictionary and
learning resources for Swahili. Established in 1994, it is the
world's most-used resource for the Swahili language, and the
first result for "Swahili" delivered by most Internet search
engines; see http://www.yale.edu/swahili/ for more
information.

- An Crúbadán:

The Swahili word list was improved with the help of Kevin
Scannell's software An Crúbadán, a web crawler that targets
minority languages and languages with limited computational
resources.

In December 2004, the web crawler searched more than 6600 online
Swahili documents and collected about 10 million (non-unique)
words.

The goal of An Crúbadán is to develop language technology
for as many languages as possible by applying statistical
techniques to the vast quantities of text freely available on
the web. Text corpora have been created for nearly 200
languages so far, and these data are available for use by open
source projects; see http://borel.slu.edu/crubadan/ for more
information.

5. TODO

Work on the .aff file