- Project name

marisa-trie
http://code.google.com/p/marisa-trie/

- Project summary

MARISA: Matching Algorithm with Recursively Implemented StorAge

- Version

0.2.4

- Description

*Matching Algorithm with Recursively Implemented !StorAge (MARISA)* is a static, space-efficient trie data structure. *libmarisa* is a C++ library that implements MARISA. The *libmarisa* package also includes a set of command-line tools for building and operating on MARISA-based dictionaries.

A MARISA-based dictionary supports not only lookup but also reverse lookup, common prefix search, and predictive search.

* Lookup checks whether a given string exists in the dictionary.
* Reverse lookup restores a key from its ID.
* Common prefix search finds the keys that are prefixes of a given string.
* Predictive search finds the keys that start with a given string.
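
The four operations can be illustrated with a minimal, self-contained C++ sketch. It does not use libmarisa or a trie at all: `ToyDictionary` and its member functions are hypothetical names, the keys live in a sorted `std::vector`, and a key's ID is assumed to be its index in sorted order, which need not match how libmarisa assigns IDs. The sketch only models what each operation returns, not how MARISA stores keys.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a static dictionary (NOT libmarisa's API; names are hypothetical).
// Keys are sorted and de-duplicated, and a key's ID is its index in that order.
class ToyDictionary {
 public:
  explicit ToyDictionary(std::vector<std::string> keys) : keys_(std::move(keys)) {
    std::sort(keys_.begin(), keys_.end());
    keys_.erase(std::unique(keys_.begin(), keys_.end()), keys_.end());
  }

  // Lookup: report whether `query` is a key; optionally return its ID.
  bool lookup(const std::string& query, std::size_t* id = nullptr) const {
    auto it = std::lower_bound(keys_.begin(), keys_.end(), query);
    if (it == keys_.end() || *it != query) return false;
    if (id != nullptr) *id = static_cast<std::size_t>(it - keys_.begin());
    return true;
  }

  // Reverse lookup: restore the key that owns `id` (throws if out of range).
  const std::string& reverse_lookup(std::size_t id) const { return keys_.at(id); }

  // Common prefix search: every key that is a non-empty prefix of `query`,
  // shortest first.
  std::vector<std::string> common_prefix_search(const std::string& query) const {
    std::vector<std::string> found;
    for (std::size_t len = 1; len <= query.size(); ++len) {
      const std::string prefix = query.substr(0, len);
      if (lookup(prefix)) found.push_back(prefix);
    }
    return found;
  }

  // Predictive search: every key that starts with `query`. In sorted order
  // these keys form one contiguous run beginning at lower_bound(query).
  std::vector<std::string> predictive_search(const std::string& query) const {
    std::vector<std::string> found;
    auto it = std::lower_bound(keys_.begin(), keys_.end(), query);
    for (; it != keys_.end() && it->compare(0, query.size(), query) == 0; ++it) {
      found.push_back(*it);
    }
    return found;
  }

 private:
  std::vector<std::string> keys_;
};
```

With the keys `a`, `app`, `apple`, `application` and `banana`, common prefix search for `apples` yields `a`, `app` and `apple`, while predictive search for `app` yields `app`, `apple` and `application`.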

The biggest advantage of *libmarisa* is that its dictionaries are considerably more compact than those of other implementations. The comparison below shows the dictionary sizes that different implementations produce for the same input.

* Input
  * Source: enwiki-20121101-all-titles-in-ns0.gz
  * Contents: all page titles of English Wikipedia (Nov. 2012)
  * Number of keys: 9,805,576
  * Total size: 200,435,403 bytes (plain) / 54,933,690 bytes (gzipped)

|| *Implementation* || *Size (bytes)* || *Remarks* ||
|| darts-clone || 376,613,888 || Compacted double-array trie ||
|| tx-trie || 127,727,058 || LOUDS-based trie ||
|| *marisa-trie* || 50,753,560 || MARISA trie ||
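
To put the table in proportion, the sizes can be compared directly. The sketch below uses only the figures listed above; the constant and function names are hypothetical. By this arithmetic the marisa-trie dictionary comes to roughly a quarter of the plain input, slightly less than the gzipped input, and about 5.2 bytes per key.

```cpp
#include <cstdint>
#include <cstdio>

// Figures copied from the input description and size table (all sizes in bytes).
const std::uint64_t kNumKeys = 9805576;
const std::uint64_t kPlainBytes = 200435403;
const std::uint64_t kGzipBytes = 54933690;
const std::uint64_t kDartsClone = 376613888;
const std::uint64_t kTxTrie = 127727058;
const std::uint64_t kMarisaTrie = 50753560;

// Ratio of two counts as a floating-point number.
double size_ratio(std::uint64_t numerator, std::uint64_t denominator) {
  return static_cast<double>(numerator) / static_cast<double>(denominator);
}

// Prints each dictionary's size relative to marisa-trie, relative to the
// plain input, and as an average number of bytes per key.
void print_size_comparison() {
  struct Row {
    const char* name;
    std::uint64_t bytes;
  };
  const Row rows[] = {
      {"darts-clone", kDartsClone},
      {"tx-trie", kTxTrie},
      {"marisa-trie", kMarisaTrie},
  };
  for (const Row& row : rows) {
    std::printf("%-12s %5.2fx marisa-trie, %6.1f%% of plain input, %5.2f bytes/key\n",
                row.name,
                size_ratio(row.bytes, kMarisaTrie),
                100.0 * size_ratio(row.bytes, kPlainBytes),
                size_ratio(row.bytes, kNumKeys));
  }
}
```

Calling `print_size_comparison()` prints one line per implementation, which makes the gap between the double-array, LOUDS-based, and MARISA representations easier to read than the raw byte counts.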

* Documentation
  * marisa-0.2.4
    * [http://marisa-trie.googlecode.com/svn/trunk/docs/readme.en.html README (English)]
    * [http://marisa-trie.googlecode.com/svn/trunk/docs/readme.ja.html README (Japanese)]
  * marisa-0.1.5 (Japanese)
    * HowTo
    * ListOfTools
    * LibraryInterface
    * BenchmarkResults

- Version control system

Subversion

- Source code license

The BSD 2-clause License
The LGPL 2.1 or any later version

- Project labels

Patricia
Trie
Static
Dictionary
CPlusPlus