| - Project name |
| |
| marisa-trie |
| http://code.google.com/p/marisa-trie/ |
| |
| - Project summary |
| |
| MARISA: Matching Algorithm with Recursively Implemented StorAge |
| |
| - Version |
| |
| 0.2.4 |
| |
| - Description |
| |
| *Matching Algorithm with Recursively Implemented !StorAge (MARISA)* is a static and space-efficient trie data structure. And *libmarisa* is a C++ library to provide an implementation of MARISA. Also, the package of *libmarisa* contains a set of command line tools for building and operating a MARISA-based dictionary. |
| |
| A MARISA-based dictionary supports not only lookup but also reverse lookup, common prefix search and predictive search. |
| |
| * Lookup is to check whether or not a given string exists in a dictionary. |
| * Reverse lookup is to restore a key from its ID. |
| * Common prefix search is to find keys from prefixes of a given string. |
| * Predictive search is to find keys starting with a given string. |
| |
| The biggest advantage of *libmarisa* is that its dictionary size is considerably more compact than others. See below for the dictionary size of other implementations. |
| |
| * Input |
| * Source: enwiki-20121101-all-titles-in-ns0.gz |
| * Contents: all page titles of English Wikipedia (Nov. 2012) |
| * Number of keys: 9,805,576 |
| * Total size: 200,435,403 bytes (plain) / 54,933,690 bytes (gzipped) |
| |
| || *Implementation* || *Size (bytes)* || *Remarks* || |
| || darts-clone || 376,613,888 || Compacted double-array trie || |
| || tx-trie || 127,727,058 || LOUDS-based trie || |
| || *marisa-trie* || 50,753,560 || MARISA trie || |
| |
| * Documentation |
| * marisa-0.2.4 |
| * [http://marisa-trie.googlecode.com/svn/trunk/docs/readme.en.html README (English)] |
| * [http://marisa-trie.googlecode.com/svn/trunk/docs/readme.ja.html README (Japanese)] |
| * marisa-0.1.5 (Japanese) |
| * HowTo |
| * ListOfTools |
| * LibraryInterface |
| * BenchmarkResults |
| |
| - Version control system |
| |
| Subversion |
| |
| - Source code license |
| |
| The BSD 2-clause License |
| The LGPL 2.1 or any later version |
| |
| - Project labels |
| |
| Patricia |
| Trie |
| Static |
| Dictionary |
| CPlusPlus |