blob: 276ca3668fe5fd6d01afb78433c8e88c4a89a85b [file] [log] [blame]
- Project name
marisa-trie
http://code.google.com/p/marisa-trie/
- Project summary
Matching Algorithms with Recursively Implemented StorAge
- Version
0.1.4
- Description
This project *marisa-trie* provides a C++ library *libmarisa* and command line tools *`marisa-*`* for building and operating nesting patricia tries. The brand-new dictionary structure is designed to be static and space efficient. Also, *marisa-trie* enables not only simple lookups but also prefix searches and predictive searches.
* Prefix searches are to find keys from prefixes of a given query.
* Predictive searches are to find keys starting with a given query.
The biggest advantage of *marisa-trie* is that it can build a considerably compact dictionary. See below for the size of dictionaries built with various trie implementations.
* Input
* Source: enwiki-20110115-all-titles-in-ns0.gz
* Contents: all page titles of English Wikipedia (Jan. 2011)
* Number of keys: 8,279,325
* Total size: 167,223,035 bytes (plain) / 46,344,646 bytes (gzipped)
|| *Implementation* || *Size (bytes)* || *Remarks* ||
|| darts-clone || 316,065,792 || Compacted double-array trie ||
|| tx-trie || 107,119,864 || LOUDS-based trie ||
|| *marisa-trie* || *42,688,271* || Nesting patricia trie ||
* Documentation (in Japanese)
* HowTo
* ListOfTools
* LibraryInterface
* BenchmarkResults
- Version control system
Subversion
- Source code license
New BSD License
- Project labels
Nesting
Patricia
Trie
Static
Dictionary
CPlusPlus