Rambling Trie: A custom Trie data structure implementation in Ruby
February 10, 2012
This post was originally published in the Rambling Labs Blog on February 10, 2012.
We’re proud to announce that our first gem is here!
rambling-trie, and it’s an implementation of the Trie data structure. A Trie is an ordered tree data structure, and it is great for managing dictionaries and for word searching and matching.
Now, to install the
rambling-trie, you can either install it manually by running:
gem install rambling-trie
Or, you can add this line to the Gemfile of your application:
To create a new Trie you can do this:
require 'rambling-trie' trie = Rambling::Trie.new
And to add a new word to the Trie, you can call this:
Or to automatically load words from a file, you can do this:
trie = Rambling::Trie.new '/path/to/dictionary'
Additionally, there’s a
compress! method which use a redundant node technique to compress the Trie and that ends up saving a lot of memory.
Implementing this data structure has helped us to learn a lot about performance and memory boosting on Ruby. The first drafts (version 0.0.1 and 0.0.2) came to life last Saturday, when we needed an efficient way to store and traverse a dictionary to search for valid words. In this stage, a 175K word dictionary was taking about 200MB on ruby 1.9.2 and 220MB on ruby 1.9.3. A lot of memory if you ask me.
Version 0.1.0 was a simple API change, and then came version 0.2.0 with the memory and performance boosting. We removed some unnecessary instance variables and added the
compress! method and got to these results for the same 175K word dictionary: 134MB on both ruby 1.9.2 and ruby 1.9.3.
rambling-trie has been thoroughly developed with BDD using RSpec.
It’s on a fairly stable stage right now on version 0.3.0, so you can start using it right away! Enjoy!