Rambling Trie: A custom Trie data structure implementation in Ruby

February 10, 2012

This post was originally published in the Rambling Labs Blog on February 10, 2012.

We’re proud to announce that our first gem is here!

It’s called rambling-trie, and it’s an implementation of the Trie data structure. A Trie is an ordered tree data structure, and it is great for managing dictionaries and for word searching and matching.

Now, to install the rambling-trie, you can either install it manually by running:

gem install rambling-trie

Or, you can add this line to the Gemfile of your application:

gem 'rambling-trie'

To create a new Trie you can do this:

require 'rambling-trie'

trie = Rambling::Trie.new

And to add a new word to the Trie, you can call this:

trie.add_branch_from 'word'

Or to automatically load words from a file, you can do this:

trie = Rambling::Trie.new '/path/to/dictionary'

Additionally, there’s a compress! method which use a redundant node technique to compress the Trie and that ends up saving a lot of memory.

You can find more information on how to use it on the project’s readme on GitHub. If you find any bugs or want any features, go ahead an submit them to the project’s issues on GitHub.

Implementing this data structure has helped us to learn a lot about performance and memory boosting on Ruby. The first drafts (version 0.0.1 and 0.0.2) came to life last Saturday, when we needed an efficient way to store and traverse a dictionary to search for valid words. In this stage, a 175K word dictionary was taking about 200MB on ruby 1.9.2 and 220MB on ruby 1.9.3. A lot of memory if you ask me.

Version 0.1.0 was a simple API change, and then came version 0.2.0 with the memory and performance boosting. We removed some unnecessary instance variables and added the compress! method and got to these results for the same 175K word dictionary: 134MB on both ruby 1.9.2 and ruby 1.9.3.

The rambling-trie has been thoroughly developed with BDD using RSpec.

It’s on a fairly stable stage right now on version 0.3.0, so you can start using it right away! Enjoy!