Trie in Ruby

Firstly a “trie” is pronounced tree, or in some circles “try”.

It’s a particularly interesting data structure and one overlooked. To put it into context, it is a tree, and known by a few other names, digital tree, radix or as I like to consider it, prefix tree.

Definition:
It is an ordered tree structure that is used to store an associative array. An associative array in Ruby is simply a hash or {} or { ‘friend’ = ‘andre’ }. According to NIST it is a tree for storing strings in which there is one node for every common prefix.

The benefit of a Trie which will give you the right mental picture is that of a ‘router’ for an MVC web framework. A router applies regular expression style matching to find the correct endpoint to route the request too.

What Does A Trie Structure Do?

So let’s being by inserting the word ‘ace’. This will result in the following tree below.


Now this is our base – the “prefix”. Let’s see what the picture looks like if we insert another word, this time ‘acer’. Immediately you can see there is no duplication, but a clear recognition of an already existing prefix and then an insert which now results in a new leaf node “r”.

 

To push our understanding even further, consider if we had to add the word ‘aces’. What do you think the Trie would look like?

Excellent, as you can see it detected a prior node, and created a new leaf node on the right.
This is basically a Trie. One can immediately see the efficiency here, that is there are three words which consist of 11 characters, of which we only store 5.

Again as always the rspec tests with accompanying code.

Rspec

# Pronounced "tree"
require 'byebug'

describe Trees::Trie do
  before :each do
    @trie = Trees::Trie.new
  end

  context 'initialization' do
    it 'should initialize with a new TrieNode' do
      expect(@trie.root.class).to eq Trees::Trie::TrieNode
    end

    it 'should initialize with a property isLeaf' do
      expect(@trie.root.respond_to?(:is_leaf)).to eq true
    end

    it 'should intialize with a property children' do
      expect(@trie.root.respond_to?(:children)).to eq true
    end

    it 'isLeaf should be boolean' do
      expect(@trie.root.is_leaf).to eq false
    end

    it 'children should be a hash' do
      expect(@trie.root.children.class).to eq Hash
    end
  end

  context 'insertion' do
    before :each do
      @trie.insert('ace') # initial prefix
      @trie.insert('acer')
      @trie.insert('aces')
      @trie.insert('actor')
    end
    it 'should insert a word' do
      # (a)
      #  |
      # (c)
      #  |
      # (e)
      expect(@trie.root.children.keys[0]).to eq 'a'
      expect(@trie.root.children['a'].children.keys[0]).to eq 'c'
      expect(@trie.root.children['a'].children['c'].children.keys[0]).to eq 'e'
    end

    it 'should only append if word already contains prefix' do
      #   (a)
      #    |
      #   (c)
      #    |
      #   (e)
      #   /
      # (r)
      expect(@trie.root.children['a'].children['c'].children['e'].children.keys[0]).to eq 'r'
    end

    it 'should add a second key if there is an existing key at same height' do
      #   (a)
      #    |
      #   (c)
      #    |
      #   (e)
      #   / \
      # (r) (s)
      expect(@trie.root.children['a'].children['c'].children['e'].children.keys[1]).to eq 's'
    end

    it 'should add a new word only where the prefix matches' do
      #   (a)
      #    |
      #   (c)
      #    |      \
      #   (e)    (t)
      #   / \     |
      # (r) (s)  (o)
      #           |
      #          (r)
      expect(@trie.root.children['a'].children['c'].children['t'].children['o'].children.keys[0]).to eq 'r'
    end
  end
end

Code:

module Trees
  class Trie

    attr_accessor :root

    class TrieNode
      attr_accessor :is_leaf, :children

      def initialize
        @children = Hash.new
        @is_leaf = false
      end
    end

    def initialize
      @root = TrieNode.new
    end

    def insert(word)
      children = @root.children

      word.split('').each_with_index  do |char, index|
        trienode = TrieNode.new
        if children.has_key?(char)
          trienode = children[char]
        else
          trienode = TrieNode.new
          children[char] = trienode
        end

        children = trienode.children
        trienode.is_leaf = true if (index == (word.length - 1))
      end
    end
  end
end

 

All code can be found here.

References:

Leave a Reply

Your email address will not be published. Required fields are marked *