Trie as you Might
A trie is a data structure that makes it easy to search for words or prefixes in a corpus. The structure of the trie reduces the complexity of searching for a word or a prefix down from O(n)
where n
is the number of words in the dictionary to O(m)
where m
is the length of the query [1]. Pretty sweet, huh?
So when I ran across a leetcode question which needed a trie, I realized that I’d never actually written one before. Hence this blog post. Before I get into my implementation I want to give a huge shout out to Shubhadeep Roychowdhury who wrote an awesome blog post on the trie structure. I based a lot of my code on his implementation of the trie.
With that being said, I built a few additions to Roychowdhury’s trie. First of all there is a Trie
class itself which contains TrieNode
‘s. I also built out the __contains__
magic method so you can search within your trie using the in
keyword. So here we go:
class TrieNode: ''' Simple TrieNode, contains all we need for the larger Trie data structure. ''' def __init__(self, char: str): self.char = char self.children = [] self.word_complete = False self.counter = 1 class Trie: def __init__(self): self.root = TrieNode(None) def __contains__(self, word: str): ''' Allow us to use the `in` keyword with the Trie to see if a 'word' is `in` the Trie. ''' node = self.root for char in word: found_in_children = False for child in node.children: if child.char == char: found_in_children = True node = child break if not found_in_children: return False return node.word_complete is True def add(self, word: str): ''' Add a word to the Trie. ''' node = self.root for char in word: found_in_children = False for child in node.children: if child.char == char: child.counter += 1 node = child found_in_children = True break if not found_in_children: new_node = TrieNode(char) node.children.append(new_node) node = new_node node.word_complete = True def find_prefix(self, prefix: str) -> (bool, int): ''' Return True with the number of occurrances a given prefix occurs if the prefix is in the Trie. Otherwise return False. ''' node = self.root if not node.children: return False, 0 for char in prefix: char_not_found = True for child in node.children: if child.char == char: char_not_found = False node = child break if char_not_found: return False, 0 return True, node.counter if __name__ == '__main__': t = Trie() t.add(1) t.add('hackathon') t.add('hack') assert t.find_prefix('hac') == (True, 2) assert t.find_prefix('hack') == (True, 2) assert t.find_prefix('hackathon') == (True, 1) assert t.find_prefix('ha') == (True, 2) assert t.find_prefix('hammer') == (False, 0) assert ('hack' in t) is True assert ('hackathon' in t) is True assert ('hac' in t) is False print('ALL TESTS PASSED')
Side note, see those type hints?! Apparently in python 3.5 they added them as ways to help indicate the types that are expected (and returned) from functions. Python is still totally dynamically typed, but they serve as hints to the reader.
Anyway, the code is pretty self explanatory but if you have any questions please leave them in a comment below. Also, if you use this code note that it is not battle tested, so act accordingly.