lander/TRIE.md

701 B

Trie design

The underlying data structure is based on a combination of a ternary and a Patricia trie.

  • Nodes are classic ternary trie nodes, meaning each node contains a binary search tree
  • Each node can define a skip, like a Patricia trie, of at most 8 characters. These skipped characters are stored directly in the structs defining the nodes.
  • While the add function relies on the fact that the input is a NULL-terminated C string, the trie itself does not store any NULL bytes.

The goal of this datastructure is to be as optimized as possible for search operations with short (usually < 8 characters) keys, as this is by far the most common operation for a URL shortener/pastebin.