HeadlinesBriefing.com

IBM's Cassandra Architect: Trie Revolution

Hacker News

Branimir Lambov, a Cassandra committer at IBM, brings an unusual background to distributed databases. He holds a PhD in exact computation, previously worked in digital signal processing and NLP, and moved into database work over a decade ago. His most significant contribution is the trie implementation in Cassandra 5, which replaces the skip list in the memtable of the log-structured merge tree (LSM tree), improving memory usage and storage efficiency.

The trie project represents years of development, starting as a proof of concept for the advantages of byte-ordered keys in Cassandra. Working largely solo, Lambov developed the Big Trie-Indexed (BTI) SSTable format, which later became the default SSTable format in DataStax's DSE 6. His broader effort through CEP-57 aims to apply trie-based approaches throughout Cassandra's storage mechanisms, with the trie memtable in Cassandra 5 being the first result.
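To make the byte-order idea concrete, here is a minimal toy sketch of a trie keyed on raw bytes. It is not Cassandra's implementation; the names `TrieNode` and `ToyTrieMemtable` and the sample keys are hypothetical and chosen only for illustration. It shows two properties that a byte-ordered trie gives for free: keys with a common prefix share nodes, and a depth-first walk returns entries in sorted byte order without any key comparisons.

```python
from typing import Dict, Iterator, Optional, Tuple


class TrieNode:
    """One node per byte of a key (hypothetical toy structure)."""

    def __init__(self) -> None:
        self.children: Dict[int, "TrieNode"] = {}
        self.value: Optional[str] = None
        self.has_value = False


class ToyTrieMemtable:
    """Toy in-memory map from byte keys to values, backed by a trie."""

    def __init__(self) -> None:
        self.root = TrieNode()
        self.node_count = 1  # counts the root

    def put(self, key: bytes, value: str) -> None:
        node = self.root
        for b in key:
            child = node.children.get(b)
            if child is None:
                # Only bytes past the shared prefix allocate new nodes.
                child = TrieNode()
                node.children[b] = child
                self.node_count += 1
            node = child
        node.value = value
        node.has_value = True

    def get(self, key: bytes) -> Optional[str]:
        node = self.root
        for b in key:
            node = node.children.get(b)
            if node is None:
                return None
        return node.value if node.has_value else None

    def items(self) -> Iterator[Tuple[bytes, str]]:
        # Pre-order walk; children are pushed in descending byte order so
        # the smallest byte is popped first, giving sorted key order.
        stack = [(self.root, b"")]
        while stack:
            node, prefix = stack.pop()
            if node.has_value:
                yield prefix, node.value
            for b in sorted(node.children, reverse=True):
                stack.append((node.children[b], prefix + bytes([b])))


table = ToyTrieMemtable()
for k in [b"user:42", b"user:7", b"order:1", b"user:420"]:
    table.put(k, k.decode())

for key, value in table.items():
    print(key, value)
```

A production structure would need concurrency control, compact node encodings, and off-heap storage, none of which this sketch attempts.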

Despite Cassandra's reputation for stability, Lambov recounts an incident in which a feature he developed caused data loss after an assertion was disabled in production, a reminder of how hard complex distributed systems are to maintain. His work on compaction strategies and token allocation includes both successful innovations and lessons learned, showing that database development remains iterative even at established companies like IBM.