HeadlinesBriefing.com

Skiplists Solve BigQuery Tree Traversal Challenges

Hacker News

Skiplists are a randomized alternative to balanced binary search trees, offering the same ordered-set interface with comparable expected O(log n) cost for search and insertion. Developers at Antithesis found that these structures were more than an academic curiosity when they needed to traverse trees inside their testing platform. What began as an encounter with a niche data structure led to a practical solution to a problem in their analytic database.
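To make the data structure concrete, here is a minimal skiplist sketch in Python. It is an illustration only, not code from Antithesis: the class name `SkipList`, the `MAX_LEVEL` cap of 16, and the promotion probability of 0.5 are all assumptions chosen for the example.

```python
import random


class _Node:
    __slots__ = ("key", "forward")

    def __init__(self, key, level):
        self.key = key
        # forward[i] is the next node in the linked list at level i
        self.forward = [None] * level


class SkipList:
    MAX_LEVEL = 16  # assumed cap on tower height

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = _Node(None, self.MAX_LEVEL)  # sentinel, holds no key
        self._level = 1  # number of levels currently in use

    def _random_level(self):
        # Each extra level is kept with probability 1/2 (geometric heights).
        level = 1
        while level < self.MAX_LEVEL and self._rng.random() < 0.5:
            level += 1
        return level

    def _find_predecessors(self, key):
        # For every level, find the last node whose key is < key.
        update = [self._head] * self.MAX_LEVEL
        node = self._head
        for i in reversed(range(self._level)):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key):
        update = self._find_predecessors(key)
        nxt = update[0].forward[0]
        if nxt is not None and nxt.key == key:
            return  # key already present
        level = self._random_level()
        self._level = max(self._level, level)
        new = _Node(key, level)
        for i in range(level):
            # Splice the new node into the list at each of its levels.
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def __contains__(self, key):
        nxt = self._find_predecessors(key)[0].forward[0]
        return nxt is not None and nxt.key == key

    def __iter__(self):
        # Level 0 links every node in sorted order.
        node = self._head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]
```

A search starts at the sparsest level in use and drops down one level each time it would overshoot, which is what gives the expected logarithmic cost.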

At Antithesis, analyzing software bugs required navigating massive timeline trees stored in Google BigQuery. Because BigQuery is optimized for parallel scans, point lookups were prohibitively expensive: every tree traversal operation would scan the entire dataset. The conventional alternative, splitting the data across systems, would have introduced consistency challenges and complex transaction management that the team wanted to avoid.

The solution was a "skiptree": a hierarchy of tables in which each level holds roughly 50% of the nodes from the level below. With this structure, ancestor lookups can be expressed as a fixed sequence of SQL JOINs rather than recursive queries. Although the resulting SQL queries grew to kilobytes in size, the geometrically shrinking tables kept the amount of data scanned low. The approach powered test evaluation at Antithesis for six years, until the company developed its own analytic database.
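The idea can be sketched in plain Python, with dictionaries standing in for the level tables. This is a guess at the general shape, not the Antithesis schema: the row layout (`up`, `between`), the function names, and the `max_levels` parameter are all hypothetical. Each row stores the nearest ancestor that also appears in the next level up, plus the ancestors skipped on the way there, so a lookup makes at most one hop per level, which corresponds to one JOIN per level in SQL.

```python
import random


def build_skiptree(parent, max_levels=8, seed=0):
    """Build level tables from a tree given as {node: parent or None}.

    Table k maps node -> (up, between), where `up` is the nearest proper
    ancestor present in table k + 1 (or None) and `between` lists the
    ancestors passed over before reaching `up`.
    """
    rng = random.Random(seed)
    # A node with height h appears in tables 0 .. h - 1.
    height = {}
    for node in parent:
        h = 1
        while h < max_levels and rng.random() < 0.5:
            h += 1
        height[node] = h

    levels = []
    for k in range(max_levels):
        table = {}
        for node in parent:
            if height[node] <= k:
                continue  # node was not promoted to level k
            between = []
            cur = parent[node]
            # Walk up until we hit an ancestor that exists in table k + 1.
            while cur is not None and height[cur] <= k + 1:
                between.append(cur)
                cur = parent[cur]
            table[node] = (cur, between)
        levels.append(table)
    return levels


def ancestors(levels, node):
    """All proper ancestors of `node`, nearest first, one hop per level."""
    out = []
    for table in levels:  # fixed number of steps, like a fixed JOIN chain
        up, between = table[node]
        out.extend(between)
        if up is None:
            break
        out.append(up)
        node = up  # `up` is guaranteed to be in the next table
    return out


def naive_ancestors(parent, node):
    """Reference implementation: follow parent pointers one at a time."""
    out = []
    cur = parent[node]
    while cur is not None:
        out.append(cur)
        cur = parent[cur]
    return out


# A deep chain 0 <- 1 <- ... <- 999 as a stand-in for a long timeline.
parent = {i: (i - 1 if i > 0 else None) for i in range(1000)}
levels = build_skiptree(parent)
assert ancestors(levels, 999) == naive_ancestors(parent, 999)
```

Because each table keeps about half the rows of the one below it, the expected total row count across all levels stays under twice the number of nodes (n + n/2 + n/4 + ... < 2n), while the number of hops per lookup is bounded by the number of levels rather than by the depth of the tree.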