HeadlinesBriefing.com

Binary GCD Algorithm Outperforms Standard C++ by 2x

Hacker News

The binary GCD algorithm, rediscovered by Josef Stein in 1967, runs approximately 2x faster than the standard C++17 `std::gcd` implementation. Euclid's algorithm relies on modulo operations, which compile to integer division and can take tens of CPU cycles each. The binary variant uses only shifts, comparisons, and subtractions, operations that typically complete in a single cycle. This makes it particularly valuable in performance-critical code.
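For reference, the Euclidean baseline can be sketched as follows. This is a minimal illustration, not the actual `std::gcd` source; the function name `gcd_euclid` is hypothetical. It shows where the division cost comes from: every loop iteration executes one `%`.

```cpp
#include <cstdint>

// Iterative Euclid: each step replaces (a, b) with (b, a mod b).
// The % operator compiles to an integer division, which is the
// expensive operation the binary variant avoids.
// Note: gcd_euclid is an illustrative name, not a library function.
std::uint64_t gcd_euclid(std::uint64_t a, std::uint64_t b) {
    while (b != 0) {
        std::uint64_t r = a % b;
        a = b;
        b = r;
    }
    return a;
}
```

With this convention, `gcd_euclid(x, 0)` and `gcd_euclid(0, x)` both return `x`.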

Unlike the textbook-friendly recursive form of Euclid's algorithm, the binary GCD needs more complex branching logic to handle the parity cases of its inputs (both even, exactly one even, both odd). A straightforward implementation performs poorly because of this branching and runs slower than the standard library version. Two optimizations change that: pre-shifting the inputs to remove their common factors of two, and eliminating redundant parity checks.
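The parity cases described above can be sketched as a direct recursive translation. This is an assumed, unoptimized form meant only to show the branching; the name `gcd_binary_naive` is hypothetical and the code is not taken from the benchmarked implementation.

```cpp
#include <cstdint>

// Naive binary GCD: one branch per parity case.
// gcd_binary_naive is an illustrative name for this sketch.
std::uint64_t gcd_binary_naive(std::uint64_t a, std::uint64_t b) {
    if (a == 0) return b;
    if (b == 0) return a;

    const bool a_even = (a & 1) == 0;
    const bool b_even = (b & 1) == 0;

    // Both even: 2 is a common factor, so pull it out.
    if (a_even && b_even) return gcd_binary_naive(a >> 1, b >> 1) << 1;
    // Exactly one even: that factor of 2 is not shared, so drop it.
    if (a_even) return gcd_binary_naive(a >> 1, b);
    if (b_even) return gcd_binary_naive(a, b >> 1);
    // Both odd: gcd(a, b) = gcd(|a - b|, min(a, b)).
    if (a > b) return gcd_binary_naive(a - b, b);
    return gcd_binary_naive(a, b - a);
}
```

Every call re-tests the parity of both arguments, even when the previous step already determined it. Those repeated, hard-to-predict branches are the overhead the optimizations aim to remove.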

By using the `__builtin_ctz` compiler builtin (available in GCC and Clang) to count trailing zeros and restructuring the loop to minimize conditional branches, the optimized implementation runs in 116 nanoseconds versus 198 nanoseconds for `std::gcd`, about 1.7x faster in that measurement. The key insight is that once the initial even-even case is handled, the algorithm can alternate between two steps, shifting out trailing zeros and subtracting the smaller odd value from the larger, which removes most conditional branches. The result shows how understanding both the mathematical properties and the hardware characteristics can yield significant performance gains in fundamental algorithms.
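A minimal sketch of the restructured loop is shown below, assuming a GCC or Clang toolchain where `__builtin_ctz` is available. The function name `gcd_binary_ctz` is hypothetical, and this version may differ in detail from the benchmarked one, so the timings above should not be attributed to it.

```cpp
#include <algorithm>
#include <utility>

// Binary GCD using count-trailing-zeros (GCC/Clang builtin).
// gcd_binary_ctz is an illustrative name for this sketch.
unsigned int gcd_binary_ctz(unsigned int a, unsigned int b) {
    // __builtin_ctz(0) is undefined, so handle zeros first.
    if (a == 0) return b;
    if (b == 0) return a;

    const int az = __builtin_ctz(a);
    const int bz = __builtin_ctz(b);
    // Common factors of two, restored at the end (the pre-shift).
    const int shift = std::min(az, bz);

    a >>= az;  // a is now odd and stays odd for the whole loop
    while (b != 0) {
        b >>= __builtin_ctz(b);       // step 1: make b odd
        if (a > b) std::swap(a, b);   // keep a <= b
        b -= a;                       // step 2: odd - odd is even (or 0)
    }
    return a << shift;
}
```

Because `a` is always odd and `b` is always even or zero after the subtraction, no parity test is needed inside the loop. The one remaining comparison is a simple swap that a compiler may be able to lower to conditional moves rather than a branch.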