HeadlinesBriefing.com

Why Ollama's Fork of llama.cpp Lost Trust

Hacker News
Ollama rose to become the go‑to way to run local LLMs after wrapping Georgi Gerganov’s llama.cpp engine in a single‑command installer. Founded in 2021 by former Docker GUI creators Jeffrey Morgan and Michael Chiang, the startup marketed itself as “Docker for LLMs” and secured Y Combinator backing and pre‑seed venture money. Early users praised its ease, but the project soon began obscuring its reliance on llama.cpp.

The README omitted any llama.cpp credit for over a year, and binary releases lacked the MIT license notice the upstream project requires. Community issues filed in early 2024 sat unanswered for roughly 400 days until a pull request finally added a single-line acknowledgment at the bottom of the README. Ollama later said it would “transition to more systematically built engines,” hinting at a planned departure from the original codebase.

In mid‑2025 Ollama replaced llama.cpp with a custom ggml‑based backend, citing enterprise stability. Benchmarks now show llama.cpp delivering roughly 1.8× the throughput of Ollama—161 versus 89 tokens per second on identical hardware—while Ollama’s GUI, released later that year, shipped as a closed‑source binary despite the project’s open‑source roots. The cumulative missteps have eroded trust, prompting many developers to abandon Ollama for the upstream engine.
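For readers checking the benchmark claim, the “roughly 1.8×” figure follows directly from the two reported numbers. A minimal sketch (the token-per-second values are the ones cited in this article, not an independent measurement):

```python
# Reported throughput on identical hardware, per the benchmarks cited above.
llama_cpp_tps = 161  # tokens/sec, upstream llama.cpp
ollama_tps = 89      # tokens/sec, Ollama's custom ggml-based backend

# Ratio of the two: 161 / 89 ≈ 1.81, i.e. "roughly 1.8x the throughput".
ratio = llama_cpp_tps / ollama_tps
print(f"llama.cpp delivers {ratio:.2f}x the throughput of Ollama")
```

Note the phrasing matters: 161/89 ≈ 1.81 means llama.cpp runs at about 1.8× Ollama's speed (an ~81% improvement), not 1.8× *faster* in the additive sense.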