HeadlinesBriefing.com

GPT-5.5 Hallucinates Three Times More Than Open-Source GLM-5.2

Hacker News

Major AI labs are questioning the endless scaling of model parameters and training data after Claude Fable 5 faced US government restrictions just three days after its release, the first AI ban on national security grounds. Despite their massive size advantages, proprietary models may be hitting diminishing returns on actual intelligence.

GLM-5.2 from Z.ai (753B parameters, 40B active) scores within 4 points of GPT-5.5 on the Artificial Analysis Intelligence Index, despite being roughly half its size. The MIT-licensed model recorded a 28% hallucination rate compared with GPT-5.5's 86%, a stark contrast in truthfulness and accuracy.
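As a quick check on the arithmetic behind these figures, the short Python sketch below recomputes the ratio between the two reported hallucination rates and the share of GLM-5.2's parameters that are active. The helper names `hallucination_ratio` and `active_fraction` are illustrative and not part of any benchmark tooling; the inputs are the numbers quoted in this brief.

```python
def hallucination_ratio(rate_a: float, rate_b: float) -> float:
    """Return how many times larger rate_a is than rate_b."""
    if rate_b <= 0:
        raise ValueError("rate_b must be positive")
    return rate_a / rate_b


def active_fraction(active_params_b: float, total_params_b: float) -> float:
    """Return the share of parameters that are active (both in billions)."""
    if total_params_b <= 0:
        raise ValueError("total_params_b must be positive")
    return active_params_b / total_params_b


# Figures as reported in the article (hypothetical helper names above).
gpt_vs_glm = hallucination_ratio(0.86, 0.28)  # GPT-5.5 vs GLM-5.2
glm_active = active_fraction(40, 753)         # GLM-5.2: 40B active of 753B

print(f"GPT-5.5 hallucinates {gpt_vs_glm:.2f}x as often as GLM-5.2")
print(f"GLM-5.2 active parameter share: {glm_active:.1%}")
```

The ratio comes out just above 3, which matches the "three times" framing in the headline, and only about one in nineteen of GLM-5.2's parameters is active at a time.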

DeepSeek V4 Pro illustrates the scaling problem: at 1.6T parameters it scored 94% on the hallucination benchmark, while GLM-5.2 identified technical impossibilities almost instantly. The larger model burned through 10x more reasoning tokens yet still produced confidently incorrect responses, suggesting that sheer size does not help a model recognize logical fallacies.

AI development now faces a trilemma among raw capability, uncertainty calibration, and computational efficiency. Blindly increasing parameter counts yields diminishing returns as actual intelligence plateaus. Model selection should prioritize truthfulness over theoretical performance, especially as these systems become more deeply integrated into critical applications.