HeadlinesBriefing.com

Logistic Regression Beats XGBoost in Football Match Prediction Experiment

Towards Data Science

When pitted against four other classifiers on predicting international football match outcomes, logistic regression delivered the best performance with a cross-validated log-loss of 1.001. The experiment tested five models on 358 matches from World Cups and Euros, using just three features: team strength gap, combined strength, and knockout status. XGBoost, typically the Kaggle champion, surprisingly finished last with a log-loss of 1.169.

The author chose log-loss over accuracy because it evaluates the entire probability vector rather than just the top prediction. This matters for forecasting models, whose value lies in well-calibrated probabilities. The baseline log-loss is ln(3) ≈ 1.099, the score from predicting uniform 1/3 probabilities across all three outcomes. Any model scoring above this threshold does worse than that uninformed uniform forecast, which is where XGBoost's 1.169 lands. XGBoost's 48% accuracy masked its fundamental problem: confident miscalibration on a small dataset.
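To make the ln(3) baseline concrete, here is a minimal sketch of multiclass log-loss in plain Python. The function name, the class ordering, and the match results are illustrative assumptions, not the author's code or data.

```python
import math

def log_loss(probs, outcomes):
    """Mean negative log of the probability assigned to the outcome that occurred."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        total += -math.log(p[y])
    return total / len(outcomes)

# Three outcome classes indexed 0, 1, 2 (the ordering is an assumption).
outcomes = [0, 2, 1, 0, 1, 2]                 # made-up results, for illustration only
uniform = [[1 / 3, 1 / 3, 1 / 3]] * len(outcomes)

baseline = log_loss(uniform, outcomes)
print(round(baseline, 3))                     # 1.099, i.e. ln(3)
```

Because the uniform forecast assigns 1/3 to every outcome, its score is ln(3) regardless of which results actually occur.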

The bias-variance tradeoff explains why the simpler model won. Football outcomes contain huge irreducible noise: a single deflected shot can decide a knockout tie. High-capacity models like XGBoost have thousands of effective parameters but only roughly 120 matches per class to constrain them. They overfit to noise, making confident but wrong predictions. Log-loss punishes these heavily, because the penalty grows without bound as the probability assigned to the actual outcome approaches zero.
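The gap between accuracy and log-loss can be illustrated with two hypothetical forecasters that always pick the same outcome but differ in how confident they are. All numbers below are made up for the sketch and are not taken from the experiment.

```python
import math

def log_loss(probs, outcomes):
    """Mean negative log-probability of the outcomes that occurred."""
    return sum(-math.log(p[y]) for p, y in zip(probs, outcomes)) / len(outcomes)

def accuracy(probs, outcomes):
    """Share of matches where the most probable class was the actual outcome."""
    hits = sum(1 for p, y in zip(probs, outcomes) if p.index(max(p)) == y)
    return hits / len(outcomes)

outcomes = [0, 0, 1, 2]                        # made-up results

# Both forecasters favour class 0 every time, so their accuracy is identical.
cautious = [[0.45, 0.30, 0.25]] * len(outcomes)
overconfident = [[0.90, 0.05, 0.05]] * len(outcomes)

for name, probs in [("cautious", cautious), ("overconfident", overconfident)]:
    print(name, accuracy(probs, outcomes), round(log_loss(probs, outcomes), 3))

# Cautious scores below ln(3); overconfident scores above it,
# because each wrong pick costs -ln(0.05) instead of -ln(0.25) or -ln(0.30).
```

Both forecasters are right half the time, yet only the overconfident one ends up worse than the uniform baseline. This is the pattern the article attributes to XGBoost.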

Meanwhile, logistic regression's inductive bias matches the data-generating process. Team strength predicts match outcomes through smooth, monotonic relationships — exactly what logistic regression assumes. With only three features and weak interactions, there's nothing for complex models to discover, so they add variance without signal. The lesson: model complexity should scale with data availability, not competition reputation.
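As a rough illustration of that inductive bias, the sketch below maps a single strength-gap feature to win/draw/loss probabilities through a softmax over linear scores, which is the functional form of multinomial logistic regression. The coefficients are hypothetical values chosen for the example, not fitted values from the experiment, and the real model used three features rather than one.

```python
import math

def softmax(scores):
    """Convert raw linear scores into probabilities that sum to 1."""
    m = max(scores)                            # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

CLASSES = ["win", "draw", "loss"]

# Hypothetical coefficients: one weight and one intercept per class.
WEIGHTS = {"win": 0.8, "draw": 0.0, "loss": -0.8}
INTERCEPTS = {"win": 0.0, "draw": -0.2, "loss": 0.0}

def predict(gap):
    """Probabilities for a given strength gap (positive means the first team is stronger)."""
    scores = [INTERCEPTS[c] + WEIGHTS[c] * gap for c in CLASSES]
    return dict(zip(CLASSES, softmax(scores)))

gaps = [-2.0, -1.0, 0.0, 1.0, 2.0]
win_probs = [predict(g)["win"] for g in gaps]

for g in gaps:
    print(g, {k: round(v, 3) for k, v in predict(g).items()})
```

With these weights the win probability rises smoothly and monotonically as the gap grows. The model cannot represent anything more erratic, which limits how much noise it can fit.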