HeadlinesBriefing.com

Choosing the Right Linear Regularizer: Insights from 134k Simulations

Towards Data Science

Researchers Ahsaas Bajaj and Benjamin S. Knight ran 134,400 simulations across 960 configurations to compare Ridge, Lasso, ElasticNet, and Post‑Lasso OLS on real‑world Instacart models. They measured test RMSE, F1 score for feature recovery, and coefficient L2 error while varying sample size, feature count, multicollinearity, signal‑to‑noise ratio, and sparsity. Their goal: to replace intuition with a data‑driven decision guide for practitioners.
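A single cell of such a benchmark can be sketched as follows. This is not the authors' code; it is a minimal illustration, assuming scikit-learn, of how one configuration (fixed n, p, and sparsity) would compute the three reported metrics for each regularizer:

```python
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet
from sklearn.metrics import f1_score

rng = np.random.default_rng(0)

# One simulated configuration: n samples, p features, k truly nonzero coefficients.
n, p, k = 400, 20, 5
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:k] = rng.uniform(1, 3, k)            # sparse ground-truth coefficients
y = X @ beta + rng.standard_normal(n)      # unit-variance noise

results = {}
for name, model in [("ridge", Ridge(alpha=1.0)),
                    ("lasso", Lasso(alpha=0.1)),
                    ("enet", ElasticNet(alpha=0.1, l1_ratio=0.5))]:
    model.fit(X, y)
    pred = model.predict(X)
    rmse = np.sqrt(np.mean((y - pred) ** 2))            # predictive accuracy
    support_true = beta != 0
    support_hat = np.abs(model.coef_) > 1e-8
    f1 = f1_score(support_true, support_hat)            # feature-recovery F1
    l2_err = np.linalg.norm(model.coef_ - beta)         # coefficient L2 error
    results[name] = (rmse, f1, l2_err)
```

The full study repeats this loop over 960 combinations of sample size, feature count, collinearity, signal‑to‑noise ratio, and sparsity; the penalty strengths above are illustrative placeholders, not the tuned values from the paper.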

Across the massive benchmark, predictive accuracy showed almost no gap: median RMSE differed by at most 0.3 % among Ridge, Lasso, and ElasticNet when the training set exceeded 78 observations per feature. Because Ridge admits a closed‑form solution for each α, its median runtime was roughly 6 seconds, versus 9 seconds for Lasso and 48 seconds for ElasticNet, making it the fastest choice for pure prediction.
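The closed form behind Ridge's speed advantage is a single linear solve, with no iterative coordinate descent. A quick sketch (assuming scikit-learn, no intercept for simplicity) that checks the textbook formula against the library:

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
y = X @ rng.standard_normal(10) + rng.standard_normal(200)

alpha = 1.0
# Closed-form ridge estimate: beta = (X'X + alpha*I)^-1 X'y
beta_closed = np.linalg.solve(X.T @ X + alpha * np.eye(10), X.T @ y)

beta_sklearn = Ridge(alpha=alpha, fit_intercept=False).fit(X, y).coef_
print(np.allclose(beta_closed, beta_sklearn, atol=1e-6))  # → True
```

Lasso and ElasticNet have no such closed form; their L1 term forces an iterative solver, which is where the extra runtime goes.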

When variable selection matters, the study flips the recommendation. In high‑collinearity regimes (condition number above 10⁴), Lasso's recall collapsed to 0.18, while ElasticNet retained a recall of 0.93 on true features, a five‑fold advantage. Conversely, with ample data (n/p ≥ 78) all methods converged, allowing practitioners to default to Ridge for speed. The authors warn against Post‑Lasso OLS for any objective, as it consistently underperformed.
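The collinearity failure mode is easy to reproduce on toy data. The sketch below (again scikit-learn, with hypothetical penalty settings rather than the paper's) builds near-duplicate features: Lasso tends to keep one representative from the correlated group and zero out the rest, while ElasticNet's L2 component spreads weight across the group, so its support recall stays higher:

```python
import numpy as np
from sklearn.linear_model import Lasso, ElasticNet

rng = np.random.default_rng(2)
n, p = 300, 10
# Highly collinear design: every column is a near-copy of one latent factor.
z = rng.standard_normal((n, 1))
X = z + 0.01 * rng.standard_normal((n, p))
beta = np.ones(p)                       # all p features are truly relevant
y = X @ beta + rng.standard_normal(n)

def support_recall(model):
    """Fraction of truly relevant features the fitted model keeps nonzero."""
    model.fit(X, y)
    return float((np.abs(model.coef_) > 1e-8).sum()) / p

r_lasso = support_recall(Lasso(alpha=0.1))
r_enet = support_recall(ElasticNet(alpha=0.1, l1_ratio=0.2))
# Expect r_enet > r_lasso: the L1-only penalty collapses the correlated group.
```

The exact recall numbers here depend on the seed and penalties; the 0.18 vs 0.93 figures come from the study's much larger grid, but the qualitative gap is the same.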