Model Performance Analysis | NCAAM

2025-2026 NCAAM Model Performance Analysis

Scope

All scored games in the selected league and season. AP Poll is excluded here.

Comparing prediction accuracy across 2975 games using multiple rating models.

Model Catalog

7-day holdout coverage: 16/17 models.

Rolling Holdout Curves

Each point is a strict weekly holdout: train on all games before that week, test on that week. This first version uses a 21-day warmup, then 7-day holdouts stepped forward weekly.
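The windowing described above can be sketched as follows. This is a minimal illustration, not the site's actual code: the function names, the `(date, ...)` game layout, and the choice to truncate the final window at the last game date are all assumptions.

```python
from datetime import date, timedelta

def weekly_holdout_windows(first_game, last_game, warmup_days=21, holdout_days=7):
    """Return inclusive (test_start, test_end) windows after a warmup period.

    Assumption: the last window is cut short at the final game date.
    """
    windows = []
    test_start = first_game + timedelta(days=warmup_days)
    while test_start <= last_game:
        test_end = min(test_start + timedelta(days=holdout_days - 1), last_game)
        windows.append((test_start, test_end))
        test_start += timedelta(days=holdout_days)
    return windows

def split_games(games, test_start, test_end):
    """Strict split: train on games dated before the window, test on games inside it.

    `games` is a hypothetical list of dicts that each carry a "date" key.
    """
    train = [g for g in games if g["date"] < test_start]
    test = [g for g in games if test_start <= g["date"] <= test_end]
    return train, test

windows = weekly_holdout_windows(date(2025, 11, 3), date(2025, 12, 7))
```

Because training data is always strictly earlier than the test window, no game in a holdout week can influence the ratings used to predict it.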

Metrics: Log Loss, Brier, AUC, Accuracy

Weekly strict holdout log loss. Lower is better. Showing 16 models across 22 windows.
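For reference, the four metrics reported on this page can be computed from home-win outcomes (1 or 0) and predicted home-win probabilities roughly as below. This is a plain-Python sketch using the standard definitions; the function names and the 0.5 decision threshold for accuracy are assumptions, and the site may compute them differently.

```python
import math

def log_loss(y, p, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    p = [min(max(pi, eps), 1 - eps) for pi in p]
    return -sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
                for yi, pi in zip(y, p)) / len(y)

def brier(y, p):
    """Mean squared error between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def accuracy(y, p, threshold=0.5):
    """Share of games where the thresholded pick matches the outcome."""
    return sum((pi >= threshold) == (yi == 1) for yi, pi in zip(y, p)) / len(y)

def auc(y, p):
    """Probability that a random win is ranked above a random loss (ties count half)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

outcomes = [1, 0, 1, 0]
probs = [0.8, 0.3, 0.6, 0.6]
```

Log loss and Brier reward calibrated probabilities (lower is better), while AUC and accuracy only depend on ranking and picks (higher is better).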

Recent Window Winners

Holdout	Best	Log Loss	Runner-up	Models
Apr 1 - Apr 6	Log Adjusted	0.537	Adjusted Efficiency (0.538)	16
Mar 25 - Mar 31	Log Adjusted	0.600	Adjusted Efficiency (0.601)	16
Mar 18 - Mar 24	Adjusted Efficiency	0.447	Log Adjusted (0.448)	16
Mar 11 - Mar 17	Margin Recency	0.599	Margin (0.604)	16
Mar 4 - Mar 10	Core Ensemble	0.587	Recency Ensemble (0.587)	16
Feb 25 - Mar 3	Points Off/Def Recency	0.578	Margin Recency (0.582)	16
Feb 18 - Feb 24	Margin	0.590	Points Off/Def (0.590)	16
Feb 11 - Feb 17	Recency Ensemble	0.614	Core Ensemble (0.615)	16

Model Performance Leaderboard

Models ranked by strict holdout AUC when available (fallback: full-season AUC).

#	Model	7d Split	AUC	Acc	Brier	LogLoss	n	AUC 7d	Acc 7d	Brier 7d	n 7d
1	Adjusted Context Blend: Experimental context-heavy win model blending strong team components with rest and venue context.	STRICT 13g	-	-	-	-	0	0.925	76.9%	0.180	13
2	Adjusted Efficiency: Opponent-adjusted efficiency model with separate offensive and defensive components.	STRICT 13g	0.815	73.1%	0.179	0.529	5361	0.825	76.9%	0.179	13
3	Log Adjusted: Log-scale adjusted efficiency model that downweights blowout leverage.	STRICT 13g	0.814	73.2%	0.179	0.530	5361	0.825	76.9%	0.178	13
4	Points Off/Def: Raw points regression with separate offensive and defensive team parameters.	STRICT 13g	0.812	73.3%	0.182	0.541	5361	0.825	69.2%	0.196	13
5	Margin: Linear team-strength model fit on point differential instead of binary wins.	STRICT 13g	0.822	73.4%	0.178	0.531	5361	0.800	69.2%	0.195	13
6	Efficiency: Tempo-adjusted efficiency version of Pythagorean ratings.	FULL no 7d	0.796	71.6%	0.191	0.571	5361	-	-	-	0
7	Margin Recency: Margin regression with exponential recency weights on newer games.	STRICT 13g	-	-	-	-	0	0.750	53.8%	0.225	13
8	Core Ensemble: Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and points off/def.	STRICT 13g	-	-	-	-	0	0.750	69.2%	0.215	13
9	Points Off/Def Recency: Off/def points regression with exponential recency weights.	STRICT 13g	-	-	-	-	0	0.675	61.5%	0.227	13
10	Recency Ensemble: Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and recency points off/def.	STRICT 13g	-	-	-	-	0	0.650	61.5%	0.223	13
11	Home Team Baseline: Always favor the home team with a fixed prior.	STRICT 13g	0.638	64.0%	0.232	0.657	5361	0.650	61.5%	0.237	13
12	Bradley-Terry: Static logistic paired-comparison model with one team strength parameter.	STRICT 13g	0.823	73.8%	0.175	0.523	5361	0.625	61.5%	0.228	13
13	Dynamic Bradley-Terry: Time-evolving paired-comparison model with latent team strength drift.	STRICT 13g	-	-	-	-	0	0.625	61.5%	0.248	13
14	Pythagorean: Pythagorean win expectation from raw points scored and allowed.	STRICT 13g	0.752	69.8%	0.208	0.603	5361	0.600	61.5%	0.235	13
15	Avg Margin Baseline: Predict from simple average scoring margin in the training window.	STRICT 13g	0.784	70.7%	0.195	0.573	5361	0.550	69.2%	0.239	13
16	Elo: Streaming paired-comparison rating with recency baked into sequential updates.	STRICT 13g	0.783	70.5%	0.191	0.564	5361	0.500	53.8%	0.263	13
17	Bradley-Terry Recency: Static Bradley-Terry with exponential recency weights on newer games.	STRICT 13g	-	-	-	-	0	0.400	38.5%	0.266	13
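The two ensemble rows in the leaderboard are described as equal-logit blends. One plausible reading, sketched below, is an unweighted average of the component models' home-win probabilities taken in logit space and mapped back through the sigmoid. The function names and the clipping constant are assumptions for illustration.

```python
import math

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))

def equal_logit_blend(probs, eps=1e-6):
    """Average component probabilities in logit space with equal weights.

    Clipping by `eps` (an assumed value) keeps extreme inputs finite.
    """
    clipped = [min(max(p, eps), 1.0 - eps) for p in probs]
    mean_logit = sum(logit(p) for p in clipped) / len(clipped)
    return sigmoid(mean_logit)

blended = equal_logit_blend([0.9, 0.5])
```

Averaging logits rather than raw probabilities lets a confident component pull the blend further than a simple mean would: 0.9 and 0.5 blend to 0.75 here, versus 0.70 for the arithmetic mean.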

Methodology

Elo / Bradley-Terry

Elo: Iterative updates, K=64, HCA=100 (home-court advantage, in rating points)
BT: Static logistic regression on all games
Both model win probability, not margin
Elo updates after each game; BT fits once
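A minimal sketch of one Elo step with the K and HCA values listed above. The 400-point base-10 logistic scale is the conventional Elo choice and is an assumption here, as are the function names; the page does not state the scale or starting rating it uses.

```python
def elo_expected_home(r_home, r_away, hca=100.0, scale=400.0):
    """Home win probability; HCA is added to the home rating before comparing.

    `scale=400` is the conventional Elo constant (assumed, not from the page).
    """
    return 1.0 / (1.0 + 10 ** (-(r_home + hca - r_away) / scale))

def elo_update(r_home, r_away, home_won, k=64.0, hca=100.0):
    """Return updated (home, away) ratings after one game."""
    expected = elo_expected_home(r_home, r_away, hca)
    delta = k * ((1.0 if home_won else 0.0) - expected)
    # Zero-sum update: whatever the home team gains, the away team loses.
    return r_home + delta, r_away - delta

p_even = elo_expected_home(1500.0, 1500.0)
new_home, new_away = elo_update(1500.0, 1500.0, home_won=True)
```

Under these assumptions, two equally rated teams give the home side roughly a 64% win probability, and a home win moves each rating by about 23 points.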

Pythagorean Models

Raw: Classic points scored/allowed formula
Efficiency: Pace-adjusted (pts per possession)
Adjusted: Opponent-adjusted efficiency
Log: Log-linear multiplicative scale
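The raw variant can be sketched as the classic Pythagorean expectation. The exponent value and the use of the log5 formula to turn two team expectations into a matchup probability are assumptions for illustration; the page specifies neither. The efficiency variant would pass points per possession instead of raw points.

```python
def pythagorean(points_for, points_against, exponent=11.5):
    """Expected win share from scoring totals.

    `exponent=11.5` is a hypothetical basketball-style value, not from the page.
    """
    a = points_for ** exponent
    b = points_against ** exponent
    return a / (a + b)

def log5(p_a, p_b):
    """Probability that team A beats team B given each team's win expectation.

    Using log5 here is an assumption about how matchups are scored.
    """
    return p_a * (1.0 - p_b) / (p_a * (1.0 - p_b) + p_b * (1.0 - p_a))

even_team = pythagorean(80.0, 80.0)
strong_team = pythagorean(80.0, 70.0)
matchup = log5(0.75, 0.5)
```

A team that scores as much as it allows gets an expectation of 0.5, and a 0.75 team facing a 0.5 team is given a 0.75 chance to win.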

Margin Regression

Team-level ridge regression on point margin
Linear Bradley-Terry (margin target)
Alpha=0.05 (CV-tuned)
Learns home advantage from data (~6 pts)
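The margin model can be sketched as a ridge regression in which each game contributes +1 for the home team, -1 for the away team, and an unpenalized intercept that plays the role of home advantage. This is an illustration under assumptions: the `(home, away, margin)` tuple layout, the function names, and the penalty form (sum of squared errors plus alpha times squared ratings) are not confirmed by the page.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_margin_ridge(games, alpha=0.05):
    """Fit margin = home_adv + rating[home] - rating[away] with a ridge penalty.

    `games` is a list of (home, away, home_margin) tuples (assumed layout).
    """
    teams = sorted({t for h, a, _ in games for t in (h, a)})
    idx = {t: i + 1 for i, t in enumerate(teams)}  # column 0 is home advantage
    n = len(teams) + 1
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for home, away, margin in games:
        row = {0: 1.0, idx[home]: 1.0, idx[away]: -1.0}
        for i, xi in row.items():
            b[i] += xi * margin
            for j, xj in row.items():
                A[i][j] += xi * xj
    for i in range(1, n):  # penalize team ratings, not the intercept
        A[i][i] += alpha
    w = solve(A, b)
    return w[0], {t: w[idx[t]] for t in teams}

games = [("A", "B", 11), ("B", "A", 1), ("A", "C", 16),
         ("C", "A", -4), ("B", "C", 11), ("C", "B", 1)]
home_adv, ratings = fit_margin_ridge(games)
```

In this toy schedule the margins were generated with a 6-point home edge and ratings of +5, 0, and -5, so the fit recovers a home advantage of 6 and ratings shrunk slightly toward zero by the penalty.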

Baselines

Home Team: Always favor the home team with a fixed 60% prior
Avg Margin: Higher average margin wins
Models should beat these to add value
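Both baselines are simple enough to sketch directly. The function names, the `(home, away, home_margin)` tuple layout, and the tie-break toward the home team are assumptions; the page does not say how the Avg Margin pick is converted into a probability, so only the pick is shown.

```python
from collections import defaultdict

def home_baseline(p_home=0.60):
    """Constant home-win probability for every game."""
    return p_home

def average_margins(train_games):
    """Average scoring margin per team over the training window."""
    total = defaultdict(float)
    count = defaultdict(int)
    for home, away, margin in train_games:
        total[home] += margin
        count[home] += 1
        total[away] -= margin  # the away team's margin is the negative
        count[away] += 1
    return {team: total[team] / count[team] for team in total}

def avg_margin_pick(avg, home, away):
    """Pick the team with the higher average margin (ties go to the home team)."""
    return home if avg.get(home, 0.0) >= avg.get(away, 0.0) else away

train = [("A", "B", 10), ("B", "C", 4), ("C", "A", -6)]
avg = average_margins(train)
pick = avg_margin_pick(avg, "C", "A")
```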