2025-2026 NCAAMD2 Model Performance Analysis
All scored games in the selected league and season. AP Poll is excluded here.
Comparing prediction accuracy across 1762 games using multiple rating models.
7-day holdout coverage: 16/17 models.
Rolling Holdout Curves
Each point is a strict weekly holdout: train on all games before that week, test on that week. This first version uses a 21-day warmup, then 7-day holdouts stepped forward weekly.
Weekly strict holdout log loss. Lower is better. Showing 16 models across 14 windows.
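The windowing described above can be sketched as follows. This is a minimal illustration, not the site's implementation: the function names and the season start and end dates are hypothetical, and only the 21-day warmup and the 7-day weekly step come from the page.

```python
from datetime import date, timedelta


def weekly_holdout_windows(first_game, last_game, warmup_days=21, holdout_days=7):
    """Return (test_start, test_end) pairs, both inclusive.

    Training data for a window is every game dated strictly before test_start.
    The final window is clipped at last_game, so it may be shorter than 7 days.
    """
    windows = []
    test_start = first_game + timedelta(days=warmup_days)
    while test_start <= last_game:
        test_end = min(test_start + timedelta(days=holdout_days - 1), last_game)
        windows.append((test_start, test_end))
        test_start += timedelta(days=holdout_days)
    return windows


def split_games(games, test_start, test_end):
    """Split (game_date, payload) tuples into strict train and test lists."""
    train = [g for g in games if g[0] < test_start]
    test = [g for g in games if test_start <= g[0] <= test_end]
    return train, test


# Hypothetical season bounds, chosen only for illustration.
windows = weekly_holdout_windows(date(2025, 10, 15), date(2026, 2, 7))
for start, end in windows[-2:]:
    print(start.isoformat(), "to", end.isoformat())
```

Because training uses only games dated before `test_start`, no game in a holdout week can leak into the fit that predicts it.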
Recent Window Winners
| Holdout | Best | Log Loss | Runner-up | Models |
|---|---|---|---|---|
| Feb 4 - Feb 7 | Points Off/Def | 0.528 | Points Off/Def Recency (0.529) | 16 |
| Jan 28 - Feb 3 | Core Ensemble | 0.570 | Recency Ensemble (0.570) | 16 |
| Jan 21 - Jan 27 | Margin Recency | 0.533 | Margin (0.534) | 16 |
| Jan 14 - Jan 20 | Margin | 0.551 | Points Off/Def (0.557) | 16 |
| Jan 7 - Jan 13 | Margin | 0.569 | Points Off/Def (0.574) | 16 |
| Dec 31 - Jan 6 | Margin | 0.549 | Core Ensemble (0.552) | 16 |
| Dec 24 - Dec 30 | Margin | 0.237 | Points Off/Def (0.240) | 16 |
| Dec 17 - Dec 23 | Margin | 0.569 | Points Off/Def (0.572) | 16 |
Model Performance Leaderboard
Models ranked by strict holdout AUC when available (fallback: full-season AUC).
| # | Model | 7d Split | AUC | Acc | Brier | LogLoss | n | AUC 7d | Acc 7d | Brier 7d | n 7d |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Points Off/Def Recency: Off/def points regression with exponential recency weights. | STRICT, 285g | - | - | - | - | 0 | 0.820 | 72.6% | 0.178 | 285 |
| 2 | Points Off/Def: Raw points regression with separate offensive and defensive team parameters. | STRICT, 285g | 0.796 | 72.9% | 0.191 | 0.562 | 262 | 0.816 | 71.6% | 0.178 | 285 |
| 3 | Margin: Linear team-strength model fit on point differential instead of binary wins. | STRICT, 285g | 0.802 | 71.8% | 0.184 | 0.544 | 262 | 0.813 | 73.0% | 0.179 | 285 |
| 4 | Margin Recency: Margin regression with exponential recency weights on newer games. | STRICT, 285g | - | - | - | - | 0 | 0.810 | 72.6% | 0.180 | 285 |
| 5 | Adjusted Context Blend: Experimental context-heavy win model blending strong team components with rest and venue context. | STRICT, 285g | - | - | - | - | 0 | 0.810 | 72.3% | 0.180 | 285 |
| 6 | Core Ensemble: Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and points off/def. | STRICT, 285g | - | - | - | - | 0 | 0.798 | 72.6% | 0.183 | 285 |
| 7 | Recency Ensemble: Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and recency points off/def. | STRICT, 285g | - | - | - | - | 0 | 0.798 | 73.0% | 0.183 | 285 |
| 8 | Avg Margin Baseline: Predict from simple average scoring margin in the training window. | STRICT, 285g | 0.874 | 77.9% | 0.150 | 0.458 | 262 | 0.780 | 69.5% | 0.191 | 285 |
| 9 | Adjusted Efficiency: Opponent-adjusted efficiency model with separate offensive and defensive components. | STRICT, 285g | 0.799 | 71.4% | 0.184 | 0.541 | 262 | 0.779 | 73.0% | 0.196 | 285 |
| 10 | Log Adjusted: Log-scale adjusted efficiency model that downweights blowout leverage. | STRICT, 285g | 0.794 | 72.5% | 0.187 | 0.548 | 262 | 0.779 | 73.3% | 0.196 | 285 |
| 11 | Bradley-Terry: Static logistic paired-comparison model with one team strength parameter. | STRICT, 285g | 0.796 | 70.2% | 0.185 | 0.547 | 262 | 0.777 | 69.8% | 0.192 | 285 |
| 12 | Dynamic Bradley-Terry: Time-evolving paired-comparison model with latent team strength drift. | STRICT, 285g | - | - | - | - | 0 | 0.776 | 71.9% | 0.192 | 285 |
| 13 | Pythagorean: Pythagorean win expectation from raw points scored and allowed. | STRICT, 285g | 0.787 | 70.6% | 0.195 | 0.573 | 262 | 0.772 | 72.3% | 0.199 | 285 |
| 14 | Bradley-Terry Recency: Static Bradley-Terry with exponential recency weights on newer games. | STRICT, 285g | - | - | - | - | 0 | 0.760 | 68.4% | 0.198 | 285 |
| 15 | Elo: Streaming paired-comparison rating with recency baked into sequential updates. | STRICT, 285g | 0.844 | 79.0% | 0.179 | 0.540 | 262 | 0.760 | 70.2% | 0.197 | 285 |
| 16 | Efficiency: Tempo-adjusted efficiency version of Pythagorean ratings. | FULL, no 7d | 0.638 | 57.6% | 0.331 | 1.499 | 224 | - | - | - | 0 |
| 17 | Home Team Baseline: Always favor the home team with a fixed prior. | STRICT, 285g | 0.592 | 59.2% | 0.242 | 0.676 | 262 | 0.635 | 63.2% | 0.234 | 285 |
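For readers who want to reproduce the leaderboard columns, the four metrics can be sketched in plain Python. This is an illustrative version under stated assumptions: labels are 1 when the home team wins, predictions are home-win probabilities, accuracy thresholds at 0.5, and ties count as half a win in AUC. The page does not confirm any of these conventions, and the sample data is made up.

```python
import math


def accuracy(y_true, p_home):
    """Share of games where thresholding at 0.5 picks the actual winner."""
    hits = sum((p >= 0.5) == bool(y) for y, p in zip(y_true, p_home))
    return hits / len(y_true)


def brier(y_true, p_home):
    """Mean squared error between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_home)) / len(y_true)


def log_loss(y_true, p_home, eps=1e-15):
    """Mean negative log-likelihood, with probabilities clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, p_home):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def auc(y_true, p_home):
    """Probability that a random home win is ranked above a random home loss."""
    pos = [p for y, p in zip(y_true, p_home) if y == 1]
    neg = [p for y, p in zip(y_true, p_home) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


# Made-up outcomes and predictions, for illustration only.
y = [1, 0, 1, 1, 0]
p = [0.8, 0.3, 0.6, 0.4, 0.5]
print(accuracy(y, p), brier(y, p), log_loss(y, p), auc(y, p))
```

As a reference point for the LogLoss column, a model that always predicts 0.5 scores ln 2, about 0.693.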
Methodology
Elo / Bradley-Terry
- Elo: Iterative updates, K=64, home-court advantage (HCA)=100 rating points
- BT: Static logistic regression on all games
- Both model win probability, not margin
- Elo updates after each game; BT fits once
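A single Elo update with the parameters listed above might look like the sketch below. K=64 and HCA=100 come from the page; the 400-point logistic scale and the 1500 starting rating are conventional Elo defaults assumed here, and the function names are hypothetical.

```python
def elo_expected(r_home, r_away, hca=100.0, scale=400.0):
    """Home win probability; the 400-point scale is an assumed convention."""
    return 1.0 / (1.0 + 10 ** (-(r_home + hca - r_away) / scale))


def elo_update(ratings, home, away, home_won, k=64.0, hca=100.0, start=1500.0):
    """Update both teams in place and return the pre-game home win probability.

    Unseen teams start at `start` (an assumed default, not from the page).
    """
    r_home = ratings.get(home, start)
    r_away = ratings.get(away, start)
    p_home = elo_expected(r_home, r_away, hca)
    delta = k * ((1.0 if home_won else 0.0) - p_home)
    ratings[home] = r_home + delta
    ratings[away] = r_away - delta
    return p_home


ratings = {}
p = elo_update(ratings, "Team A", "Team B", home_won=True)
print(round(p, 3), ratings)
```

Returning the pre-game probability keeps the evaluation honest: each prediction is made before the game's result changes the ratings.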
Pythagorean Models
- Raw: Classic points scored/allowed formula
- Efficiency: Pace-adjusted (pts per possession)
- Adjusted: Opponent-adjusted efficiency
- Log: Log-linear multiplicative scale
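The raw Pythagorean formula can be sketched as below. The exponent is not stated on the page, so the default of 10.0 is a placeholder assumption, and combining two teams' expectations with the log5 formula is one common choice rather than something the page confirms. The efficiency variant would pass points per possession instead of raw points.

```python
def pythagorean_expectation(points_for, points_against, exponent=10.0):
    """Expected win fraction from scoring totals (or per-possession rates).

    The exponent is an assumed placeholder; the page does not state its value.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


def log5(p_a, p_b):
    """Probability that team A beats team B given each team's win expectation."""
    num = p_a * (1.0 - p_b)
    return num / (num + p_b * (1.0 - p_a))


# Made-up scoring averages, for illustration only.
team_a = pythagorean_expectation(78.0, 70.0)
team_b = pythagorean_expectation(72.0, 74.0)
print(log5(team_a, team_b))
```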
Margin Regression
- Team-level ridge regression on point margin
- Linear Bradley-Terry (margin target)
- Alpha=0.05 (CV-tuned)
- Learns home advantage from data (~6 pts)
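The margin regression above can be sketched with a small closed-form ridge fit. This is a minimal illustration under assumptions: home margin is modeled as home strength minus away strength plus a shared home-advantage term, the penalty applies to team strengths only, and alpha is used on the raw normal equations, which may not match the scaling behind the page's CV-tuned value. The function names and game data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_margin_ridge(games, alpha=0.05):
    """games: list of (home, away, home_margin). Returns (strengths, home_adv)."""
    teams = sorted({t for home, away, _ in games for t in (home, away)})
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams) + 1  # last coefficient is the home-advantage term
    A = [[0.0] * n for _ in range(n)]  # accumulates X^T X
    b = [0.0] * n                      # accumulates X^T y
    for home, away, margin in games:
        row = {idx[home]: 1.0, idx[away]: -1.0, n - 1: 1.0}
        for i, xi in row.items():
            b[i] += xi * margin
            for j, xj in row.items():
                A[i][j] += xi * xj
    for i in range(n - 1):
        A[i][i] += alpha  # ridge penalty on team strengths, not on home advantage
    beta = solve(A, b)
    return dict(zip(teams, beta[:-1])), beta[-1]


# Made-up results: (home, away, home margin).
games = [("A", "B", 10), ("B", "A", -2), ("A", "C", 14),
         ("C", "A", -6), ("B", "C", 8), ("C", "B", 0)]
strengths, home_adv = fit_margin_ridge(games)
print(strengths, home_adv)
```

The penalty also makes the system solvable: without it, adding a constant to every team strength would leave all predicted margins unchanged.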
Baselines
- Home Team: Always predict home wins (60%)
- Avg Margin: Higher average margin wins
- Models should beat these to add value
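The two baselines are simple enough to sketch directly. Treating the listed 60% as the fixed home-win probability is an assumption, as is breaking average-margin ties in favor of the home team; the function names and game data are hypothetical.

```python
from collections import defaultdict


def home_baseline_prob(p_home=0.60):
    """Constant home-win probability; reading the page's 60% this way is an assumption."""
    return p_home


def average_margins(games):
    """Average scoring margin per team from (home, away, home_margin) tuples."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for home, away, home_margin in games:
        totals[home] += home_margin
        counts[home] += 1
        totals[away] -= home_margin
        counts[away] += 1
    return {team: totals[team] / counts[team] for team in totals}


def avg_margin_pick(avg, home, away):
    """Pick the team with the higher training-window average margin.

    Unseen teams default to 0.0 and ties go to the home team (both assumptions).
    """
    return home if avg.get(home, 0.0) >= avg.get(away, 0.0) else away


# Made-up training-window results: (home, away, home margin).
training = [("A", "B", 10), ("B", "A", -2), ("A", "C", 14),
            ("C", "A", -6), ("B", "C", 8), ("C", "B", 0)]
avg = average_margins(training)
print(avg, avg_margin_pick(avg, "C", "B"))
```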