
2025-2026 NCAAMD2 Model Performance Analysis

Scope

This analysis covers all scored games in the 2025-2026 NCAAMD2 season. The AP Poll is excluded here.

It compares prediction accuracy across 1,762 games for the 17 models listed in the leaderboard.

Model Catalog

7-day holdout coverage: 16/17 models.

Rolling Holdout Curves

Each point is a strict weekly holdout: train on all games before that week, test on that week. This first version uses a 21-day warmup, then 7-day holdouts stepped forward weekly.
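The windowing described above can be sketched as follows. This is a minimal illustration, not the site's actual code: the function names, the `date` key on each game, and the season dates in the usage note are all hypothetical, and the truncation of the final window is an assumption suggested by the short "Feb 4 - Feb 7" row in the winners table.

```python
from datetime import date, timedelta

def rolling_holdout_windows(season_start, season_end, warmup_days=21, holdout_days=7):
    """Yield (test_start, test_end) for each strict holdout window.

    The first window opens after the warmup; each later window starts
    holdout_days after the previous one. The last window is cut off at
    season_end (assumption).
    """
    test_start = season_start + timedelta(days=warmup_days)
    while test_start <= season_end:
        test_end = min(test_start + timedelta(days=holdout_days - 1), season_end)
        yield test_start, test_end
        test_start += timedelta(days=holdout_days)

def split_games(games, test_start, test_end):
    """Strict split: train on games before the window, test on games inside it."""
    train = [g for g in games if g["date"] < test_start]
    test = [g for g in games if test_start <= g["date"] <= test_end]
    return train, test
```

For a hypothetical season running from `date(2025, 11, 1)` to `date(2025, 12, 10)`, the generator yields three windows, the first starting on Nov 22 and the last truncated at Dec 10.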

Chart: weekly strict holdout log loss by model (lower is better), showing 16 models across 14 windows. Alternate metric views: Brier, AUC, Accuracy.
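For reference, the four metrics reported in this analysis can be computed as in the sketch below. These are the standard textbook definitions; whether the site clips probabilities or breaks ties in exactly this way is an assumption. Here `y` holds outcomes (1 if the predicted side won, else 0) and `p` holds predicted win probabilities.

```python
import math

def log_loss(y, p, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total / len(y)

def brier(y, p):
    """Mean squared error between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)

def accuracy(y, p):
    """Share of games where the side given >= 50% actually won."""
    return sum((pi >= 0.5) == (yi == 1) for yi, pi in zip(y, p)) / len(y)

def auc(y, p):
    """Probability that a random win is ranked above a random loss (ties count half)."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))
```

Log loss and Brier reward calibrated probabilities, while AUC and accuracy depend only on ranking and on which side of 50% a prediction falls, which is why a model can lead on one metric and trail on another.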

Recent Window Winners

| Holdout         | Best           | Log Loss | Runner-up (log loss)           | Models |
|-----------------|----------------|----------|--------------------------------|--------|
| Feb 4 - Feb 7   | Points Off/Def | 0.528    | Points Off/Def Recency (0.529) | 16     |
| Jan 28 - Feb 3  | Core Ensemble  | 0.570    | Recency Ensemble (0.570)       | 16     |
| Jan 21 - Jan 27 | Margin Recency | 0.533    | Margin (0.534)                 | 16     |
| Jan 14 - Jan 20 | Margin         | 0.551    | Points Off/Def (0.557)         | 16     |
| Jan 7 - Jan 13  | Margin         | 0.569    | Points Off/Def (0.574)         | 16     |
| Dec 31 - Jan 6  | Margin         | 0.549    | Core Ensemble (0.552)          | 16     |
| Dec 24 - Dec 30 | Margin         | 0.237    | Points Off/Def (0.240)         | 16     |
| Dec 17 - Dec 23 | Margin         | 0.569    | Points Off/Def (0.572)         | 16     |

Model Performance Leaderboard

Models are ranked by strict holdout AUC when available (fallback: full-season AUC).

| #  | Model                  | 7d Split      | AUC   | Acc   | Brier | LogLoss | n   | AUC 7d | Acc 7d | Brier 7d | n 7d | Description |
|----|------------------------|---------------|-------|-------|-------|---------|-----|--------|--------|----------|------|-------------|
| 1  | Points Off/Def Recency | STRICT (285g) | -     | -     | -     | -       | 0   | 0.820  | 72.6%  | 0.178    | 285  | Off/def points regression with exponential recency weights. |
| 2  | Points Off/Def         | STRICT (285g) | 0.796 | 72.9% | 0.191 | 0.562   | 262 | 0.816  | 71.6%  | 0.178    | 285  | Raw points regression with separate offensive and defensive team parameters. |
| 3  | Margin                 | STRICT (285g) | 0.802 | 71.8% | 0.184 | 0.544   | 262 | 0.813  | 73.0%  | 0.179    | 285  | Linear team-strength model fit on point differential instead of binary wins. |
| 4  | Margin Recency         | STRICT (285g) | -     | -     | -     | -       | 0   | 0.810  | 72.6%  | 0.180    | 285  | Margin regression with exponential recency weights on newer games. |
| 5  | Adjusted Context Blend | STRICT (285g) | -     | -     | -     | -       | 0   | 0.810  | 72.3%  | 0.180    | 285  | Experimental context-heavy win model blending strong team components with rest and venue context. |
| 6  | Core Ensemble          | STRICT (285g) | -     | -     | -     | -       | 0   | 0.798  | 72.6%  | 0.183    | 285  | Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and points off/def. |
| 7  | Recency Ensemble       | STRICT (285g) | -     | -     | -     | -       | 0   | 0.798  | 73.0%  | 0.183    | 285  | Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and recency points off/def. |
| 8  | Avg Margin Baseline    | STRICT (285g) | 0.874 | 77.9% | 0.150 | 0.458   | 262 | 0.780  | 69.5%  | 0.191    | 285  | Predict from simple average scoring margin in the training window. |
| 9  | Adjusted Efficiency    | STRICT (285g) | 0.799 | 71.4% | 0.184 | 0.541   | 262 | 0.779  | 73.0%  | 0.196    | 285  | Opponent-adjusted efficiency model with separate offensive and defensive components. |
| 10 | Log Adjusted           | STRICT (285g) | 0.794 | 72.5% | 0.187 | 0.548   | 262 | 0.779  | 73.3%  | 0.196    | 285  | Log-scale adjusted efficiency model that downweights blowout leverage. |
| 11 | Bradley-Terry          | STRICT (285g) | 0.796 | 70.2% | 0.185 | 0.547   | 262 | 0.777  | 69.8%  | 0.192    | 285  | Static logistic paired-comparison model with one team strength parameter. |
| 12 | Dynamic Bradley-Terry  | STRICT (285g) | -     | -     | -     | -       | 0   | 0.776  | 71.9%  | 0.192    | 285  | Time-evolving paired-comparison model with latent team strength drift. |
| 13 | Pythagorean            | STRICT (285g) | 0.787 | 70.6% | 0.195 | 0.573   | 262 | 0.772  | 72.3%  | 0.199    | 285  | Pythagorean win expectation from raw points scored and allowed. |
| 14 | Bradley-Terry Recency  | STRICT (285g) | -     | -     | -     | -       | 0   | 0.760  | 68.4%  | 0.198    | 285  | Static Bradley-Terry with exponential recency weights on newer games. |
| 15 | Elo                    | STRICT (285g) | 0.844 | 79.0% | 0.179 | 0.540   | 262 | 0.760  | 70.2%  | 0.197    | 285  | Streaming paired-comparison rating with recency baked into sequential updates. |
| 16 | Efficiency             | FULL (no 7d)  | 0.638 | 57.6% | 0.331 | 1.499   | 224 | -      | -      | -        | 0    | Tempo-adjusted efficiency version of Pythagorean ratings. |
| 17 | Home Team Baseline     | STRICT (285g) | 0.592 | 59.2% | 0.242 | 0.676   | 262 | 0.635  | 63.2%  | 0.234    | 285  | Always favor the home team with a fixed prior. |
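The Core Ensemble and Recency Ensemble are described as equal-logit blends. One plausible reading, sketched here as an assumption rather than the site's confirmed method, is an unweighted average of the component models' win probabilities in log-odds space, mapped back to a probability. The function names are hypothetical.

```python
import math

def logit(p, eps=1e-9):
    """Log-odds of p, clipped so that 0 and 1 do not produce infinities."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))

def equal_logit_blend(probs):
    """Average the component probabilities in logit space with equal weights."""
    return sigmoid(sum(logit(p) for p in probs) / len(probs))
```

Averaging in logit space lets a single confident component pull the blend further than a plain mean of probabilities would, while two opposite predictions such as 0.8 and 0.2 still cancel to 0.5.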

Methodology

Elo / Bradley-Terry

  • Elo: Iterative updates, K=64, home-court advantage (HCA)=100 rating points
  • BT: Static logistic regression on all games
  • Both model win probability, not margin
  • Elo updates after each game; BT fits once
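The per-game Elo update can be sketched as below, using the stated K=64 and HCA=100. The 400-point logistic scale and the 1500 starting rating are conventional Elo defaults assumed here; the source does not state them, and the function names are hypothetical.

```python
K = 64        # update step size (from the methodology)
HCA = 100     # home-court advantage in rating points (from the methodology)
SCALE = 400   # assumption: conventional Elo logistic scale
START = 1500  # assumption: conventional starting rating

def elo_expected_home(r_home, r_away):
    """Home win probability after adding the home-court bonus to the home rating."""
    return 1.0 / (1.0 + 10 ** (-((r_home + HCA) - r_away) / SCALE))

def elo_update(ratings, home, away, home_won):
    """Update both ratings in place after one game; return the pre-game home win probability."""
    r_h = ratings.get(home, START)
    r_a = ratings.get(away, START)
    expected = elo_expected_home(r_h, r_a)
    delta = K * ((1.0 if home_won else 0.0) - expected)
    ratings[home] = r_h + delta
    ratings[away] = r_a - delta
    return expected
```

Under these assumptions, two equally rated teams give the home side about a 64% win probability, and a home win moves about 23 rating points from the away team to the home team.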

Pythagorean Models

  • Raw: Classic points scored/allowed formula
  • Efficiency: Pace-adjusted (pts per possession)
  • Adjusted: Opponent-adjusted efficiency
  • Log: Log-linear multiplicative scale
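As a rough illustration of the raw Pythagorean variant, the sketch below computes a team's win expectation from points scored and allowed. The exponent is a hypothetical placeholder, since the source does not give one. Turning two teams' expectations into a head-to-head probability with the log5 formula is likewise an assumption, not something the methodology states.

```python
def pythagorean_win_pct(points_for, points_against, exponent=10.0):
    """Classic Pythagorean expectation; exponent=10.0 is an illustrative placeholder."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)

def log5(p_a, p_b):
    """Assumed matchup rule: probability that team A beats team B (log5 formula)."""
    return p_a * (1 - p_b) / (p_a * (1 - p_b) + p_b * (1 - p_a))
```

The efficiency variant would pass points per possession instead of raw points to the same formula, which removes the effect of pace.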

Margin Regression

  • Team-level ridge regression on point margin
  • Linear Bradley-Terry (margin target)
  • Alpha=0.05 (CV-tuned)
  • Learns home advantage from data (~6 pts)
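A minimal sketch of the margin regression, assuming the model is home margin = rating(home) - rating(away) + home advantage, with the ridge penalty (alpha=0.05) applied to team ratings but not to the home-advantage term. That penalty layout, the tuple format for games, and the function names are assumptions. The source does not say how predicted margins are converted to win probabilities, so that step is left out.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def fit_margin_ridge(games, alpha=0.05):
    """Fit team ratings and home advantage from (home, away, home_margin) tuples."""
    teams = sorted({t for home, away, _ in games for t in (home, away)})
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams) + 1  # the last coefficient is the home advantage
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for home, away, margin in games:
        row = {idx[home]: 1.0, idx[away]: -1.0, n - 1: 1.0}
        for i, xi in row.items():
            b[i] += xi * margin
            for j, xj in row.items():
                A[i][j] += xi * xj
    for i in range(n - 1):  # penalize team ratings only (assumption)
        A[i][i] += alpha
    coef = solve(A, b)
    return dict(zip(teams, coef[:-1])), coef[-1]

def predict_margin(ratings, hca, home, away):
    """Expected home margin in points."""
    return ratings[home] - ratings[away] + hca
```

With two hypothetical games, `("A", "B", 10)` and `("B", "A", -2)`, the fit returns a home advantage of 4 points and ratings of about +2.96 for A and -2.96 for B; the penalty shrinks the ratings slightly toward zero from the unregularized ±3.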

Baselines

  • Home Team: Always predict a home win with a fixed 60% probability
  • Avg Margin: Pick the team with the higher average scoring margin
  • Models should beat these to add value
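As a sanity check on the Home Team baseline, the sketch below scores a constant forecast. It assumes the 60% figure is the fixed home-win probability the baseline predicts and takes the 59.2% home win rate from that baseline's full-season accuracy in the leaderboard. The function name is hypothetical.

```python
import math

def constant_forecast_scores(p_home, home_win_rate):
    """Log loss and Brier score of always predicting p_home for the home team."""
    log_loss = -(home_win_rate * math.log(p_home)
                 + (1 - home_win_rate) * math.log(1 - p_home))
    brier = home_win_rate * (1 - p_home) ** 2 + (1 - home_win_rate) * p_home ** 2
    return log_loss, brier

ll, bs = constant_forecast_scores(0.60, 0.592)
print(round(ll, 3), round(bs, 3))  # 0.676 0.242
```

Both values match the Home Team Baseline's full-season LogLoss and Brier columns, which supports reading the 60% as a fixed probability and gives a concrete floor that any rating model should beat.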