🐻⬇️🏀

2025-2026 NCAAM Model Performance Analysis

Scope

All scored games in the selected league and season. AP Poll is excluded here.

Comparing prediction accuracy across 2975 games using multiple rating models.

Model Catalog

7-day holdout coverage: 16/17 models.

Rolling Holdout Curves

Each point is a strict weekly holdout: train on all games before that week, test on that week. This first version uses a 21-day warmup, then 7-day holdouts stepped forward weekly.
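The windowing described above can be sketched as follows. This is a minimal illustration, not the site's code: the function names, the game record layout (a dict with a "date" key), and the clamping of the final window to the season end are all assumptions.

```python
from datetime import date, timedelta

def rolling_holdout_windows(season_start, season_end, warmup_days=21, holdout_days=7):
    """Yield (test_start, test_end) pairs; test_end is exclusive."""
    last_day_exclusive = season_end + timedelta(days=1)
    test_start = season_start + timedelta(days=warmup_days)
    while test_start <= season_end:
        # Clamp the final window so it never runs past the season end.
        test_end = min(test_start + timedelta(days=holdout_days), last_day_exclusive)
        yield test_start, test_end
        test_start += timedelta(days=holdout_days)

def split_games(games, test_start, test_end):
    """Strict holdout: train only on games played before the test window."""
    train = [g for g in games if g["date"] < test_start]
    test = [g for g in games if test_start <= g["date"] < test_end]
    return train, test
```

Because the training set is rebuilt for every window, a model never sees a game from the week it is scored on.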

Metrics: Log Loss, Brier, AUC, Accuracy

Weekly strict holdout log loss. Lower is better. Showing 16 models across 22 windows.
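For reference, the four metrics can be computed from binary outcomes and predicted win probabilities roughly as below. This is a plain-Python sketch using the standard definitions; the function names and the 0.5 decision threshold for accuracy are assumptions, not details taken from the dashboard.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def brier(y_true, p_pred):
    """Mean squared error between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

def accuracy(y_true, p_pred, threshold=0.5):
    """Share of games where the side given >= threshold probability actually won."""
    hits = sum((p >= threshold) == (y == 1) for y, p in zip(y_true, p_pred))
    return hits / len(y_true)

def auc(y_true, p_pred):
    """Probability that a random win is ranked above a random loss (ties count half)."""
    pos = [p for y, p in zip(y_true, p_pred) if y == 1]
    neg = [p for y, p in zip(y_true, p_pred) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))
```

As a yardstick, a model that always predicts 50% scores a log loss of about 0.693 and a Brier score of 0.25.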

Recent Window Winners

Holdout         | Best                   | Log Loss | Runner-up                   | Models
Apr 1 - Apr 6   | Log Adjusted           | 0.537    | Adjusted Efficiency (0.538) | 16
Mar 25 - Mar 31 | Log Adjusted           | 0.600    | Adjusted Efficiency (0.601) | 16
Mar 18 - Mar 24 | Adjusted Efficiency    | 0.447    | Log Adjusted (0.448)        | 16
Mar 11 - Mar 17 | Margin Recency         | 0.599    | Margin (0.604)              | 16
Mar 4 - Mar 10  | Core Ensemble          | 0.587    | Recency Ensemble (0.587)    | 16
Feb 25 - Mar 3  | Points Off/Def Recency | 0.578    | Margin Recency (0.582)      | 16
Feb 18 - Feb 24 | Margin                 | 0.590    | Points Off/Def (0.590)      | 16
Feb 11 - Feb 17 | Recency Ensemble       | 0.614    | Core Ensemble (0.615)       | 16

Model Performance Leaderboard

Models ranked by strict holdout AUC when available (fallback: full-season AUC).

#  | Model                  | 7d Split   | AUC   | Acc   | Brier | LogLoss | n    | AUC 7d | Acc 7d | Brier 7d | n 7d
1  | Adjusted Context Blend | STRICT 13g | -     | -     | -     | -       | 0    | 0.925  | 76.9%  | 0.180    | 13
2  | Adjusted Efficiency    | STRICT 13g | 0.815 | 73.1% | 0.179 | 0.529   | 5361 | 0.825  | 76.9%  | 0.179    | 13
3  | Log Adjusted           | STRICT 13g | 0.814 | 73.2% | 0.179 | 0.530   | 5361 | 0.825  | 76.9%  | 0.178    | 13
4  | Points Off/Def         | STRICT 13g | 0.812 | 73.3% | 0.182 | 0.541   | 5361 | 0.825  | 69.2%  | 0.196    | 13
5  | Margin                 | STRICT 13g | 0.822 | 73.4% | 0.178 | 0.531   | 5361 | 0.800  | 69.2%  | 0.195    | 13
6  | Efficiency             | FULL no 7d | 0.796 | 71.6% | 0.191 | 0.571   | 5361 | -      | -      | -        | 0
7  | Margin Recency         | STRICT 13g | -     | -     | -     | -       | 0    | 0.750  | 53.8%  | 0.225    | 13
8  | Core Ensemble          | STRICT 13g | -     | -     | -     | -       | 0    | 0.750  | 69.2%  | 0.215    | 13
9  | Points Off/Def Recency | STRICT 13g | -     | -     | -     | -       | 0    | 0.675  | 61.5%  | 0.227    | 13
10 | Recency Ensemble       | STRICT 13g | -     | -     | -     | -       | 0    | 0.650  | 61.5%  | 0.223    | 13
11 | Home Team Baseline     | STRICT 13g | 0.638 | 64.0% | 0.232 | 0.657   | 5361 | 0.650  | 61.5%  | 0.237    | 13
12 | Bradley-Terry          | STRICT 13g | 0.823 | 73.8% | 0.175 | 0.523   | 5361 | 0.625  | 61.5%  | 0.228    | 13
13 | Dynamic Bradley-Terry  | STRICT 13g | -     | -     | -     | -       | 0    | 0.625  | 61.5%  | 0.248    | 13
14 | Pythagorean            | STRICT 13g | 0.752 | 69.8% | 0.208 | 0.603   | 5361 | 0.600  | 61.5%  | 0.235    | 13
15 | Avg Margin Baseline    | STRICT 13g | 0.784 | 70.7% | 0.195 | 0.573   | 5361 | 0.550  | 69.2%  | 0.239    | 13
16 | Elo                    | STRICT 13g | 0.783 | 70.5% | 0.191 | 0.564   | 5361 | 0.500  | 53.8%  | 0.263    | 13
17 | Bradley-Terry Recency  | STRICT 13g | -     | -     | -     | -       | 0    | 0.400  | 38.5%  | 0.266    | 13

Model descriptions

  • Adjusted Context Blend: Experimental context-heavy win model blending strong team components with rest and venue context.
  • Adjusted Efficiency: Opponent-adjusted efficiency model with separate offensive and defensive components.
  • Log Adjusted: Log-scale adjusted efficiency model that downweights blowout leverage.
  • Points Off/Def: Raw points regression with separate offensive and defensive team parameters.
  • Margin: Linear team-strength model fit on point differential instead of binary wins.
  • Efficiency: Tempo-adjusted efficiency version of Pythagorean ratings.
  • Margin Recency: Margin regression with exponential recency weights on newer games.
  • Core Ensemble: Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and points off/def.
  • Points Off/Def Recency: Off/def points regression with exponential recency weights.
  • Recency Ensemble: Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and recency points off/def.
  • Home Team Baseline: Always favor the home team with a fixed prior.
  • Bradley-Terry: Static logistic paired-comparison model with one team strength parameter.
  • Dynamic Bradley-Terry: Time-evolving paired-comparison model with latent team strength drift.
  • Pythagorean: Pythagorean win expectation from raw points scored and allowed.
  • Avg Margin Baseline: Predict from simple average scoring margin in the training window.
  • Elo: Streaming paired-comparison rating with recency baked into sequential updates.
  • Bradley-Terry Recency: Static Bradley-Terry with exponential recency weights on newer games.
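The Core Ensemble and Recency Ensemble entries describe an "equal-logit blend". One plausible reading, sketched here as an assumption rather than the site's confirmed method, is an unweighted average of the component models' win probabilities taken on the logit scale and mapped back to a probability. The function names and the clipping constant are illustrative.

```python
import math

def logit(p, eps=1e-9):
    """Log-odds of p, with p clipped away from 0 and 1 to keep the log finite."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def sigmoid(z):
    """Inverse of logit: map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def equal_logit_blend(probs):
    """Average the component predictions in logit space with equal weights."""
    return sigmoid(sum(logit(p) for p in probs) / len(probs))
```

Averaging in logit space lets a confident component pull the blend further than a plain mean of probabilities would: blending 0.6 and 0.9 this way gives about 0.786 rather than 0.75.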

Methodology

Elo / Bradley-Terry

  • Elo: Iterative updates, K=64, HCA=100 (home-court advantage, in rating points)
  • BT: Static logistic regression on all games
  • Both model win probability, not margin
  • Elo updates after each game; BT fits once
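A single Elo update with the parameters listed above (K=64, HCA=100) might look like the sketch below. The 400-point base-10 logistic scale is the conventional Elo choice and is assumed here, as are the function names; the source does not state either.

```python
def elo_expected(home_rating, away_rating, hca=100.0, scale=400.0):
    """Expected home win probability; the home side gets `hca` bonus rating points."""
    diff = home_rating + hca - away_rating
    return 1.0 / (1.0 + 10 ** (-diff / scale))

def elo_update(home_rating, away_rating, home_won, k=64.0, hca=100.0):
    """Return the new (home, away) ratings after one game."""
    expected = elo_expected(home_rating, away_rating, hca)
    delta = k * ((1.0 if home_won else 0.0) - expected)
    # Zero-sum update: whatever the home team gains, the away team loses.
    return home_rating + delta, away_rating - delta
```

Under these assumptions, two equally rated teams give the home side an expected win probability of about 64%.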

Pythagorean Models

  • Raw: Classic points scored/allowed formula
  • Efficiency: Pace-adjusted (pts per possession)
  • Adjusted: Opponent-adjusted efficiency
  • Log: Log-linear multiplicative scale
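The raw Pythagorean variant can be illustrated with the classic points-for / points-against formula. The exponent below is a placeholder assumption (the page does not give the value it uses), and combining two teams' expectations with the log5 rule is likewise an assumption about how a matchup probability could be formed.

```python
def pythagorean_win_pct(points_for, points_against, exponent=10.0):
    """Classic Pythagorean expectation: PF^x / (PF^x + PA^x)."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)

def log5(team_pct, opponent_pct):
    """Log5 rule: chance that a team with win pct a beats a team with win pct b."""
    a, b = team_pct, opponent_pct
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))
```

The same function covers the efficiency variant if the inputs are points scored and allowed per possession instead of per game.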

Margin Regression

  • Team-level ridge regression on point margin
  • Linear Bradley-Terry (margin target)
  • Alpha=0.05 (CV-tuned)
  • Learns home advantage from data (~6 pts)
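A minimal sketch of the margin regression described above: each game contributes a row with +1 for the home team, -1 for the away team, and a constant home-advantage column, and the target is the home margin. The data layout, the function names, the choice to leave the home-advantage term unpenalized, and the hand-rolled solver are all assumptions made for a dependency-free illustration.

```python
def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_margin_ridge(games, teams, alpha=0.05):
    """games: (home, away, home_margin) tuples. Returns (ratings, home_advantage)."""
    idx = {t: i for i, t in enumerate(teams)}
    k = len(teams) + 1  # last column is the home-advantage term
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for home, away, margin in games:
        row = [0.0] * k
        row[idx[home]], row[idx[away]], row[-1] = 1.0, -1.0, 1.0
        for i in range(k):
            xty[i] += row[i] * margin
            for j in range(k):
                xtx[i][j] += row[i] * row[j]
    for i in range(len(teams)):  # ridge penalty on team ratings only
        xtx[i][i] += alpha
    w = solve_linear(xtx, xty)
    return dict(zip(teams, w[:-1])), w[-1]
```

The predicted home margin for a matchup is then ratings[home] - ratings[away] + home_advantage.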

Baselines

  • Home Team: Always predict home wins (60%)
  • Avg Margin: Higher average margin wins
  • Models should beat these to add value
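The two baselines are simple enough to sketch directly. This assumes the "(60%)" above is the fixed home win probability, that games are (home, away, home_margin) tuples, and that ties and unseen teams resolve toward the home side; none of these details are confirmed by the page.

```python
from collections import defaultdict

def home_team_probability(prior=0.60):
    """Home Team baseline: the same home win probability for every game."""
    return prior

def average_margins(train_games):
    """Average scoring margin per team over the training window."""
    totals, counts = defaultdict(float), defaultdict(int)
    for home, away, home_margin in train_games:
        totals[home] += home_margin
        counts[home] += 1
        totals[away] -= home_margin
        counts[away] += 1
    return {team: totals[team] / counts[team] for team in totals}

def avg_margin_pick(avg, home, away):
    """Pick the team with the higher average margin; ties go to the home team."""
    return home if avg.get(home, 0.0) >= avg.get(away, 0.0) else away
```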