2025-2026 NBA Model Performance Analysis

Scope

All scored games in the selected league and season. AP Poll is excluded here.

This report compares prediction accuracy across 1,257 games using multiple rating models.

Model Catalog

7-day holdout coverage: 16/17 models.

Rolling Holdout Curves

Each point is a strict weekly holdout: train on all games before that week, test on that week. This first version uses a 21-day warmup, then 7-day holdouts stepped forward weekly.
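
The windowing described above can be sketched as follows. This is a minimal illustration, not the site's actual code: the function names, the dict-based game records, and the example season dates are assumptions; only the 21-day warmup and 7-day step come from the text.

```python
from datetime import date, timedelta

def weekly_holdout_windows(season_start, season_end, warmup_days=21, holdout_days=7):
    """Yield (test_start, test_end) pairs for strict weekly holdouts.

    The first window opens after the warmup period; each later window
    starts holdout_days after the previous one. Dates are inclusive.
    """
    test_start = season_start + timedelta(days=warmup_days)
    while test_start <= season_end:
        # Clamp the final window so it never runs past the last game date.
        test_end = min(test_start + timedelta(days=holdout_days - 1), season_end)
        yield test_start, test_end
        test_start += timedelta(days=holdout_days)

def split_games(games, test_start, test_end):
    """Strict split: train on games before the window, test on games inside it."""
    train = [g for g in games if g["date"] < test_start]
    test = [g for g in games if test_start <= g["date"] <= test_end]
    return train, test

# Hypothetical season dates, used only to show the shape of the output.
for start, end in weekly_holdout_windows(date(2025, 10, 21), date(2025, 11, 30)):
    print(start, "to", end)
```

Because the training set is filtered with a strict `<` on the window start, no game from the test week can leak into the fit.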

[Chart: weekly strict holdout log loss, 16 models across 26 windows. Lower is better. Alternate metric views: Brier, AUC, Accuracy.]
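
The four metrics tracked per holdout window can be computed along these lines. This is a plain-Python sketch for illustration; the function names are assumptions, and `y_true` is taken to be 1 when the home team wins and 0 otherwise, with `p` the predicted home win probability.

```python
import math

def log_loss(y_true, p, eps=1e-15):
    """Mean negative log-likelihood of the outcomes. Lower is better."""
    total = 0.0
    for yi, pi in zip(y_true, p):
        pi = min(max(pi, eps), 1 - eps)  # clip to avoid log(0)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total / len(y_true)

def brier(y_true, p):
    """Mean squared error between probability and outcome. Lower is better."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y_true, p)) / len(y_true)

def accuracy(y_true, p, threshold=0.5):
    """Share of games where the favored side (p >= threshold) matched the result."""
    hits = sum(1 for yi, pi in zip(y_true, p) if (pi >= threshold) == (yi == 1))
    return hits / len(y_true)

def auc(y_true, p):
    """Probability that a random home win is ranked above a random home loss."""
    pos = [pi for yi, pi in zip(y_true, p) if yi == 1]
    neg = [pi for yi, pi in zip(y_true, p) if yi == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one win and one loss")
    score = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return score / (len(pos) * len(neg))
```

A useful reference point: a model that always outputs 0.5 scores a log loss of ln 2 (about 0.693), a Brier score of 0.25, and an AUC of 0.5.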

Recent Window Winners

| Holdout         | Best                   | Log Loss | Runner-up (Log Loss)           | Models |
|-----------------|------------------------|----------|--------------------------------|--------|
| Apr 15 - Apr 21 | Home Team Baseline     | 0.646    | Bradley-Terry Recency (0.709)  | 16     |
| Apr 8 - Apr 14  | Elo                    | 0.522    | Dynamic Bradley-Terry (0.529)  | 16     |
| Apr 1 - Apr 7   | Elo                    | 0.479    | Adjusted Context Blend (0.493) | 16     |
| Mar 25 - Mar 31 | Dynamic Bradley-Terry  | 0.510    | Adjusted Context Blend (0.510) | 16     |
| Mar 18 - Mar 24 | Elo                    | 0.475    | Recency Ensemble (0.502)       | 16     |
| Mar 11 - Mar 17 | Adjusted Context Blend | 0.560    | Dynamic Bradley-Terry (0.562)  | 16     |
| Mar 4 - Mar 10  | Adjusted Efficiency    | 0.590    | Log Adjusted (0.590)           | 16     |
| Feb 25 - Mar 3  | Log Adjusted           | 0.495    | Adjusted Efficiency (0.495)    | 16     |

Model Performance Leaderboard

Models ranked by strict holdout AUC when available (fallback: full-season AUC).

| #  | Model                  | 7d Split     | AUC   | Acc   | Brier | LogLoss | n    | AUC 7d | Acc 7d | Brier 7d | n 7d |
|----|------------------------|--------------|-------|-------|-------|---------|------|--------|--------|----------|------|
| 1  | Efficiency             | FULL, no 7d  | 0.752 | 68.1% | 0.203 | 0.591   | 1106 | -      | -      | -        | 0    |
| 2  | Home Team Baseline     | STRICT, 18g  | 0.550 | 55.0% | 0.250 | 0.693   | 1106 | 0.662  | 66.7%  | 0.227    | 18   |
| 3  | Elo                    | STRICT, 18g  | 0.743 | 68.5% | 0.206 | 0.600   | 1106 | 0.588  | 55.6%  | 0.273    | 18   |
| 4  | Adjusted Context Blend | STRICT, 18g  | -     | -     | -     | -       | 0    | 0.562  | 50.0%  | 0.263    | 18   |
| 5  | Dynamic Bradley-Terry  | STRICT, 18g  | -     | -     | -     | -       | 0    | 0.537  | 61.1%  | 0.257    | 18   |
| 6  | Bradley-Terry Recency  | STRICT, 18g  | -     | -     | -     | -       | 0    | 0.500  | 61.1%  | 0.258    | 18   |
| 7  | Core Ensemble          | STRICT, 18g  | -     | -     | -     | -       | 0    | 0.500  | 44.4%  | 0.273    | 18   |
| 8  | Recency Ensemble       | STRICT, 18g  | -     | -     | -     | -       | 0    | 0.500  | 50.0%  | 0.275    | 18   |
| 9  | Margin                 | STRICT, 18g  | 0.743 | 67.7% | 0.211 | 0.611   | 1106 | 0.463  | 44.4%  | 0.268    | 18   |
| 10 | Margin Recency         | STRICT, 18g  | -     | -     | -     | -       | 0    | 0.463  | 38.9%  | 0.285    | 18   |
| 11 | Points Off/Def         | STRICT, 18g  | 0.751 | 67.5% | 0.208 | 0.605   | 1106 | 0.463  | 44.4%  | 0.268    | 18   |
| 12 | Points Off/Def Recency | STRICT, 18g  | -     | -     | -     | -       | 0    | 0.463  | 38.9%  | 0.283    | 18   |
| 13 | Adjusted Efficiency    | STRICT, 18g  | 0.749 | 67.9% | 0.204 | 0.594   | 1106 | 0.438  | 44.4%  | 0.295    | 18   |
| 14 | Log Adjusted           | STRICT, 18g  | 0.749 | 67.9% | 0.204 | 0.594   | 1106 | 0.438  | 44.4%  | 0.295    | 18   |
| 15 | Pythagorean            | STRICT, 18g  | 0.751 | 68.4% | 0.214 | 0.620   | 1106 | 0.425  | 44.4%  | 0.264    | 18   |
| 16 | Avg Margin Baseline    | STRICT, 18g  | 0.763 | 69.6% | 0.201 | 0.589   | 1106 | 0.425  | 44.4%  | 0.280    | 18   |
| 17 | Bradley-Terry          | STRICT, 18g  | 0.748 | 68.2% | 0.205 | 0.598   | 1106 | 0.412  | 38.9%  | 0.262    | 18   |

Model descriptions:

  • Efficiency: Tempo-adjusted efficiency version of Pythagorean ratings.
  • Home Team Baseline: Always favor the home team with a fixed prior.
  • Elo: Streaming paired-comparison rating with recency baked into sequential updates.
  • Adjusted Context Blend: Experimental context-heavy win model blending strong team components with rest and venue context.
  • Dynamic Bradley-Terry: Time-evolving paired-comparison model with latent team strength drift.
  • Bradley-Terry Recency: Static Bradley-Terry with exponential recency weights on newer games.
  • Core Ensemble: Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and points off/def.
  • Recency Ensemble: Equal-logit blend of Elo, recency BT, recency margin, log-adjusted pyth, and recency points off/def.
  • Margin: Linear team-strength model fit on point differential instead of binary wins.
  • Margin Recency: Margin regression with exponential recency weights on newer games.
  • Points Off/Def: Raw points regression with separate offensive and defensive team parameters.
  • Points Off/Def Recency: Off/def points regression with exponential recency weights.
  • Adjusted Efficiency: Opponent-adjusted efficiency model with separate offensive and defensive components.
  • Log Adjusted: Log-scale adjusted efficiency model that downweights blowout leverage.
  • Pythagorean: Pythagorean win expectation from raw points scored and allowed.
  • Avg Margin Baseline: Predict from simple average scoring margin in the training window.
  • Bradley-Terry: Static logistic paired-comparison model with one team strength parameter.
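
The two ensembles in the leaderboard are described as equal-logit blends. One plausible reading, sketched here under that assumption, is to average the component probabilities in log-odds space and map the mean back to a probability. The function names and the clipping constant are illustrative, not taken from the source.

```python
import math

def logit(p, eps=1e-6):
    """Log-odds of p, clipped away from 0 and 1 to stay finite."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))

def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))

def equal_logit_blend(probs):
    """Blend component win probabilities with equal weight in logit space."""
    return sigmoid(sum(logit(p) for p in probs) / len(probs))

# Hypothetical home win probabilities from three component models.
print(equal_logit_blend([0.60, 0.70, 0.80]))
```

Averaging in logit space treats a move from 0.90 to 0.95 as larger than a move from 0.50 to 0.55, so a single confident component pulls the blend harder than it would in a plain probability average.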

Methodology

Elo / Bradley-Terry

  • Elo: Iterative updates, K=64, HCA=100
  • BT: Static logistic regression on all games
  • Both model win probability, not margin
  • Elo updates after each game; BT fits once
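
A single Elo step with the stated parameters might look like the sketch below. K=64 and HCA=100 come from the list above; the 400-point logistic scale, the 1500 starting rating, and the function names are conventional assumptions that the source does not confirm.

```python
def elo_expected_home(r_home, r_away, hca=100.0, scale=400.0):
    """Home win probability; the home side gets hca rating points added."""
    return 1.0 / (1.0 + 10 ** (-(r_home + hca - r_away) / scale))

def elo_update(ratings, home, away, home_won, k=64.0, hca=100.0, initial=1500.0):
    """Update both teams in place after one game; return the pre-game probability."""
    r_home = ratings.get(home, initial)  # unseen teams start at the assumed initial rating
    r_away = ratings.get(away, initial)
    p_home = elo_expected_home(r_home, r_away, hca)
    delta = k * ((1.0 if home_won else 0.0) - p_home)
    ratings[home] = r_home + delta  # zero-sum: what one side gains, the other loses
    ratings[away] = r_away - delta
    return p_home

ratings = {}
p = elo_update(ratings, "BOS", "NYK", home_won=True)  # team codes are placeholders
print(round(p, 3), ratings)
```

With equal ratings, a 100-point home edge on a 400-point scale gives the home team a win probability of about 0.64.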

Pythagorean Models

  • Raw: Classic points scored/allowed formula
  • Efficiency: Pace-adjusted (pts per possession)
  • Adjusted: Opponent-adjusted efficiency
  • Log: Log-linear multiplicative scale
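
For reference, the raw Pythagorean expectation can be sketched as follows. The exponent of 14 is a placeholder, since the source does not state one, and the log5 matchup step is an assumed way of turning two team ratings into a game probability, not a confirmed part of these models.

```python
def pythagorean_win_pct(points_for, points_against, exponent=14.0):
    """Expected win share from points scored and allowed.

    exponent=14.0 is a placeholder assumption, not a value from the source.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)

def log5(p_a, p_b):
    """Probability that team A beats team B given each team's overall win share."""
    return (p_a - p_a * p_b) / (p_a + p_b - 2 * p_a * p_b)

# Hypothetical per-game scoring averages.
team_a = pythagorean_win_pct(116.0, 110.0)
team_b = pythagorean_win_pct(112.0, 113.0)
print(round(log5(team_a, team_b), 3))
```

The Efficiency variant would substitute points per possession for raw points in the same formula.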

Margin Regression

  • Team-level ridge regression on point margin
  • Linear Bradley-Terry (margin target)
  • Alpha=0.05 (CV-tuned)
  • Learns home advantage from data (~6 pts)
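
The margin regression above can be illustrated with a small closed-form ridge fit. This is a sketch under assumptions: each game is a row with +1 for the home team, -1 for the away team, and a constant home-advantage column; the penalty (alpha=0.05, from the list above) is applied to team strengths only, which the source does not specify. The tiny solver is included just to keep the example dependency-free.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_margin_ridge(games, alpha=0.05):
    """games: list of (home, away, home_margin). Returns (strengths, home_adv)."""
    teams = sorted({t for h, a, _ in games for t in (h, a)})
    idx = {t: i for i, t in enumerate(teams)}
    k = len(teams) + 1  # last column is the home-advantage intercept
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for home, away, margin in games:
        row = [0.0] * k
        row[idx[home]], row[idx[away]], row[-1] = 1.0, -1.0, 1.0
        for i in range(k):
            xty[i] += row[i] * margin
            for j in range(k):
                xtx[i][j] += row[i] * row[j]
    for i in range(len(teams)):  # penalize team strengths, not the intercept
        xtx[i][i] += alpha
    w = solve_linear(xtx, xty)
    return dict(zip(teams, w[:-1])), w[-1]

# Hypothetical results: (home, away, home margin).
games = [("A", "B", 10), ("B", "A", -4), ("A", "C", 12),
         ("C", "A", -6), ("B", "C", 5), ("C", "B", 1)]
strengths, home_adv = fit_margin_ridge(games)
print(strengths, home_adv)
```

A predicted margin for a new game is then `strengths[home] - strengths[away] + home_adv`.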

Baselines

  • Home Team: Always predict home wins (60%)
  • Avg Margin: Higher average margin wins
  • Models should beat these to add value
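
The two baselines are simple enough to sketch in a few lines. The reading of "(60%)" as a constant home win probability, the home-side tie-break, and the function names are assumptions made for illustration.

```python
def home_team_baseline(p_home=0.60):
    """Constant home win probability; 0.60 is an assumed reading of '(60%)'."""
    return p_home

def avg_margin_pick(train_games, home, away):
    """Pick the team with the higher average scoring margin in the training window.

    train_games: list of (home, away, home_margin). Teams with no games
    default to an average of 0; ties go to the home side (an assumption).
    """
    totals, counts = {}, {}
    for h, a, margin in train_games:
        for team, m in ((h, margin), (a, -margin)):
            totals[team] = totals.get(team, 0.0) + m
            counts[team] = counts.get(team, 0) + 1

    def avg(team):
        return totals[team] / counts[team] if counts.get(team) else 0.0

    return home if avg(home) >= avg(away) else away

# Hypothetical training window.
train = [("A", "B", 10), ("B", "C", 5)]
print(avg_margin_pick(train, "B", "A"))
```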