Methodology: The Mathematics of Pythagorean Expectation
Abstract
Pythagorean Expectation treats a team's win probability as a function of its aggregate scoring performance. The formula can be rigorously derived assuming points scored ($PS$) and points allowed ($PA$) follow independent Weibull distributions.
We present the mathematical framework for our two baseline models:

1. Pythagorean Raw: the classical formulation mapping observed point totals to expected win percentage.
2. Pythagorean Adjusted: a regularized linear model (Ridge Regression) that extracts strength-adjusted efficiencies from game-level data.
1. The Classical (Raw) Model
Derivation
Let $X$ and $Y$ be random variables representing points scored by Team A and Team B (Opponent) in a game. If $X$ and $Y$ are independent and follow Weibull distributions with a common shape parameter $k$:
$$ f(x; \lambda, k) = \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k-1} e^{-(x/\lambda)^k} $$
It can be shown (Miller, 2006) that under these assumptions the probability that $X > Y$ (a win) takes exactly the Pythagorean form:
$$ P(X > Y) = \frac{\lambda_x^k}{\lambda_x^k + \lambda_y^k} $$
In our application, we estimate the scale parameters $\lambda_x$ and $\lambda_y$ from the observed season totals $PS$ and $PA$.
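The closed form above can be checked numerically. The sketch below (with hypothetical, illustrative scale parameters) compares a Monte Carlo estimate of $P(X > Y)$ against the formula, assuming the shared shape parameter $k$:

```python
import numpy as np

rng = np.random.default_rng(0)
k = 11.5                    # shared shape parameter
lam_x, lam_y = 72.0, 68.0   # hypothetical scale parameters for the two teams

# numpy's weibull draws use unit scale; multiply by lambda to rescale
x = lam_x * rng.weibull(k, size=1_000_000)
y = lam_y * rng.weibull(k, size=1_000_000)

empirical = np.mean(x > y)
closed_form = lam_x**k / (lam_x**k + lam_y**k)
print(f"empirical={empirical:.4f}  closed-form={closed_form:.4f}")
```

With a million simulated games the two estimates agree to roughly three decimal places, which is consistent with the result being exact rather than asymptotic.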
The Formula
For NCAA Men's Basketball, we use an empirically derived exponent $\gamma = 11.5$:
$$ \text{Win}\% = \frac{PS^{\gamma}}{PS^{\gamma} + PA^{\gamma}} = \frac{1}{1 + \left(\frac{PA}{PS}\right)^{\gamma}} $$
This model serves as our unadjusted baseline. It strictly measures dominance over the observed schedule, without accounting for opponent strength.
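As a minimal sketch, the formula translates directly into code (the season totals below are hypothetical):

```python
def pythag_raw(ps: float, pa: float, gamma: float = 11.5) -> float:
    """Classical Pythagorean win expectation from season point totals."""
    return ps**gamma / (ps**gamma + pa**gamma)

# hypothetical season totals: 2400 points scored, 2200 allowed
print(f"{pythag_raw(2400, 2200):.3f}")
```

Note that a team with $PS = PA$ lands at exactly 0.500 regardless of $\gamma$; the exponent only controls how sharply the curve rewards scoring margin.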
2. The Adjusted Efficiency Model
To account for Strength of Schedule (SOS), we cannot rely on raw averages. A team might score 1.2 points per possession (PPP) because it is elite, or because its opponents play poor defense.
We model game-level efficiency as a system of linear equations and solve it using Ridge Regression.
Linear Formulation
Let $y_{g}$ be the offensive efficiency (points per 100 possessions) for the home team in game $g$. We posit:
$$ y_{g} = \alpha + \theta_{i}^{Off} - \theta_{j}^{Def} + \epsilon_{g} $$
Where:

- $\alpha$: League average efficiency (intercept).
- $\theta_{i}^{Off}$: Offensive rating for Team $i$ (above/below average).
- $\theta_{j}^{Def}$: Defensive rating for Opponent $j$ (above/below average).
- $\epsilon_{g}$: Residual noise (variance unexplained by team strength).
We construct a sparse design matrix $X$ of shape $(2 \cdot N_{games}, 1 + 2 \cdot N_{teams})$, where the leading column carries the intercept $\alpha$. Each game contributes two observations:

1. Team $i$ Offense vs Team $j$ Defense.
2. Team $j$ Offense vs Team $i$ Defense.
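A minimal sketch of this construction, using a hypothetical three-team game log and a leading intercept column so the matrix conforms with the coefficient vector $\beta = [\alpha, \theta^{Off}, \theta^{Def}]^T$ used in the solution step:

```python
import numpy as np
from scipy.sparse import lil_matrix

# hypothetical game log: (team_i, team_j, efficiency_i, efficiency_j)
games = [(0, 1, 112.0, 101.0), (1, 2, 98.0, 105.0), (2, 0, 100.0, 110.0)]
n_teams = 3

# columns: [intercept | theta_off (n_teams cols) | theta_def (n_teams cols)]
X = lil_matrix((2 * len(games), 1 + 2 * n_teams))
y = np.zeros(2 * len(games))

for g, (i, j, eff_i, eff_j) in enumerate(games):
    # observation 1: team i offense vs team j defense
    X[2 * g, 0] = 1.0                 # intercept (league average)
    X[2 * g, 1 + i] = 1.0             # +theta_i^Off
    X[2 * g, 1 + n_teams + j] = -1.0  # -theta_j^Def
    y[2 * g] = eff_i
    # observation 2: team j offense vs team i defense
    X[2 * g + 1, 0] = 1.0
    X[2 * g + 1, 1 + j] = 1.0
    X[2 * g + 1, 1 + n_teams + i] = -1.0
    y[2 * g + 1] = eff_j

X = X.tocsr()  # CSR format for fast matrix-vector products in the solver
```

Each row has exactly three non-zero entries, so the matrix stays extremely sparse even for thousands of games.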
Optimization: Ridge Regression
This system is often ill-posed or collinear (especially when the schedule graph is disconnected or the season is short). To solve it robustly, we apply Tikhonov Regularization (Ridge Regression).
We minimize the cost function $J(\theta)$:
$$ J(\theta) = \sum_{g=1}^{N} (y_g - (\alpha + \theta_{i}^{Off} - \theta_{j}^{Def}))^2 + \lambda \sum_{k} \theta_k^2 $$
Where $\lambda$ is the regularization penalty.
- OLS Term: $\sum (y - \hat{y})^2$ ensures the ratings explain the observed scores.
- Ridge Term: $\lambda \sum \theta^2$ shrinks coefficients towards zero (the league average), preventing overfitting on small samples.
Solution
In matrix notation, let $\beta = [\alpha, \theta^{Off}, \theta^{Def}]^T$. The Ridge estimator is:
$$ \hat{\beta}_{ridge} = (X^T X + \lambda I)^{-1} X^T y $$
Computationally, we solve this using scipy.sparse.linalg.lsqr, whose damp parameter implements the ridge penalty ($\text{damp} = \sqrt{\lambda}$); this is efficient for our large, sparse operator.
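The equivalence between lsqr's damping and the closed-form ridge estimator can be illustrated on a small random system (the matrix below is a generic stand-in, not our actual design matrix). lsqr minimizes $\|A\beta - y\|^2 + \text{damp}^2 \|\beta\|^2$, so setting $\text{damp} = \sqrt{\lambda}$ reproduces $(A^T A + \lambda I)^{-1} A^T y$:

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.linalg import lsqr

rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))   # small stand-in for the design matrix
y = rng.normal(size=20)
lam = 2.0                      # ridge penalty

# closed form: (A^T A + lam * I)^{-1} A^T y
beta_closed = np.linalg.solve(A.T @ A + lam * np.eye(5), A.T @ y)

# iterative sparse solve: damp = sqrt(lam) yields the same minimizer
beta_lsqr = lsqr(csr_matrix(A), y, damp=np.sqrt(lam))[0]

print(np.max(np.abs(beta_closed - beta_lsqr)))
```

One caveat worth noting: damping in lsqr (like the $\lambda I$ formula above) shrinks every coefficient, including the intercept; in practice it is common to center $y$ first so that $\alpha$ is not penalized toward zero.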
Final Calculation
Once we obtain the optimal coefficients $\hat{\theta}^{Off}$ and $\hat{\theta}^{Def}$, we reconstruct the Adjusted Efficiency for each team:
$$ \text{AdjO} = \alpha + \hat{\theta}^{Off} $$

$$ \text{AdjD} = \alpha - \hat{\theta}^{Def} $$
These values represent how a team would perform against a theoretically "average" opponent. We then feed these adjusted efficiencies back into the Pythagorean formula to generate the final Adjusted Win%:
$$ \text{AdjWin}\% = \frac{(\text{AdjO})^{\gamma}}{(\text{AdjO})^{\gamma} + (\text{AdjD})^{\gamma}} $$
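Putting the last two steps together, a sketch with hypothetical fitted ratings:

```python
def pythag_adj(adj_o: float, adj_d: float, gamma: float = 11.5) -> float:
    """Adjusted Pythagorean win% from adjusted efficiencies."""
    return adj_o**gamma / (adj_o**gamma + adj_d**gamma)

# hypothetical fitted values from the ridge solve
alpha = 104.0                    # league-average efficiency (intercept)
theta_off, theta_def = 8.0, 9.0  # one team's fitted ratings

adj_o = alpha + theta_off        # AdjO = 112.0
adj_d = alpha - theta_def        # AdjD = 95.0 (lower is better)
print(f"{pythag_adj(adj_o, adj_d):.3f}")
```

Because AdjO and AdjD are both on the per-100-possessions scale, their ratio plays the same role that $PS/PA$ plays in the raw model.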
This metric, often called "PythAdj", is a widely used baseline for predictive team ranking because it isolates team strength from schedule effects.