Summary
- Explained variance measures the proportion of variability in the target that the model captures.
- Compute the score alongside R² to highlight how noise and bias affect each metric differently.
- Understand when explained variance is informative and when it can be misleading.
1. Definition #
Let \(\mathrm{Var}(y)\) be the variance of the observations and \(\mathrm{Var}(y - \hat{y})\) the variance of the residuals. Then:
$$ \mathrm{Explained\ Variance} = 1 - \frac{\mathrm{Var}(y - \hat{y})}{\mathrm{Var}(y)} $$
- Values close to 1 indicate that most variability is captured.
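- A score of 0 matches a constant prediction such as the mean of \(y\): the residuals then vary as much as the target itself.
- Negative values mean the residuals vary more than the target, so the predictions track the fluctuations worse than a constant would.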
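The definition above can be checked by hand. The following is a minimal sketch, assuming NumPy and scikit-learn are installed; the helper name `explained_variance_manual` and the toy arrays are made up for illustration and are not part of any library.

```python
import numpy as np
from sklearn.metrics import explained_variance_score


def explained_variance_manual(y_true, y_pred):
    """Return 1 - Var(y - y_hat) / Var(y) for 1-D inputs (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residual = y_true - y_pred
    # The ratio does not depend on the ddof choice, as long as both
    # variances use the same one.
    return 1.0 - np.var(residual) / np.var(y_true)


# Toy data, invented for this example.
y_true = np.array([3.0, -0.5, 2.0, 7.0])
y_pred = np.array([2.5, 0.0, 2.0, 8.0])

ev_manual = explained_variance_manual(y_true, y_pred)
ev_sklearn = explained_variance_score(y_true, y_pred)

# A constant prediction leaves all of the target's variance in the residuals,
# so it scores 0.
constant_pred = np.full_like(y_true, y_true.mean())
ev_constant = explained_variance_manual(y_true, constant_pred)

print(f"manual: {ev_manual:.3f}, sklearn: {ev_sklearn:.3f}, constant: {ev_constant:.3f}")
```

The manual value should agree with `explained_variance_score` up to floating-point error, and the constant predictor should score 0.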
2. Computing in Python #
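```python
from sklearn.metrics import explained_variance_score

# y_test: held-out targets; y_pred: the model's predictions for them (both assumed to exist)
ev = explained_variance_score(y_test, y_pred)
print(f"Explained Variance: {ev:.3f}")
```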
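`explained_variance_score` also handles multi-output regression. By default it averages the per-output scores (`multioutput="uniform_average"`); pass `multioutput="raw_values"` to get one score per output instead.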
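As a rough illustration of the `multioutput` options, the sketch below scores a two-target problem. The arrays are toy values chosen for this example; the second target's predictions are deliberately offset by a constant.

```python
import numpy as np
from sklearn.metrics import explained_variance_score

# Two targets per sample (toy numbers, invented for illustration).
y_true = np.array([[0.5, 1.0], [-1.0, 1.0], [7.0, -6.0]])
# Second column is y_true + 1 everywhere: a pure constant offset.
y_pred = np.array([[0.0, 2.0], [-1.0, 2.0], [8.0, -5.0]])

# One score per output column.
per_output = explained_variance_score(y_true, y_pred, multioutput="raw_values")

# Default behaviour: unweighted mean of the per-output scores.
averaged = explained_variance_score(y_true, y_pred)

# Alternative: weight each output's score by the variance of that target.
weighted = explained_variance_score(y_true, y_pred, multioutput="variance_weighted")

print("per output:", per_output)
print("uniform average:", averaged)
print("variance weighted:", weighted)
```

Because the residuals of the second target are constant, its score is 1 even though every prediction is off by one unit.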
3. Explained variance vs. R² #
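- Sensitivity to bias: R² is built on the mean squared error, so a systematic offset in the predictions lowers it. Explained variance subtracts the mean residual before measuring spread, so a constant offset leaves it unchanged. When the residuals have zero mean, the two scores coincide.
- Use cases: prefer explained variance when capturing the fluctuations of the target matters more than matching its level exactly.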
- Reporting: pair the score with R² to provide both variance-explanation and overall fit insights.
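The difference in bias sensitivity can be seen on synthetic data. This is a sketch under assumed settings: the seed, sample size, noise scale, and the offset of 3.0 are arbitrary choices for illustration. It also checks the identity R² = explained variance − (mean residual)² / Var(y), which follows from splitting the mean squared error into residual variance plus squared mean residual.

```python
import numpy as np
from sklearn.metrics import explained_variance_score, r2_score

rng = np.random.default_rng(0)

# Synthetic target and a noisy but unbiased prediction (illustrative values).
y_true = rng.normal(loc=10.0, scale=2.0, size=200)
y_pred = y_true + rng.normal(scale=0.5, size=200)

# Same prediction shifted by a constant: a purely systematic error.
y_pred_biased = y_pred + 3.0

ev_plain = explained_variance_score(y_true, y_pred)
ev_biased = explained_variance_score(y_true, y_pred_biased)
r2_plain = r2_score(y_true, y_pred)
r2_biased = r2_score(y_true, y_pred_biased)

# R² differs from explained variance only by the squared mean residual,
# scaled by the variance of the target.
residual = y_true - y_pred_biased
r2_from_ev = ev_biased - residual.mean() ** 2 / np.var(y_true)

print(f"EV  plain={ev_plain:.3f}  biased={ev_biased:.3f}")
print(f"R2  plain={r2_plain:.3f}  biased={r2_biased:.3f}")
print(f"R2 reconstructed from EV: {r2_from_ev:.3f}")
```

The explained variance is identical for both predictions, while R² drops sharply once the offset is added.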
4. Practical applications #
- Risk/volatility forecasting: check whether the model captures the variability around a target level.
- Multi-output regression: compare explained variance per output to identify which targets are harder.
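- Diagnosing bias: high explained variance combined with a large MAE suggests the model tracks the fluctuations but is offset from the true level; correcting the intercept, for example by shifting predictions by the mean residual, is a natural first step.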
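One simple way to act on the bias diagnosis is to estimate the mean residual on a validation split and add it to later predictions. The sketch below assumes a hypothetical model that under-predicts by roughly 4 units; the data, the split, and the size of the offset are all invented for illustration, and real models may need a richer correction than a constant shift.

```python
import numpy as np
from sklearn.metrics import explained_variance_score, mean_absolute_error

rng = np.random.default_rng(1)

# Synthetic targets and predictions from a hypothetical model that
# follows the fluctuations but sits about 4 units too low.
y = rng.normal(50.0, 5.0, 300)
pred = y + rng.normal(0.0, 1.0, 300) - 4.0

# Estimate the offset on one half, evaluate on the other half.
y_val, pred_val = y[:150], pred[:150]
y_test, pred_test = y[150:], pred[150:]

offset = np.mean(y_val - pred_val)      # mean residual on the validation half
pred_test_corrected = pred_test + offset

ev_before = explained_variance_score(y_test, pred_test)
ev_after = explained_variance_score(y_test, pred_test_corrected)
mae_before = mean_absolute_error(y_test, pred_test)
mae_after = mean_absolute_error(y_test, pred_test_corrected)

print(f"EV  before={ev_before:.3f}  after={ev_after:.3f}")
print(f"MAE before={mae_before:.3f}  after={mae_after:.3f}")
```

A constant shift cannot change explained variance, so only the MAE moves. Estimating the offset on separate data avoids flattering the corrected score.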
5. Pairing with other metrics #
| Metric | Focus | Comment |
|---|---|---|
| R² | Overall fit | Sensitive to both bias and variance |
| Explained variance | Variance capture | Insensitive to mean offset |
| MAE / RMSE | Absolute error | Quantifies actual deviation |
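The metrics in the table can be reported side by side. The helper below is a minimal sketch: the name `regression_report` is hypothetical, and the sign convention for mean bias error (prediction minus observation) is an assumption, since conventions vary.

```python
import numpy as np
from sklearn.metrics import (
    explained_variance_score,
    mean_absolute_error,
    mean_squared_error,
    r2_score,
)


def regression_report(y_true, y_pred):
    """Collect fit, variance-capture, and error metrics in one dict (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return {
        "r2": float(r2_score(y_true, y_pred)),
        "explained_variance": float(explained_variance_score(y_true, y_pred)),
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "rmse": float(np.sqrt(mean_squared_error(y_true, y_pred))),
        # Mean bias error; positive means over-prediction under this convention.
        "mbe": float(np.mean(y_pred - y_true)),
    }


# Toy case: every prediction is exactly one unit too high.
report = regression_report([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0])
for name, value in report.items():
    print(f"{name}: {value:.3f}")
```

In this toy case explained variance is perfect while R², MAE, RMSE, and MBE all expose the one-unit offset, which is why the metrics are worth reading together.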
Summary #
- Explained variance complements R² by isolating how well the model captures variability.
- Easily computed via `explained_variance_score`, including multi-output cases.
- Always interpret it together with bias-focused metrics (MAE, MBE) to gain a balanced view.