Summary
- PICP (prediction interval coverage probability) measures the fraction of observations that fall inside the predicted interval, to be compared against the nominal coverage level.
- Generate upper/lower bounds, compute PICP and the corresponding interval width, and diagnose calibration.
- Combine PICP with PINAW or MIS when evaluating interval forecasts in demand, energy, or risk models.
1. Definition #
For lower bounds \(L_i\), upper bounds \(U_i\), observations \(y_i\), and target coverage \(\gamma\):
$$ \mathrm{PICP} = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\{ L_i \le y_i \le U_i \} $$
Compare the empirical PICP with the desired \(\gamma\) (e.g. 0.9) to check if intervals are calibrated.
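As a small illustrative check (the numbers are made up for this example), take \(n = 5\) observations \(y = (10, 12, 9, 15, 11)\) with intervals \([8, 12]\), \([11, 14]\), \([10, 13]\), \([12, 16]\), and \([9, 10]\). Only the first, second, and fourth observations fall inside their intervals, so

$$ \mathrm{PICP} = \frac{1 + 1 + 0 + 1 + 0}{5} = 0.6 $$

which would be well below a target of \(\gamma = 0.9\).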
2. Computing in Python #
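import numpy as np

def picp(y_true: np.ndarray, lower: np.ndarray, upper: np.ndarray) -> float:
    """Prediction interval coverage probability."""
    # True where the observation lies inside its interval (bounds inclusive).
    inside = (y_true >= lower) & (y_true <= upper)
    return float(inside.mean())

# y_test, lower_bound, and upper_bound are equal-length arrays defined elsewhere.
coverage = picp(y_test, lower_bound, upper_bound)
print(f"PICP: {coverage:.3f}")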
lower_bound and upper_bound come from a predictive interval model (e.g. quantile regression, NGBoost, conformal prediction).
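To try the function without a trained model, the following is a minimal sketch on synthetic data. It assumes hypothetical Gaussian forecasts: `mu`, `sigma`, and the noise model are illustrative choices, not the output of any particular library. Because the intervals are built from the true noise distribution, the empirical coverage should land close to the 90% target.

```python
import numpy as np


def picp(y_true: np.ndarray, lower: np.ndarray, upper: np.ndarray) -> float:
    """Prediction interval coverage probability."""
    inside = (y_true >= lower) & (y_true <= upper)
    return float(inside.mean())


rng = np.random.default_rng(0)
n = 10_000

# Hypothetical point forecasts and a known noise scale (assumptions for the demo).
mu = rng.normal(size=n)
sigma = 1.0
y_test = mu + rng.normal(scale=sigma, size=n)

# 95th percentile of the standard normal, giving a central 90% interval.
z = 1.6448536269514722
lower_bound = mu - z * sigma
upper_bound = mu + z * sigma

coverage = picp(y_test, lower_bound, upper_bound)
print(f"PICP: {coverage:.3f} (target 0.900)")
```

With real models the noise distribution is unknown, so the gap between PICP and the target is what the metric is meant to reveal.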
3. Interpreting coverage #
- PICP ≈ target → intervals are well calibrated.
- PICP < target → intervals too narrow; use more extreme quantile levels (e.g. 2.5%/97.5% instead of 5%/95%) or inflate the predicted variance.
- PICP > target → intervals too wide; may be overly conservative.
Always inspect interval width as well; very wide intervals can trivially reach the target coverage without being informative.
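How close counts as "≈ target" depends on the sample size. One rough way to judge it, sketched below, is to compare the gap against the binomial standard error of the coverage estimate. This assumes the hit/miss indicators are independent, which time series forecasts often violate, so the result is a heuristic rather than a formal test. The function name `coverage_z_score` is hypothetical.

```python
import math


def coverage_z_score(picp_value: float, target: float, n: int) -> float:
    """Gap between empirical and target coverage, in binomial standard errors.

    Assumes independent hit/miss indicators (a simplifying assumption).
    """
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    std_err = math.sqrt(target * (1.0 - target) / n)
    return (picp_value - target) / std_err


# Example: 86% empirical coverage against a 90% target on 400 observations.
z = coverage_z_score(0.86, 0.90, 400)
print(f"z = {z:.2f}")  # z = -2.67
```

A magnitude well above 2 suggests the shortfall is unlikely to be sampling noise alone; smaller gaps on small samples may not justify recalibration.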
4. Pairing with PINAW #
Normalised average width:
$$ \mathrm{PINAW} = \frac{1}{nR} \sum_{i=1}^n (U_i - L_i) $$
where \(R = \max_i y_i - \min_i y_i\) is the range of the observed targets. Together, PICP and PINAW show both coverage and tightness.
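The PINAW formula can be sketched in NumPy as follows. This is a minimal version that takes \(R\) as the range of the observed targets passed in; the function name `pinaw` and the example numbers are illustrative.

```python
import numpy as np


def pinaw(y_true, lower, upper) -> float:
    """Prediction interval normalised average width."""
    y_true = np.asarray(y_true, dtype=float)
    widths = np.asarray(upper, dtype=float) - np.asarray(lower, dtype=float)
    # R: range of the observed targets (assumed definition of the normaliser).
    data_range = y_true.max() - y_true.min()
    if data_range == 0:
        raise ValueError("PINAW is undefined when all observations are equal")
    return float(widths.mean() / data_range)


y = [10, 12, 9, 15, 11]
lo = [8, 11, 10, 12, 9]
hi = [12, 14, 13, 16, 10]

# Widths are 4, 3, 3, 4, 1 (mean 3.0) and the data range is 15 - 9 = 6.
width_score = pinaw(y, lo, hi)
print(f"PINAW: {width_score:.3f}")  # PINAW: 0.500
```

Reporting this value next to PICP makes it visible when a model reaches its coverage target only by widening its intervals.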
5. Applications #
- Inventory & demand planning: ensure 90% intervals achieve desired service levels.
- Energy forecasting: evaluate confidence bands for load balancing.
- Financial risk: analogous to backtesting Value at Risk (VaR) at a chosen confidence.
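In any of these settings coverage can drift over time, so it may help to track PICP over a moving window rather than as a single number. The sketch below is one simple way to do that with NumPy; `rolling_picp` and the `window` parameter are hypothetical names, and the data are made up for illustration.

```python
import numpy as np


def rolling_picp(y_true, lower, upper, window: int) -> np.ndarray:
    """PICP computed over each consecutive block of `window` observations."""
    y_true = np.asarray(y_true, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    if window < 1 or window > y_true.size:
        raise ValueError("window must be between 1 and the number of observations")
    # 1.0 where the observation is covered, 0.0 otherwise.
    inside = ((y_true >= lower) & (y_true <= upper)).astype(float)
    # A moving average of the indicators gives the windowed coverage.
    kernel = np.ones(window) / window
    return np.convolve(inside, kernel, mode="valid")


# Illustrative series: the interval [0, 2] stops covering the later observations.
y = np.array([1.0, 1.0, 5.0, 1.0, 5.0, 5.0])
lo = np.zeros(6)
hi = np.full(6, 2.0)

rolling = rolling_picp(y, lo, hi, window=3)
print(np.round(rolling, 3))
```

A sustained drop in the rolling value below the target is a signal to recalibrate or retrain; the window length trades responsiveness against noise.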
Summary #
- PICP verifies that prediction intervals meet the intended reliability level.
- Pair it with interval width metrics (PINAW) or pinball loss to balance calibration and sharpness.
- Monitor PICP regularly for models that output intervals to maintain trust in their forecasts.