PICP (Prediction Interval Coverage Probability)

Last updated 2020-06-17 Read time 2 min

Summary

PICP measures the fraction of observations that fall inside the predicted interval for a chosen nominal coverage level.
Generate upper/lower bounds, compute PICP and the corresponding interval width, and diagnose calibration.
Combine PICP with PINAW (prediction interval normalised average width) or MIS (mean interval score) when evaluating interval forecasts in demand, energy, or risk models.

1. Definition

For lower bounds $L_i$, upper bounds $U_i$, observations $y_i$, and target coverage $\gamma$:

$$ \mathrm{PICP} = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\{ L_i \le y_i \le U_i \} $$

Compare the empirical PICP with the desired $\gamma$ (e.g. 0.9) to check if intervals are calibrated.

2. Computing in Python

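import numpy as np

def picp(y_true, lower, upper) -> float:
    """Prediction interval coverage probability: share of y_true inside [lower, upper]."""
    y_true, lower, upper = (np.asarray(a, dtype=float) for a in (y_true, lower, upper))
    # Bounds are inclusive, matching the indicator in the definition.
    inside = (y_true >= lower) & (y_true <= upper)
    return float(inside.mean())

# y_test, lower_bound, and upper_bound are equal-length arrays defined elsewhere.
coverage = picp(y_test, lower_bound, upper_bound)
print(f"PICP: {coverage:.3f}")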

lower_bound and upper_bound come from a predictive interval model (e.g. quantile regression, NGBoost, conformal prediction).
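As one illustration of where such bounds might come from, the sketch below builds symmetric intervals with split conformal prediction around a simple linear fit. The synthetic data, the helper name `split_conformal_bounds`, and the 1000/500/500 split are assumptions made for the example, not part of any particular library.

```python
import numpy as np

def split_conformal_bounds(y_cal, pred_cal, pred_test, gamma=0.9):
    """Symmetric split-conformal bounds around point predictions (illustrative helper)."""
    residuals = np.sort(np.abs(np.asarray(y_cal, dtype=float) - np.asarray(pred_cal, dtype=float)))
    n = residuals.size
    # Rank of the calibration residual used as the half-width (finite-sample correction).
    k = int(np.ceil((n + 1) * gamma))
    half_width = residuals[k - 1] if k <= n else np.inf
    pred_test = np.asarray(pred_test, dtype=float)
    return pred_test - half_width, pred_test + half_width

# Synthetic data, assumed purely for illustration: a noisy linear relationship.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=2000)
y = 2.0 * x + rng.normal(0.0, 1.0, size=2000)

# Fit on the first 1000 points, calibrate on the next 500, evaluate on the last 500.
slope, intercept = np.polyfit(x[:1000], y[:1000], deg=1)
pred_cal = slope * x[1000:1500] + intercept
pred_test = slope * x[1500:] + intercept

y_test = y[1500:]
lower_bound, upper_bound = split_conformal_bounds(y[1000:1500], pred_cal, pred_test, gamma=0.9)

# Empirical coverage on the held-out points.
coverage = float(((y_test >= lower_bound) & (y_test <= upper_bound)).mean())
print(f"PICP: {coverage:.3f}")
```

The printed coverage should land near the 0.9 target, though it will vary with the random split and the sample size.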

3. Interpreting coverage

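PICP ≈ target → intervals are well calibrated (small deviations are expected from sampling noise alone).
PICP < target → intervals are too narrow; widen them, for example by using more extreme quantile levels or inflating the predicted variance.
PICP > target → intervals are too wide and may be overly conservative.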

Always inspect interval width as well; very wide intervals reach a high PICP trivially without being informative.
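How close counts as "≈ target" depends on the sample size. The rough check below compares the gap between PICP and the target with a binomial standard error. It assumes the hit/miss indicators are independent, which often fails for time series, and the function name `diagnose_coverage` and the threshold `z_crit=2.0` are choices made for this sketch.

```python
import math

def diagnose_coverage(picp_value: float, target: float, n: int, z_crit: float = 2.0) -> str:
    """Classify empirical coverage against a target using a normal approximation."""
    # Standard error of a binomial proportion if the true coverage equals the target.
    se = math.sqrt(target * (1.0 - target) / n)
    z = (picp_value - target) / se
    if z < -z_crit:
        return "under-covering"   # intervals likely too narrow
    if z > z_crit:
        return "over-covering"    # intervals likely too wide
    return "consistent with target"

print(diagnose_coverage(0.86, 0.90, n=500))  # under-covering
print(diagnose_coverage(0.89, 0.90, n=200))  # consistent with target
print(diagnose_coverage(0.97, 0.90, n=300))  # over-covering
```

With few observations even a gap of several percentage points can be noise, so the verdict is best treated as a prompt for further inspection, not a formal test result.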

4. Pairing with PINAW

Normalised average width:

$$ \mathrm{PINAW} = \frac{1}{nR} \sum_{i=1}^n (U_i - L_i) $$

where $R = \max_i y_i - \min_i y_i$ is the range of the observed targets. Together, PICP and PINAW show both coverage and tightness: for the same PICP, a lower PINAW means sharper intervals.
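The width metric can be computed in the same style as PICP. This is a minimal sketch that takes $R$ as the range of the observed targets passed in; the function name `pinaw` and the toy arrays are assumptions for illustration.

```python
import numpy as np

def pinaw(y_true, lower, upper) -> float:
    """Prediction interval normalised average width: mean width divided by the target range."""
    y_true = np.asarray(y_true, dtype=float)
    widths = np.asarray(upper, dtype=float) - np.asarray(lower, dtype=float)
    data_range = y_true.max() - y_true.min()
    if data_range == 0:
        raise ValueError("PINAW is undefined when all observations are identical.")
    return float(widths.mean() / data_range)

# Toy example: every interval has width 2 and the targets span a range of 4.
y_obs = [1.0, 2.0, 3.0, 5.0]
lo = [0.0, 1.0, 2.0, 4.0]
hi = [2.0, 3.0, 4.0, 6.0]
print(f"PINAW: {pinaw(y_obs, lo, hi):.3f}")  # PINAW: 0.500
```

Reporting PICP and PINAW side by side makes it visible when a high coverage is bought only with very wide intervals.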

5. Applications

Inventory & demand planning: ensure 90% intervals achieve desired service levels.
Energy forecasting: evaluate confidence bands for load balancing.
Financial risk: analogous to backtesting Value at Risk (VaR) at a chosen confidence.
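In each of these settings coverage is usually tracked over time instead of being computed once. A rolling-window PICP is one simple way to do that. The helper name `rolling_picp`, the window length, and the toy data below are assumptions for this sketch.

```python
import numpy as np

def rolling_picp(y_true, lower, upper, window: int) -> np.ndarray:
    """PICP over each consecutive block of `window` observations (in time order)."""
    y_true, lower, upper = (np.asarray(a, dtype=float) for a in (y_true, lower, upper))
    if not 1 <= window <= y_true.size:
        raise ValueError("window must be between 1 and the number of observations")
    inside = ((y_true >= lower) & (y_true <= upper)).astype(float)
    # A moving average of the hit indicator gives the coverage within each window.
    return np.convolve(inside, np.ones(window) / window, mode="valid")

# Toy series: only the third observation falls outside its interval.
y_seq = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
lo_seq = [0.0, 1.0, 4.0, 3.0, 4.0, 5.0]
hi_seq = [2.0, 3.0, 5.0, 5.0, 6.0, 7.0]
rolling = rolling_picp(y_seq, lo_seq, hi_seq, window=3)
print(np.round(rolling, 3))
```

A sustained drop of the rolling value below the target is a sign that the intervals need recalibrating; short windows are noisy, so the window length trades responsiveness against stability.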

Summary

PICP verifies that prediction intervals meet the intended reliability level.
Pair it with interval width metrics (PINAW) or pinball loss to balance calibration and sharpness.
Monitor PICP regularly for models that output intervals to maintain trust in their forecasts.