Summary
- MAPE expresses errors as a percentage of actual values.
- Compare MAPE and sMAPE on a sales-forecast example and observe instability near zero.
- Review mitigation strategies when the series contains zeros, negatives, or very small numbers.
1. Definition
$$ \mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^n \left| \frac{y_i - \hat{y}_i}{y_i} \right| $$
- Represents the mean absolute error expressed as a percentage of the actual value.
- Lower is better.
- Is undefined when any \(y_i\) is exactly zero, and becomes unstable when any \(y_i\) is very close to zero.
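To make the instability concrete, here is a minimal sketch that applies the definition directly. The helper name `mape_percent` and the series are made up for illustration; the forecast is always off by the same 5 units, and only the last actual value shrinks toward zero.

```python
import numpy as np


def mape_percent(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """MAPE in percent, straight from the definition (no guard against zeros)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))


# Hypothetical data: every forecast misses by exactly 5 units.
offset = 5.0
for last_actual in [100.0, 10.0, 1.0, 0.1]:
    y_true = np.array([120.0, 150.0, 80.0, last_actual])
    y_pred = y_true + offset
    print(f"last actual = {last_actual:>5}: MAPE = {mape_percent(y_true, y_pred):.1f}%")
```

The absolute error never changes, yet the reported percentage grows without bound as the last actual value approaches zero, because that single term dominates the mean.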
2. Computing in Python
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error
y_true = np.array([120, 150, 80, 200])
y_pred = np.array([110, 160, 75, 210])
mape = mean_absolute_percentage_error(y_true, y_pred)
print(f"MAPE = {mape * 100:.2f}%")
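mean_absolute_percentage_error returns a fraction; multiply by 100 for percentage format. It does not raise an error when an actual value is zero: scikit-learn divides by a tiny epsilon instead, so the result can be arbitrarily large.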
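As a sanity check, the same figure can be worked out by hand from the definition, using the four pairs of values in the example above:

$$ \mathrm{MAPE} = \frac{100}{4} \left( \frac{|120 - 110|}{120} + \frac{|150 - 160|}{150} + \frac{|80 - 75|}{80} + \frac{|200 - 210|}{200} \right) = \frac{100}{4} \times 0.2625 \approx 6.56 $$

So the forecast is off by about 6.56% on average.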
3. sMAPE (symmetric MAPE)
To reduce the blow-up at zero, sMAPE includes both actual and predicted values in the denominator:
$$ \mathrm{sMAPE} = \frac{100}{n} \sum_{i=1}^n \frac{|y_i - \hat{y}_i|}{(|y_i| + |\hat{y}_i|)/2} $$
import numpy as np

def smape(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """Return sMAPE as a fraction; multiply by 100 for a percentage."""
    numerator = np.abs(y_true - y_pred)
    denominator = (np.abs(y_true) + np.abs(y_pred)) / 2
    # Clamp the denominator so a 0/0 term evaluates to 0 instead of NaN.
    return float(np.mean(numerator / np.maximum(denominator, 1e-8)))

y_true = np.array([120, 150, 80, 200])
y_pred = np.array([110, 160, 75, 210])
print(f"sMAPE = {smape(y_true, y_pred) * 100:.2f}%")
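The denominator can still be zero, but only when the actual and predicted values are both zero; the code above guards against that case by clamping the denominator to a small epsilon (1e-8).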
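The following sketch compares the two metrics on a series with one near-zero value. The sales figures and the helper names `mape_percent` and `smape_percent` are hypothetical, chosen only to illustrate the behaviour.

```python
import numpy as np


def mape_percent(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    """MAPE in percent, straight from the definition."""
    return float(100.0 * np.mean(np.abs((y_true - y_pred) / y_true)))


def smape_percent(y_true: np.ndarray, y_pred: np.ndarray, eps: float = 1e-8) -> float:
    """sMAPE in percent, with the denominator clamped to eps."""
    numerator = np.abs(y_true - y_pred)
    denominator = (np.abs(y_true) + np.abs(y_pred)) / 2
    return float(100.0 * np.mean(numerator / np.maximum(denominator, eps)))


# Hypothetical daily sales; the third day is almost a zero-demand day.
y_true = np.array([120.0, 150.0, 0.5, 200.0])
y_pred = np.array([110.0, 160.0, 5.0, 210.0])

print(f"MAPE  = {mape_percent(y_true, y_pred):.1f}%")
print(f"sMAPE = {smape_percent(y_true, y_pred):.1f}%")
```

Because \(|y_i - \hat{y}_i| \le |y_i| + |\hat{y}_i|\), each sMAPE term is at most 2, so the metric is capped at 200% no matter how small the actual value gets. MAPE has no such cap, and here the single near-zero day pushes it above 200%.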
4. Points to watch
- Zeros or negatives: MAPE is undefined when an actual value is zero and explodes near zero, and a percentage of a negative actual is hard to interpret; use sMAPE or handle zero-demand periods separately.
- Bias towards small values: percentage errors weight small-volume items more heavily. Communicate this effect to stakeholders.
- Outliers: relative metrics downplay large-volume errors, since a 5% miss on a high-volume item counts the same as a 5% miss on a low-volume one. Combine with MAE/RMSE for absolute impact.
- Interpretation: business users like “average error is ±X%,” but pair it with monetary impact where possible.
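One way to handle zero-demand periods separately is sketched below. The function name `mape_excluding_zeros`, the report keys, and the data are all hypothetical; the idea is to compute MAPE only where the actual value is non-zero, and to report the excluded periods with an absolute metric so they are not silently dropped.

```python
import numpy as np


def mape_excluding_zeros(y_true, y_pred) -> dict:
    """MAPE on non-zero actuals, plus MAE on the zero-demand periods."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    nonzero = y_true != 0
    abs_err = np.abs(y_true - y_pred)

    report = {
        "n_zero_periods": int((~nonzero).sum()),
        "mape_nonzero_pct": float("nan"),
        "mae_zero_periods": float("nan"),
        "mae_all": float(abs_err.mean()),
    }
    if nonzero.any():
        # Percentage error is only meaningful where the actual is non-zero.
        report["mape_nonzero_pct"] = float(
            100.0 * np.mean(abs_err[nonzero] / np.abs(y_true[nonzero]))
        )
    if (~nonzero).any():
        # Zero-demand periods are scored in absolute units instead.
        report["mae_zero_periods"] = float(abs_err[~nonzero].mean())
    return report


# Hypothetical demand series with two zero-demand periods.
report = mape_excluding_zeros([120, 0, 80, 200, 0], [110, 4, 75, 210, 0])
for key, value in report.items():
    print(f"{key}: {value}")
```

Reporting the number of excluded periods alongside the percentage keeps the figure honest: a low MAPE computed on a small fraction of the series means little on its own.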
5. Pairing with other metrics
- MAE / RMSE: report absolute errors alongside percentages to understand real-world loss.
- RMSLE: penalises underestimation more heavily than overestimation, which suits growth or demand forecasts.
- Pinball loss: evaluate upper/lower quantile forecasts when risk tolerances matter.
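The companion metrics above can be sketched in plain NumPy as follows. The function names and the toy data are illustrative only; scikit-learn ships equivalents that are probably the better choice in practice. The example forecasts the same series 50 units too low and 50 units too high to show which metrics treat the two cases differently.

```python
import numpy as np


def mae(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.mean(np.abs(y_true - y_pred)))


def rmse(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def rmsle(y_true: np.ndarray, y_pred: np.ndarray) -> float:
    # Assumes non-negative values; log1p keeps zeros finite.
    return float(np.sqrt(np.mean((np.log1p(y_true) - np.log1p(y_pred)) ** 2)))


def pinball_loss(y_true: np.ndarray, y_pred: np.ndarray, quantile: float) -> float:
    # Under-forecasts cost `quantile` per unit, over-forecasts cost `1 - quantile`.
    diff = y_true - y_pred
    return float(np.mean(np.maximum(quantile * diff, (quantile - 1) * diff)))


# Hypothetical data: the same series forecast 50 units too low and too high.
y_true = np.array([100.0, 100.0, 100.0, 100.0])
under = y_true - 50.0
over = y_true + 50.0

for name, y_pred in [("under", under), ("over", over)]:
    print(
        f"{name:>5}: MAE={mae(y_true, y_pred):.1f} "
        f"RMSE={rmse(y_true, y_pred):.1f} "
        f"RMSLE={rmsle(y_true, y_pred):.3f} "
        f"pinball(q=0.9)={pinball_loss(y_true, y_pred, 0.9):.1f}"
    )
```

MAE and RMSE are identical for both forecasts, while RMSLE and the 0.9-quantile pinball loss are both larger for the under-forecast.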
Summary
- MAPE delivers intuitive percentage error but needs careful handling near zero.
- sMAPE mitigates instability by normalising with the mean of the absolute actual and predicted values.
- Combine relative and absolute metrics to prioritise model improvements and communicate business impact.