PDP / ICE

Last updated: 2026-03-03. Reading time: 2 min.

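Summary

A PDP (Partial Dependence Plot) visualizes the average effect of a single feature on the model's predictions.
ICE (Individual Conditional Expectation) plots draw that effect separately for each sample, which helps reveal interactions and heterogeneity.
scikit-learn's PartialDependenceDisplay makes both easy to apply to most common models.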

This page is easier to follow if you already understand the basics of decision trees and ensemble models (Random Forest, XGBoost, and so on).

Intuition

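"If annual income rises by 1 million yen, how much does the mortgage approval rate go up?" A PDP answers this kind of question about the model. It varies only the feature of interest, averages the predictions over the observed values of all other features, and tracks how that average prediction changes. ICE skips the averaging and draws one trajectory per sample, so differences between subgroups also become visible.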
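
The procedure described above can be written out by hand. The following is a minimal sketch, not scikit-learn's implementation: `ice_curves`, `partial_dependence_curve`, and `toy_predict` are hypothetical names introduced here for illustration, and the toy model with an interaction term is an assumption chosen so that the result can be checked analytically.

```python
import numpy as np

def ice_curves(predict, X, feature_idx, grid):
    """Return an (n_samples, n_grid) array: one prediction curve per sample."""
    curves = np.empty((X.shape[0], len(grid)))
    for j, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature_idx] = value  # overwrite the feature of interest for every sample
        curves[:, j] = predict(X_mod)  # all other features keep their observed values
    return curves

def partial_dependence_curve(predict, X, feature_idx, grid):
    """The PDP is the pointwise average of the ICE curves."""
    return ice_curves(predict, X, feature_idx, grid).mean(axis=0)

# Hypothetical toy model with an interaction term: f(x) = 2*x0 + x0*x1
def toy_predict(X):
    return 2.0 * X[:, 0] + X[:, 0] * X[:, 1]

rng = np.random.default_rng(0)
X_toy = rng.normal(size=(500, 2))
grid = np.linspace(-2.0, 2.0, 5)

ice = ice_curves(toy_predict, X_toy, feature_idx=0, grid=grid)
pdp = partial_dependence_curve(toy_predict, X_toy, feature_idx=0, grid=grid)
print(ice.shape)
print(pdp)
```

For this toy model each ICE curve is a straight line with slope 2 + x1, so the curves fan out across samples, while the PDP is a single line with slope 2 + mean(x1). The fan is exactly the heterogeneity that the averaged curve cannot show.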

Detailed walkthrough

Libraries and data

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import PartialDependenceDisplay
from sklearn.model_selection import train_test_split

data = fetch_california_housing()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.2, random_state=42
)
feature_names = data.feature_names

Training the model

model = GradientBoostingRegressor(
    n_estimators=200, max_depth=4, learning_rate=0.1, random_state=42
)
model.fit(X_train, y_train)

Drawing the PDP

fig, ax = plt.subplots(1, 3, figsize=(15, 4))
PartialDependenceDisplay.from_estimator(
    model, X_train, features=["MedInc", "AveRooms", "HouseAge"],
    feature_names=feature_names, ax=ax, grid_resolution=50
)
plt.tight_layout()
plt.show()

Drawing the ICE plot

fig, ax = plt.subplots(1, 2, figsize=(12, 4))
PartialDependenceDisplay.from_estimator(
    model, X_train, features=["MedInc", "AveRooms"],
    feature_names=feature_names, ax=ax,
    kind="both",  # overlay the PDP on top of the ICE curves
    ice_lines_kw={"color": "steelblue", "alpha": 0.1, "linewidth": 0.5},
    pd_line_kw={"color": "red", "linewidth": 2},
    grid_resolution=50
)
plt.tight_layout()
plt.show()
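
To see why the individual curves matter, it can help to look at a case where the averaged curve hides the structure. The sketch below uses synthetic data (an assumption made for illustration, not the housing data above) in which the effect of feature 0 flips sign depending on feature 1. It reads the raw curves from `sklearn.inspection.partial_dependence` with `kind="both"` instead of plotting them; `X_syn`, `y_syn`, and `gbr` are names introduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

rng = np.random.default_rng(0)
X_syn = rng.uniform(-1.0, 1.0, size=(600, 2))
# Synthetic target (assumption): the effect of x0 is +x0 when x1 > 0 and -x0 otherwise
y_syn = np.where(X_syn[:, 1] > 0, 1.0, -1.0) * X_syn[:, 0]

gbr = GradientBoostingRegressor(random_state=0).fit(X_syn, y_syn)

res = partial_dependence(gbr, X_syn, features=[0], kind="both", grid_resolution=20)
pdp_curve = res["average"][0]   # shape (n_grid,)
ice = res["individual"][0]      # shape (n_samples, n_grid)

# Net change of each curve from the left end of the grid to the right end
ice_change = ice[:, -1] - ice[:, 0]
pdp_change = pdp_curve[-1] - pdp_curve[0]

pos = X_syn[:, 1] > 0
print("mean ICE change, x1 > 0 :", ice_change[pos].mean())
print("mean ICE change, x1 <= 0:", ice_change[~pos].mean())
print("PDP change              :", pdp_change)
```

In this setup the two subgroups should move in opposite directions while the PDP stays close to flat, so a flat PDP alone would not justify concluding that the feature has no effect.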

Two-feature interaction

fig, ax = plt.subplots(figsize=(6, 5))
PartialDependenceDisplay.from_estimator(
    model, X_train, features=[("MedInc", "HouseAge")],
    feature_names=feature_names, ax=ax, grid_resolution=30
)
plt.show()
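
A two-feature surface can also be inspected numerically. The sketch below is a rough heuristic, not Friedman's H-statistic and not something scikit-learn provides: it subtracts the row and column averages from the 2D partial dependence surface and reports the standard deviation of what is left, which would be near zero if the two features acted purely additively. The helper `non_additive_std` is a hypothetical name, and the synthetic Friedman #1 data is an assumption chosen because features 0 and 1 are known to interact while features 3 and 4 do not.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import partial_dependence

# Synthetic data (assumption): y depends on sin(pi * x0 * x1) plus additive terms in x2, x3, x4
X_syn, y_syn = make_friedman1(n_samples=500, n_features=5, random_state=0)
gbr = GradientBoostingRegressor(random_state=0).fit(X_syn, y_syn)

def non_additive_std(model, X, pair, grid_resolution=30):
    """Std of the part of the 2D surface that row and column effects cannot explain."""
    res = partial_dependence(
        model, X, features=list(pair), kind="average", grid_resolution=grid_resolution
    )
    surface = res["average"][0]  # shape (n_grid_a, n_grid_b)
    row_effect = surface.mean(axis=1, keepdims=True)
    col_effect = surface.mean(axis=0, keepdims=True)
    residual = surface - row_effect - col_effect + surface.mean()
    return surface.shape, residual.std()

shape_01, score_01 = non_additive_std(gbr, X_syn, (0, 1))
shape_34, score_34 = non_additive_std(gbr, X_syn, (3, 4))
print("features (0, 1):", shape_01, score_01)
print("features (3, 4):", shape_34, score_34)
```

The interacting pair should score clearly higher than the additive pair. The absolute numbers depend on the model and the grid, so they are only useful for comparing pairs within the same model.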

PDP vs ICE vs SHAP

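Method	What it visualizes	Interaction detection	Computational cost
PDP	Average marginal effect	Up to two features	Medium
ICE	Per-sample marginal effect	Differences between subgroups	Medium
SHAP	Attribution of each prediction across features	Shapley interaction values	High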

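SHAP: an interpretation method based on Shapley values
Validation curve: diagnostics along the hyperparameter axis
Learning curve: diagnostics along the data-size axis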