ROC-AUC | A Metric for Threshold Design and Model Comparison

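Summary

ROC-AUC is an area-based metric that measures a classifier's discriminative performance independently of any single threshold.
We draw the ROC curve and compute AUC for a logistic regression model, and compare it with a random classifier.
We also cover how to read the curve and what to watch for when using it for threshold design and cost trade-offs.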

1. Definition of the ROC Curve and AUC

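The ROC curve plots the False Positive Rate (FPR) on the horizontal axis against the True Positive Rate (TPR) on the vertical axis, and is traced by sweeping the classifier's decision threshold across the range of predicted scores (0 to 1 for probabilities). AUC (Area Under the Curve) is the area under this curve, expressed as a value between 0 and 1.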

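AUC = 1.0: the ideal case, where the two classes are perfectly separated
AUC = 0.5: equivalent to random guessing (the curve lies on the diagonal)
AUC < 0.5: the predictions may be inverted; reversing the scores (for example, using 1 - p) turns the AUC into 1 - AUC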
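
To make the definition concrete, the following is a minimal sketch on hypothetical toy data (the labels and scores are made up for illustration). It computes the area with the trapezoidal rule, checks it against the pairwise-ranking interpretation of AUC, and shows that reversing the scores gives 1 - AUC.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Hypothetical toy data: 1 = positive, 0 = negative, with made-up scores.
y_true = np.array([0, 0, 1, 0, 1, 1, 0, 1])
y_score = np.array([0.10, 0.35, 0.40, 0.45, 0.60, 0.70, 0.75, 0.90])

# Each threshold yields one (FPR, TPR) point on the ROC curve.
fpr, tpr, thresholds = roc_curve(y_true, y_score)

# Area under the curve by the trapezoidal rule.
auc_trapezoid = auc(fpr, tpr)

# Ranking view: the fraction of (positive, negative) pairs in which the
# positive example receives the higher score (ties count as one half).
pos = y_score[y_true == 1]
neg = y_score[y_true == 0]
diff = pos[:, None] - neg[None, :]
auc_pairwise = float(np.mean((diff > 0) + 0.5 * (diff == 0)))

auc_direct = roc_auc_score(y_true, y_score)

# Reversing the scores mirrors the ranking, so the AUC becomes 1 - AUC.
auc_flipped = roc_auc_score(y_true, 1.0 - y_score)

print(f"trapezoid={auc_trapezoid:.4f} pairwise={auc_pairwise:.4f} direct={auc_direct:.4f}")
print(f"flipped={auc_flipped:.4f}")
```

The three ways of computing AUC agree, which is why AUC is often read as the probability that a randomly chosen positive is scored above a randomly chosen negative.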

2. Implementation and Visualization in Python 3.13

First, check your environment and install the required libraries.

python --version        # e.g. Python 3.13.0
pip install scikit-learn matplotlib

We fit a logistic regression model to the breast cancer diagnostic dataset and plot the ROC curve and AUC. The figure is saved to static/images/eval/classification/rocauc and can be reused from generate_eval_assets.py.

from __future__ import annotations

from pathlib import Path

import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import RocCurveDisplay, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.preprocessing import StandardScaler


def build_pipeline(random_state: int = 42) -> Pipeline:
    """Build a logistic regression pipeline used to compute the ROC curve."""
    return make_pipeline(
        StandardScaler(),
        LogisticRegression(max_iter=2000, solver="lbfgs", random_state=random_state),
    )


def plot_roc_curve() -> None:
    """Compute the ROC curve and AUC, plot them, and save the figure to a file."""
    features, labels = load_breast_cancer(return_X_y=True)
    x_train, x_test, y_train, y_test = train_test_split(
        features,
        labels,
        test_size=0.3,
        stratify=labels,
        random_state=42,
    )

    pipeline = build_pipeline()
    pipeline.fit(x_train, y_train)
    probabilities = pipeline.predict_proba(x_test)[:, 1]

    auc = roc_auc_score(y_test, probabilities)
    print(f"ROC-AUC: {auc:.3f}")

    figure, axis = plt.subplots(figsize=(5, 5))
    RocCurveDisplay.from_predictions(
        y_test,
        probabilities,
        name="Logistic Regression",
        ax=axis,
    )
    axis.plot([0, 1], [0, 1], "--", color="grey", alpha=0.5, label="Random")
    axis.set_xlabel("False Positive Rate")
    axis.set_ylabel("True Positive Rate")
    axis.set_title("ROC Curve (Breast Cancer Dataset)")
    axis.legend(loc="lower right")

    figure.tight_layout()
    output_dir = Path("static/images/eval/classification/rocauc")
    output_dir.mkdir(parents=True, exist_ok=True)
    figure.savefig(output_dir / "roc_curve.png", dpi=150)
    plt.close(figure)


if __name__ == "__main__":
    plot_roc_curve()

3. What to Check When Adjusting the Threshold

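When recall matters most: in domains such as medical diagnosis or fraud detection, where a missed positive is costly, use the ROC curve to find an acceptable FPR while keeping TPR high.
Balancing precision and recall: models with a higher AUC tend to perform more stably at thresholds other than 0.5.
Comparing multiple models: a higher AUC suggests better overall discriminative ability, but when the difference is small, also consult PR-AUC and business metrics.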
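
As an illustration of the recall-first case, the sketch below picks the highest threshold that still reaches a target TPR, which is also the point with the lowest FPR among those that meet the target. The target value TARGET_TPR = 0.95 is a hypothetical requirement, not something taken from this article, and the threshold is chosen on the test split only for brevity; in practice a separate validation split would be the safer choice.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

TARGET_TPR = 0.95  # hypothetical recall requirement (assumption)

features, labels = load_breast_cancer(return_X_y=True)
x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, stratify=labels, random_state=42
)

model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=2000))
model.fit(x_train, y_train)
probabilities = model.predict_proba(x_test)[:, 1]

# Keep every threshold so that no candidate operating point is dropped.
fpr, tpr, thresholds = roc_curve(y_test, probabilities, drop_intermediate=False)

# Points are ordered by decreasing threshold, so TPR is non-decreasing.
# The first index that reaches the target has the lowest FPR among them.
idx = int(np.argmax(tpr >= TARGET_TPR))
chosen_threshold = float(thresholds[idx])
print(f"threshold={chosen_threshold:.3f} TPR={tpr[idx]:.3f} FPR={fpr[idx]:.3f}")

# Apply the chosen threshold instead of the default 0.5.
predictions = (probabilities >= chosen_threshold).astype(int)
recall = float(predictions[y_test == 1].mean())
print(f"recall at chosen threshold={recall:.3f}")
```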

Changing the threshold also shifts the balance between precision and recall, so reviewing ROC-AUC together with PR-AUC makes the decision easier.
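
One way to review the two metrics side by side is sketched below. It uses synthetic imbalanced data from make_classification (all settings are illustrative assumptions) and takes average_precision_score as the PR-AUC summary, which is a common choice but not the only way to summarize a precision-recall curve.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic data with roughly 5% positives; every setting here is illustrative.
features, labels = make_classification(
    n_samples=5000,
    n_features=20,
    n_informative=5,
    weights=[0.95, 0.05],
    random_state=0,
)
x_train, x_test, y_train, y_test = train_test_split(
    features, labels, test_size=0.3, stratify=labels, random_state=0
)

model = LogisticRegression(max_iter=2000)
model.fit(x_train, y_train)
scores = model.predict_proba(x_test)[:, 1]

roc_auc = roc_auc_score(y_test, scores)
pr_auc = average_precision_score(y_test, scores)

# A random ranking scores about 0.5 on ROC-AUC, but only about the positive
# rate on average precision, so the two baselines differ under imbalance.
prevalence = float(y_test.mean())

print(f"ROC-AUC={roc_auc:.3f} (random baseline 0.5)")
print(f"PR-AUC (average precision)={pr_auc:.3f} (random baseline ~{prevalence:.3f})")
```

Reading each score against its own baseline helps avoid over-interpreting a high ROC-AUC on a dataset with few positives.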

4. Checklist for Production Use

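Check for heavy class imbalance: ROC-AUC can look high even when precision on the rare class is poor, so check PR-AUC as well. An AUC near 0.5 reflects weak ranking at every threshold, so changing the threshold alone will not rescue the model.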
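Effect of class weights and sample weights: check whether adjusting the weights improves AUC.
Consider a dashboard: share the ROC curve so that stakeholders can adjust the threshold easily and make decisions.
Keep it reproducible in a Python 3.13 notebook: make sure the same procedure can be rerun for computation and comparison whenever the model is updated.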
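
The class-weight and reproducibility items can be checked together with a small cross-validation loop. The following is a sketch under assumptions: the fold count, the seed, and the two class_weight settings compared are illustrative choices, not recommendations from this article.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

features, labels = load_breast_cancer(return_X_y=True)

# A fixed seed keeps the folds identical across reruns and model updates.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

results = {}
for class_weight in (None, "balanced"):
    model = make_pipeline(
        StandardScaler(),
        LogisticRegression(max_iter=2000, class_weight=class_weight),
    )
    # scoring="roc_auc" evaluates each fold with ROC-AUC.
    scores = cross_val_score(model, features, labels, scoring="roc_auc", cv=cv)
    results[str(class_weight)] = (float(scores.mean()), float(scores.std()))
    print(f"class_weight={class_weight}: AUC={scores.mean():.3f} +/- {scores.std():.3f}")
```

Comparing the mean against the fold-to-fold spread gives a rough sense of whether a difference between settings is larger than the noise.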

Summary

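ROC-AUC evaluates a classifier across all thresholds; for a useful model it falls between 0.5 (random) and 1.0 (perfect).
With Python 3.13 and scikit-learn, combining RocCurveDisplay and roc_auc_score gives a concise way to visualize and evaluate a model.
Use it for threshold tuning and model comparison, and make the final call together with other metrics such as precision and recall.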