Identify key drivers quickly with a Pareto chart

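When you analyze defect causes or inquiry categories, a Pareto chart is the classic way to show how much each category contributes to the total. Bars sorted from largest to smallest, combined with a cumulative-share line, show at a glance where the 80% mark falls and which few categories account for most of the volume.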

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

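# Sample data: number of inquiries per category
categories = ["Misconfiguration", "Unknown operation", "Bug", "Spec question", "Integration error", "Other"]
counts = np.array([120, 95, 70, 45, 30, 18])

# Sort in descending order so the largest contributors come first
sorted_idx = np.argsort(counts)[::-1]
counts = counts[sorted_idx]
categories = [categories[i] for i in sorted_idx]

# Cumulative share of the total, as a fraction between 0 and 1
cumulative = counts.cumsum() / counts.sum()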

fig, ax1 = plt.subplots(figsize=(6.4, 4))
ax1.bar(categories, counts, color="#38bdf8")
ax1.set_ylabel("Count")
ax1.set_title("Pareto analysis of inquiry categories")
ax1.grid(axis="y", alpha=0.2)

ax2 = ax1.twinx()
ax2.plot(categories, cumulative, color="#ef4444", marker="o")
ax2.set_ylabel("Cumulative share")
ax2.set_ylim(0, 1.05)
ax2.yaxis.set_major_formatter(FuncFormatter(lambda x, _: f"{x:.0%}"))

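# Index of the first category at which the cumulative share reaches 80%
threshold = np.argmax(cumulative >= 0.8)
ax2.axhline(0.8, color="#475569", linestyle="--", linewidth=1)
# Vertical guide just to the right of that category's bar
ax1.axvline(threshold + 0.5, color="#475569", linestyle=":", linewidth=1)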

fig.tight_layout()

plt.show()

Bars and the cumulative line reveal the 80/20 breakpoint.

Reading tips

  • Bars show counts while the line shows cumulative contribution.
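  • The 80% line marks the cutoff: the categories up to the point where the cumulative line crosses it deserve priority action.
  • If the cumulative line rises slowly, no small group of causes dominates. The causes are dispersed, and cross-cutting fixes are likely to help more than targeting individual categories.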
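
The same reading can be done numerically. The following is a minimal sketch, not part of the original chart code: the helper name vital_few and its cutoff parameter are hypothetical, and it assumes the counts are non-negative with a non-zero total. It returns the categories needed to reach the cutoff, together with the cumulative share at each one.

```python
import numpy as np


def vital_few(categories, counts, cutoff=0.8):
    """Return (category, cumulative share) pairs up to the cutoff, largest first.

    Hypothetical helper for illustration; assumes counts.sum() > 0.
    """
    counts = np.asarray(counts, dtype=float)
    order = np.argsort(counts)[::-1]  # indices from largest to smallest count
    cumulative = counts[order].cumsum() / counts.sum()
    # Position of the first category whose cumulative share reaches the cutoff
    last = int(np.argmax(cumulative >= cutoff))
    return [
        (categories[i], float(cumulative[k]))
        for k, i in enumerate(order[: last + 1])
    ]


categories = ["Misconfiguration", "Unknown operation", "Bug", "Spec question", "Integration error", "Other"]
counts = [120, 95, 70, 45, 30, 18]

for name, share in vital_few(categories, counts):
    print(f"{name:<20} {share:.1%}")
```

With the sample data, the first four categories are returned: the cumulative share is still below 80% after "Bug" and passes it at "Spec question" (about 87%). That matches the position of the vertical guide in the chart. If most of the categories come back, it is a sign that the causes are dispersed.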