Visualize cumulative distributions with ECDF

An empirical cumulative distribution function (ECDF) is a simple chart that shows, for any given value, the share of samples at or below it. That makes it handy for threshold decisions.

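import seaborn as sns
import matplotlib.pyplot as plt

# Load the example "tips" dataset (fetched online on first use, then cached)
tips = sns.load_dataset("tips")

fig, ax = plt.subplots(figsize=(6, 4))
# Draw one ECDF curve per value of the "time" column (Lunch, Dinner)
sns.ecdfplot(data=tips, x="total_bill", hue="time", ax=ax)

ax.set_xlabel("Bill total ($)")
ax.set_ylabel("Cumulative share")
ax.set_title("ECDF of bill totals")
ax.grid(alpha=0.2)

fig.tight_layout()
# The target directory must already exist; savefig does not create it
fig.savefig("static/images/visualize/distribution/ecdf.svg")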

The chart makes it easy to read off thresholds, such as the share of bills at or below $30.
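The value read off the curve at a threshold can also be computed directly. The following is a minimal sketch, assuming NumPy is available; the helper name `share_at_or_below` and the `bills` values are hypothetical and chosen for illustration only, not taken from the tips dataset.

```python
import numpy as np

def share_at_or_below(values, threshold):
    """Return the ECDF value at `threshold`: the fraction of samples <= threshold."""
    x = np.asarray(values, dtype=float)
    # The mean of a boolean mask is the fraction of True entries
    return float(np.mean(x <= threshold))

# Made-up bill totals, for illustration only
bills = [12.0, 18.5, 22.0, 25.0, 27.5, 29.0, 31.0, 35.0, 42.0, 58.0]
share = share_at_or_below(bills, 30.0)
print(f"Share of bills at or below $30: {share:.0%}")
```

With real data, the same function could be applied to a column such as `tips["total_bill"]`.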

Reading tips

  • Steep segments indicate that many samples cluster in that range, while flat segments mean few or no samples fall there.
  • Statements such as “80% of customers spend less than $30” become easy to justify.
  • When comparing many series, limit the number of colors and rely on legends and line styles for clarity.
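A statement like "80% of customers spend less than $30" reads the ECDF in the other direction: start from a share and find the matching value. The sketch below shows one way to do that, assuming NumPy; the helper `ecdf_quantile` and the sample values are hypothetical, and it returns the smallest observed value whose ECDF reaches the requested share.

```python
import math
import numpy as np

def ecdf_quantile(values, share):
    """Smallest observed value v such that at least `share` of samples are <= v."""
    x = np.sort(np.asarray(values, dtype=float))
    # At least ceil(share * n) samples must lie at or below v (0-based index)
    k = max(math.ceil(share * x.size) - 1, 0)
    return float(x[k])

# Made-up bill totals, for illustration only
bills = [12.0, 18.5, 22.0, 25.0, 27.5, 29.0, 31.0, 35.0, 42.0, 58.0]
threshold = ecdf_quantile(bills, 0.8)
print(f"80% of bills are at or below ${threshold:.2f}")
```

On the plot, this corresponds to drawing a horizontal line at 0.8 and reading the x value where it first meets the curve.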