seaborn.kdeplot can draw the joint density of two variables as contours or filled areas. It is especially helpful when a scatter plot becomes overcrowded.
import seaborn as sns
import matplotlib.pyplot as plt
penguins = sns.load_dataset("penguins").dropna(subset=["bill_length_mm", "bill_depth_mm"])
fig, ax = plt.subplots(figsize=(5.5, 4.5))
sns.kdeplot(
    data=penguins,
    x="bill_length_mm",
    y="bill_depth_mm",
    hue="species",
    fill=True,
    thresh=0.05,
    levels=6,
    alpha=0.6,
    ax=ax,
)
ax.set_xlabel("Bill length (mm)")
ax.set_ylabel("Bill depth (mm)")
ax.set_title("2D KDE by penguin species")
ax.grid(alpha=0.2)
fig.tight_layout()
fig.savefig("static/images/visualize/distribution/kde2d.svg")
Reading tips
- Inner contours enclose denser regions, and the color intensity gives an intuitive sense of frequency.
- Tune `thresh` to drop very low-density rings and keep the figure clean.
- For very large data sets, KDE can be expensive, so consider sampling, or raise `bw_adjust` above 1 to widen the bandwidth and smooth the contours.