Bagging (Bootstrap Aggregating) trains many copies of the same base learner on bootstrap samples of the training data, then averages their predictions for regression or takes a majority vote for classification. Combined with decision trees, it leads directly to Random Forest.
1. Procedure #
- Create multiple bootstrap samples from the training data
- Train the same model on each sample
- Average predictions for regression or vote for classification
Bagging mainly reduces variance and makes the model more stable; the sketch below walks through the procedure by hand before the scikit-learn example.
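A minimal from-scratch sketch of the three steps, using the same California housing data as the example below; the 50 trees and the random seed are arbitrary choices for illustration:

import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

rng = np.random.default_rng(0)
trees = []
for _ in range(50):
    # Step 1: draw a bootstrap sample (rows sampled with replacement)
    idx = rng.integers(0, len(X_train), size=len(X_train))
    # Step 2: fit the same model class on each bootstrap sample
    trees.append(DecisionTreeRegressor(random_state=0).fit(X_train[idx], y_train[idx]))

# Step 3: average the per-tree predictions (a classifier would take a majority vote)
pred = np.mean([t.predict(X_test) for t in trees], axis=0)
print("manual bagging RMSE:", np.sqrt(np.mean((y_test - pred) ** 2)))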
2. Python example #
import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import BaggingRegressor
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
base = DecisionTreeRegressor(max_depth=None, random_state=0)  # unpruned tree: low bias, high variance
bagging = BaggingRegressor(
    estimator=base,      # base learner cloned for every ensemble member
    n_estimators=100,    # number of bootstrap samples / trees
    max_samples=0.8,     # fraction of rows drawn for each tree
    max_features=0.8,    # fraction of features drawn for each tree
    bootstrap=True,      # sample rows with replacement
    oob_score=True,      # required for bagging.oob_score_ below
    random_state=0,
)
bagging.fit(X_train, y_train)
pred = bagging.predict(X_test)
print("RMSE:", mean_squared_error(y_test, pred, squared=False))
print("OOB score:", bagging.oob_score_)
3. Hyperparameters #
- `n_estimators`: number of base learners. More trees are more stable but costlier.
- `max_samples`, `max_features`: fraction of samples/features drawn for each learner.
- `bootstrap`: whether to sample rows with replacement; `bootstrap_features` does the same for features.
- `oob_score`: estimate generalisation from the out-of-bag samples.
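These settings interact, so they are usually tuned together. A hedged sketch with a small grid search follows; the grid values, cv=3, and the RMSE scorer are my assumptions, not recommendations from the article:

from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import BaggingRegressor
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeRegressor

X, y = fetch_california_housing(return_X_y=True)

# Small illustrative grid over the main bagging hyperparameters
param_grid = {
    "n_estimators": [50, 100],
    "max_samples": [0.6, 0.8, 1.0],
    "max_features": [0.6, 0.8, 1.0],
}
search = GridSearchCV(
    BaggingRegressor(estimator=DecisionTreeRegressor(random_state=0), random_state=0),
    param_grid,
    cv=3,
    scoring="neg_root_mean_squared_error",
    n_jobs=-1,
)
search.fit(X, y)
print(search.best_params_, -search.best_score_)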
4. Pros and cons #
| Pros | Cons |
|---|---|
| Easy to implement and parallelise | Must keep many models in memory |
| Greatly reduces variance | Does not reduce bias; the base learners themselves must be reasonably accurate |
| OOB estimate avoids an extra validation split | Less interpretable than a single tree |
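To see the variance reduction and the parallel training concretely, here is a small comparison sketch; the unpruned single tree and n_jobs=-1 are illustrative assumptions:

import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import BaggingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# A single unpruned tree: low bias, high variance
tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
print("single tree RMSE:", np.sqrt(mean_squared_error(y_test, tree.predict(X_test))))

# Bagging the same tree: averaging cancels much of that variance;
# n_jobs=-1 trains the ensemble members on all CPU cores in parallel
bagging = BaggingRegressor(
    estimator=DecisionTreeRegressor(random_state=0),
    n_estimators=100,
    oob_score=True,
    n_jobs=-1,
    random_state=0,
).fit(X_train, y_train)
print("bagged trees RMSE:", np.sqrt(mean_squared_error(y_test, bagging.predict(X_test))))
print("OOB R^2 (no extra validation split needed):", bagging.oob_score_)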
5. Summary #
- Bagging stabilises models by resampling data and averaging predictions.
- Bagging decision trees, plus a random feature subset at each split, gives Random Forest, so the relationship is worth remembering (a comparison sketch follows this list).
- Works well at scale when training can be parallelised.
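As a hedged illustration of that relationship: a BaggingRegressor over decision trees with feature subsampling behaves much like a RandomForestRegressor, which additionally re-draws the feature subset at every split (the 0.8 feature fraction is an arbitrary choice here):

import numpy as np
from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import BaggingRegressor, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Bagged trees: features are subsampled once per tree
bagged = BaggingRegressor(
    estimator=DecisionTreeRegressor(random_state=0),
    n_estimators=100, max_features=0.8, random_state=0, n_jobs=-1,
).fit(X_train, y_train)

# Random Forest: same idea, but a fresh random feature subset is drawn at every split
forest = RandomForestRegressor(
    n_estimators=100, max_features=0.8, random_state=0, n_jobs=-1,
).fit(X_train, y_train)

for name, model in [("bagged trees", bagged), ("random forest", forest)]:
    rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
    print(name, "RMSE:", round(rmse, 3))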