---
title: "Numerical Features | Preprocessing"
weight: 3
created: 2019-03-09T08:04:12+09:00
lastmod: 2024-06-01T00:00:00+09:00
chapter: true
not_use_colab: true
not_use_twitter: true
pre: "3.1 "
header_image: "/images/bg/canada3.jpg"
---
Section 3.3
Numerical Features
Common recipes for scaling, transforming distributions, capping outliers, imputing missing values, and selecting features.
| Task | Goal | Tools |
|---|---|---|
| Standardize / Normalize | Stabilize model behavior | StandardScaler, MinMaxScaler, RobustScaler |
| Power transforms | Make data more normal‑like | PowerTransformer, Box‑Cox, Yeo‑Johnson |
| Handle outliers | Reduce undue influence | Winsorization, IQR filter, IsolationForest |
| Missing values | Keep rows, add indicators | SimpleImputer, IterativeImputer, KNNImputer + missing indicator |
| Feature selection | Reduce dimensionality | Variance threshold, PCA, model‑based |
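
To make the first three rows of the table concrete, here is a minimal sketch, assuming NumPy and scikit-learn are installed. The synthetic log-normal feature, the injected outlier, and the 1st/99th percentile caps are illustrative choices, not recommendations from this section.

```python
import numpy as np
from sklearn.preprocessing import (
    MinMaxScaler,
    PowerTransformer,
    RobustScaler,
    StandardScaler,
)

# Synthetic right-skewed feature with one extreme value (illustrative data only).
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=(500, 1))
x[0, 0] = 1_000.0  # injected outlier

# Scaling: each scaler learns its statistics in fit() and applies them in transform().
x_std = StandardScaler().fit_transform(x)     # zero mean, unit variance
x_minmax = MinMaxScaler().fit_transform(x)    # rescaled to the range [0, 1]
x_robust = RobustScaler().fit_transform(x)    # centered on the median, scaled by the IQR

# Power transform: Yeo-Johnson also accepts zero and negative values,
# whereas Box-Cox requires strictly positive input.
x_yj = PowerTransformer(method="yeo-johnson").fit_transform(x)

# Winsorization: cap values at chosen percentiles instead of dropping rows.
low, high = np.percentile(x, [1, 99])
x_capped = np.clip(x, low, high)
```

Because the single outlier dominates the mean, standard deviation, minimum, and maximum, `StandardScaler` and `MinMaxScaler` squeeze the remaining values into a narrow band, while `RobustScaler` relies on the median and interquartile range and is far less affected.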
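Guidelines: inspect each feature's distribution before choosing a transform, decide whether an outlier is a data error or a genuine signal before capping or removing it, and fit every preprocessing step inside a pipeline so that its statistics are learned from the training data only, which avoids leakage.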
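
The pipeline guideline can be sketched as follows, assuming scikit-learn and NumPy are available. The synthetic dataset, the 5% missing rate, the median imputation strategy, and the logistic regression model are all placeholders chosen for illustration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import VarianceThreshold
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic classification data with roughly 5% of entries set to NaN (illustrative only).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Every preprocessing step lives inside the pipeline, so each one is re-fit
# on the training portion of each cross-validation fold.
model = make_pipeline(
    SimpleImputer(strategy="median", add_indicator=True),  # impute and append missing-value flags
    VarianceThreshold(),                                   # drop constant columns
    StandardScaler(),                                      # standardize the remaining features
    LogisticRegression(max_iter=1000),
)

scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())
```

Fitting the imputer or scaler on the full dataset before splitting would let statistics from the validation folds influence training, which is the leakage the pipeline is meant to prevent.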