title: "Numerical Features | Preprocessing"
weight: 3
created: 2019-03-09T08:04:12+09:00
lastmod: 2024-06-01T00:00:00+09:00
chapter: true
not_use_colab: true
not_use_twitter: true
pre: "3.1 "
header_image: "/images/bg/canada3.jpg"

Section 3.3

Numerical Features

Common recipes for scaling, transforming distributions, capping outliers, imputing missing values, and selecting features.

| Task | Goal | Tools |
| --- | --- | --- |
| Standardize / Normalize | Stabilize model behavior | StandardScaler, MinMaxScaler, RobustScaler |
| Power transforms | Make data more normal-like | PowerTransformer, Box-Cox, Yeo-Johnson |
| Handle outliers | Reduce undue influence | Winsorization, IQR filter, IsolationForest |
| Missing values | Keep rows, add indicators | Simple/Iterative/KNN Imputer + indicator |
| Feature selection | Reduce dimensionality | Variance threshold, PCA, model-based |
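
As a rough illustration of the first three rows, the sketch below applies the three scalers, a Yeo-Johnson power transform, and a simple IQR-based cap to one toy feature. The data, the seed, the 1.5 × IQR fences, and the `skewness` helper are assumptions made for this example, not recommendations from this page.

```python
import numpy as np
from sklearn.preprocessing import (
    MinMaxScaler,
    PowerTransformer,
    RobustScaler,
    StandardScaler,
)

# Hypothetical toy feature: right-skewed values plus one extreme outlier.
rng = np.random.default_rng(0)
x = rng.lognormal(mean=0.0, sigma=1.0, size=200)
x[0] = 500.0
X = x.reshape(-1, 1)  # scikit-learn transformers expect a 2-D array

# Scaling: each scaler relies on different statistics of the feature.
X_std = StandardScaler().fit_transform(X)    # zero mean, unit variance
X_minmax = MinMaxScaler().fit_transform(X)   # rescaled to [0, 1]
X_robust = RobustScaler().fit_transform(X)   # centered on the median, scaled by the IQR

# Power transform: Yeo-Johnson also accepts zero and negative values,
# whereas Box-Cox requires strictly positive inputs.
X_power = PowerTransformer(method="yeo-johnson").fit_transform(X)

# Outlier capping: clip to the 1.5 * IQR fences (one common convention).
q1, q3 = np.percentile(x, [25, 75])
iqr = q3 - q1
x_capped = np.clip(x, q1 - 1.5 * iqr, q3 + 1.5 * iqr)


def skewness(values):
    """Sample skewness (third standardized moment); helper for this sketch."""
    v = np.asarray(values, dtype=float).ravel()
    return float(np.mean((v - v.mean()) ** 3) / v.std() ** 3)


print(f"skew before: {skewness(X):.2f}, after Yeo-Johnson: {skewness(X_power):.2f}")
print(f"max before capping: {x.max():.1f}, after: {x_capped.max():.2f}")
```

With a single extreme value, `MinMaxScaler` squeezes almost every other point toward 0, while `RobustScaler` is far less affected because the median and IQR barely move. That contrast is often the deciding factor between the two.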

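Guidelines: explore each feature's distribution before choosing a transform; decide case by case whether an outlier is an error or a genuine extreme value instead of capping or dropping it by default; and put preprocessing steps inside a pipeline so that their statistics are fit on the training data only, which avoids leaking information from validation or test data.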
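
One way to follow the pipeline guideline is sketched below, assuming scikit-learn and a small synthetic dataset. The data, the median strategy, and the logistic regression model are placeholders chosen for illustration. Because imputation, variance filtering, and scaling all live inside the pipeline, `cross_val_score` refits them on each training fold and only applies them to the held-out fold.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical synthetic data: 300 rows, 4 numerical features.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # target built before values go missing
X[:, 3] = 1.0                                   # constant column carrying no information
X[rng.random(300) < 0.1, 1] = np.nan            # roughly 10% missing values in column 1

pipe = make_pipeline(
    # Fill gaps with the median and append a 0/1 "was missing" indicator column.
    SimpleImputer(strategy="median", add_indicator=True),
    # Drop zero-variance features (here, the constant column).
    VarianceThreshold(),
    StandardScaler(),
    LogisticRegression(),
)

# Every preprocessing step is fit on the training folds only.
scores = cross_val_score(pipe, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.3f}")
```

Fitting the imputer or scaler on the full dataset before splitting would let statistics from the held-out rows influence training. Keeping those steps in the pipeline prevents that.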