2.1.10 Support Vector Regression (SVR) #
Summary #
- Support Vector Regression extends SVMs to regression, treating errors within an ε-insensitive tube as zero to reduce outlier impact.
- Kernel methods enable flexible non-linear relationships while keeping the model compact via support vectors.
- Hyperparameters `C`, `epsilon`, and `gamma` govern the balance between generalization and smoothness.
- Feature scaling is essential; wrapping preprocessing and learning in a pipeline ensures consistent transformations.
Intuition #
SVR fits a function that stays within a tube of width ε around the targets: points inside the tube incur no loss at all, while points outside contribute linearly through slack variables. The hyperparameter C controls how strongly those violations are penalized, so larger C tracks the training data more closely while smaller C favors flatter, smoother fits.
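Concretely, the behaviour described above corresponds to the ε-insensitive loss applied to each sample:

$$ L_\epsilon\bigl(y, f(\mathbf{x})\bigr) = \max\bigl(0,\; |y - f(\mathbf{x})| - \epsilon\bigr) $$

Errors smaller than ε cost nothing; beyond the tube the penalty grows linearly, which is also why individual outliers pull on the fit less than they would under a squared loss.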
Detailed Explanation #
Mathematical formulation #
The optimization problem is
$$ \min_{\mathbf{w}, b, \boldsymbol{\xi}, \boldsymbol{\xi}^*} \frac{1}{2} \|\mathbf{w}\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) $$

subject to

$$ \begin{aligned} y_i - (\mathbf{w}^\top \phi(\mathbf{x}_i) + b) &\le \epsilon + \xi_i, \\ (\mathbf{w}^\top \phi(\mathbf{x}_i) + b) - y_i &\le \epsilon + \xi_i^*, \\ \xi_i, \xi_i^* &\ge 0, \end{aligned} $$

where \(\phi\) maps inputs into a feature space via the chosen kernel. Solving the dual yields the support vectors and coefficients.
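After solving the dual, the regression function takes the standard kernel-expansion form, in which only samples with nonzero coefficients (the support vectors) contribute:

$$ f(\mathbf{x}) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) \, k(\mathbf{x}_i, \mathbf{x}) + b, \qquad 0 \le \alpha_i, \alpha_i^* \le C $$

Here \(k(\cdot,\cdot)\) is the kernel inducing \(\phi\), and the box constraint on the dual variables is where C enters the solution.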
Experiments with Python #
This example demonstrates SVR combined with StandardScaler in a pipeline.
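Since the original listing did not survive extraction, the following is a minimal sketch of such a pipeline; the synthetic sine data and the parameter values (`C=10.0`, `epsilon=0.1`) are illustrative assumptions, not the original experiment.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic 1-D regression problem (illustrative; not the original data).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Chaining the scaler and the RBF-kernel SVR ensures the test set is
# transformed with statistics learned from the training set only.
model = make_pipeline(
    StandardScaler(),
    SVR(kernel="rbf", C=10.0, epsilon=0.1, gamma="scale"),
)
model.fit(X_train, y_train)

pred = model.predict(X_test)
print(f"Test R^2: {model.score(X_test, y_test):.3f}")
```

Calling `fit` on the pipeline fits the scaler and the SVR in sequence; `predict` reuses the stored scaling parameters, so no manual bookkeeping of means and variances is needed.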
Reading the results #
- The pipeline scales the training data using its own mean and variance, then applies the same transform to the test set.
- `pred` contains predictions for the test features; tuning `epsilon` and `C` adjusts the trade-off between overfitting and underfitting.
- Increasing the RBF kernel's `gamma` focuses the model on local patterns, whereas smaller values produce smoother functions.
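These trade-offs are usually resolved empirically. One common approach, not shown in the original text, is a cross-validated grid search over the three hyperparameters; the grid values below are illustrative assumptions.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Small synthetic dataset (illustrative).
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.1, size=150)

pipe = Pipeline([("scale", StandardScaler()), ("svr", SVR(kernel="rbf"))])

# Parameter keys are prefixed with the pipeline step name ("svr__").
param_grid = {
    "svr__C": [1.0, 10.0, 100.0],
    "svr__epsilon": [0.01, 0.1, 0.5],
    "svr__gamma": [0.1, 1.0, "scale"],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)
```

Because scaling happens inside the pipeline, each cross-validation fold refits the scaler on its own training split, avoiding leakage of test-fold statistics into the model selection.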
References #
- Smola, A. J., & Schölkopf, B. (2004). A Tutorial on Support Vector Regression. Statistics and Computing, 14(3), 199–222.
- Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer.