Data Analysis Specialised Reading: Kernel Trick
Kernel Trick
- Mathematical technique that maps data from one space to another → uses a kernel function, which takes vectors in the original space as its inputs and returns the dot product of their images in the feature space
- If we have data $\textbf{x}, \textbf{z} \in X$ and a map $\phi: X → \mathbb{R}^N$ then $k(\textbf{x}, \textbf{z}) = \left<\phi(\textbf{x}), \phi(\textbf{z})\right>$ is a kernel function
- each coordinate of the transformed vector $\phi(\textbf{x})$ is just some function of the coordinates of the corresponding lower-dimensional vector $\textbf{x}$
- Work with non-linearly separable data → map data points to a higher dimensional space where they become linearly separable
- ex: mapping data from a 2-dimensional space to a 3-dimensional space so that the classes can be linearly separated by a plane (rather than a line); a worked sketch follows this list
- Generally used in SVMs → find the hyperplane that best separates the data points of different classes
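A minimal sketch of the trick in action, assuming plain NumPy (the names `phi`, `x`, `z` are illustrative): for the degree-2 polynomial kernel $k(\textbf{x}, \textbf{z}) = (\textbf{x} \cdot \textbf{z})^2$, the explicit 2D → 3D map $\phi(x_1, x_2) = (x_1^2,\ x_2^2,\ \sqrt{2}\,x_1x_2)$ gives exactly the same value as evaluating the kernel directly in the original 2D space:

```python
import numpy as np

def phi(v):
    """Explicit degree-2 feature map for a 2-D vector (x1, x2).
    (x1^2, x2^2, sqrt(2)*x1*x2) satisfies <phi(x), phi(z)> = (x . z)^2."""
    x1, x2 = v
    return np.array([x1**2, x2**2, np.sqrt(2) * x1 * x2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 0.5])

lhs = (x @ z) ** 2     # kernel evaluated in the original 2-D space
rhs = phi(x) @ phi(z)  # dot product in the mapped 3-D space
print(lhs, rhs)        # both print 16.0
```

The point of the trick is the left-hand side: we get the 3D dot product without ever constructing $\phi(\textbf{x})$.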
How It Works
- Data Preprocessing: Data cleaning & normalisation
- Kernel Function Selection: Selecting the best kernel function
- Kernel Matrix Computation: Compute the kernel matrix that measures the similarity between the data points in the input space (see the sketch after these steps)
- Feature Space Mapping: Map the data into higher-dimensional space
- Model Training: Train model on mapped data
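One way these five steps can look in scikit-learn, sketched with toy `make_circles` data and an RBF kernel with $\gamma = 1.0$ (both illustrative choices, not prescribed by these notes); passing `kernel='precomputed'` to `SVC` makes the kernel-matrix step explicit:

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# 1. Data preprocessing: toy non-linearly separable data, then normalisation
X, y = make_circles(n_samples=200, noise=0.1, factor=0.3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# 2./3. Kernel selection + kernel matrix computation (RBF chosen here)
K_train = rbf_kernel(X_train, X_train, gamma=1.0)  # pairwise similarities
K_test = rbf_kernel(X_test, X_train, gamma=1.0)

# 4./5. Training on the kernel matrix; the mapping to the
# higher-dimensional feature space stays implicit
clf = SVC(kernel='precomputed').fit(K_train, y_train)
print(clf.score(K_test, y_test))
```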
Types of Kernel Methods in SVM
- Linear Kernel: used when the data is linearly separable $k(\textbf{x}, \textbf{z}) = \textbf{x} \cdot \textbf{z}$
- Polynomial Kernel: used when the data is not linearly separable $k(\textbf{x}, \textbf{z}) = (\textbf{x} \cdot \textbf{z}+1)^d$
- Radial-basis Function (RBF) Kernel: maps input data to an infinite-dimensional space $k(\textbf{x}, \textbf{z}) = e^{-\gamma\left\Vert \textbf{x}-\textbf{z} \right\Vert^2}$
- The Gaussian Kernel is a type of RBF kernel
- Sigmoid Kernel: transforms the input by applying the hyperbolic tangent to the dot product, much like a neural-network activation $k(\textbf{x}, \textbf{z}) = \tanh{(\alpha \textbf{x}^T\textbf{z} + \beta)}$
- Laplacian Kernel: similar to the RBF kernel but with a sharper peak (it is not smooth at $\textbf{x} = \textbf{z}$) and heavier tails, since the norm is not squared $k(\textbf{x}, \textbf{z}) = e^{-\gamma\left\Vert \textbf{x}-\textbf{z} \right\Vert_1}$
- ANOVA Radial Basis Kernel: multiple-input kernel function → can be used for feature selection $k(\textbf{x}, \textbf{z}) = \sum_{k=1}^{n}{\left(e^{-\sigma(x_k-z_k)^2}\right)^d}$
- Exponential Kernel: similar to the RBF kernel but uses the unsquared Euclidean norm, so it decays more slowly at large distances $k(\textbf{x}, \textbf{z}) = e^{-\gamma\left\Vert \textbf{x}-\textbf{z} \right\Vert_2}$
- Wavelet Kernel: non-stationary kernel function that can be used for time-series analysis $k(\textbf{x}, \textbf{z}) = \sum_{i,j}{\phi(i,j)\,\psi(x_i, z_j)}$
- Spectral Kernel: based on eigenvalues and eigenvectors of a similarity matrix $k(\textbf{x}, \textbf{z}) = \sum_{i}{\lambda_i\phi_i(\textbf{x})\phi_i(\textbf{z})}$
- Mahalanobis Kernel: takes into account the covariance structure of the data through the inverse covariance matrix $S^{-1}$ $k(\textbf{x}, \textbf{z}) = e^{-\frac{1}{2}(\textbf{x}-\textbf{z})^TS^{-1}(\textbf{x}-\textbf{z})}$ (a NumPy sketch of several of these kernels follows this list)
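A few of the formulas above translated directly into NumPy, as a non-authoritative sketch; the default hyperparameter values (`d=3`, `gamma=1.0`, `alpha=0.01`, `beta=0.0`) are arbitrary illustrations and would be tuned in practice:

```python
import numpy as np

# Direct NumPy translations of some of the kernel formulas above
def linear(x, z):
    return x @ z

def polynomial(x, z, d=3):
    return (x @ z + 1) ** d

def rbf(x, z, gamma=1.0):
    return np.exp(-gamma * np.linalg.norm(x - z) ** 2)

def laplacian(x, z, gamma=1.0):
    return np.exp(-gamma * np.linalg.norm(x - z, ord=1))  # L1 norm

def sigmoid(x, z, alpha=0.01, beta=0.0):
    return np.tanh(alpha * (x @ z) + beta)

x, z = np.array([1.0, 2.0]), np.array([2.0, 1.0])
for kern in (linear, polynomial, rbf, laplacian, sigmoid):
    print(kern.__name__, kern(x, z))
```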
Choosing the Right Kernel Function
- Understand the problem: Understand the type of data, features, and the complexity of the relationship between the features
- Choose a simple kernel function: Start with the linear kernel function as a baseline to compare with the more complex ones
- Test different functions: Play around with other kernel functions and compare their performance
- Tune the parameters: Experiment with different parameter values and choose the values that result in the best performance (a grid-search sketch follows this list)
- Use domain knowledge: Use domain knowledge about the type of data to choose an appropriate kernel
- Consider computational complexity: Take into consideration the resources that would be required for larger data sets
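One way to act on this advice with scikit-learn's `GridSearchCV`, sketched on the iris dataset with illustrative parameter grids: start from a linear baseline, add more complex kernels, and let cross-validation pick the best combination:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Linear kernel as the simple baseline, then progressively richer kernels;
# the grids below are illustrative starting points, not recommendations
param_grid = [
    {'svc__kernel': ['linear'], 'svc__C': [0.1, 1, 10]},
    {'svc__kernel': ['poly'], 'svc__degree': [2, 3], 'svc__C': [0.1, 1, 10]},
    {'svc__kernel': ['rbf'], 'svc__gamma': [0.01, 0.1, 1], 'svc__C': [0.1, 1, 10]},
]
pipe = make_pipeline(StandardScaler(), SVC())
search = GridSearchCV(pipe, param_grid, cv=5).fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

Note that grid search cost grows multiplicatively with each added parameter, which is where the computational-complexity advice above starts to bite on larger data sets.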
This post is licensed under CC BY 4.0 by the author.