12.1 Classical Kernel Method
The kernel method is the key element of a powerful classical supervised learning algorithm: the Support Vector Machine (SVM). Unlike a classifier based on a feedforward neural network, whose objective is to minimise the classification error, the SVM’s objective is to maximise the margin, defined as the distance between the separating hyperplane (the decision boundary separating samples that belong to different classes) and the training samples closest to it [264]. These closest samples are called support vectors, which give the algorithm its name.
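To make the definition of the margin concrete, the following is a minimal sketch rather than an SVM implementation. It assumes a hyperplane written as w·x + b = 0, so that the distance from a sample x to the hyperplane is |w·x + b| / ||w||. The toy samples, the candidate hyperplanes and the helper name `margin` are all illustrative assumptions. The code only compares hand-picked candidates, whereas an SVM would search over all separating hyperplanes for the one with the largest margin.

```python
import math

def margin(w, b, samples):
    """Return (margin, support_vectors) for the hyperplane w.x + b = 0.

    The margin is the smallest distance |w.x + b| / ||w|| over all samples;
    the support vectors are the samples that attain this smallest distance.
    """
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    distances = [
        abs(sum(wi * xi for wi, xi in zip(w, x)) + b) / norm_w
        for x in samples
    ]
    smallest = min(distances)
    support = [x for x, d in zip(samples, distances) if math.isclose(d, smallest)]
    return smallest, support

# Hypothetical toy data: the first two samples belong to one class,
# the last two to the other. All candidates below separate the classes.
samples = [(1.0, 1.0), (2.0, 1.0), (4.0, 3.0), (5.0, 3.0)]

# Hand-picked candidate hyperplanes (w, b); these are assumptions, not fitted.
candidates = {
    "x1 = 2.5": ((1.0, 0.0), -2.5),
    "x1 = 3": ((1.0, 0.0), -3.0),
    "x1 + x2 = 5": ((1.0, 1.0), -5.0),
}

for name, (w, b) in candidates.items():
    m, sv = margin(w, b, samples)
    print(f"{name}: margin = {m:.3f}, support vectors = {sv}")

# The candidate with the largest margin is the one an SVM would prefer
# among these three.
best = max(candidates, key=lambda name: margin(*candidates[name], samples)[0])
print("largest margin among candidates:", best)
```

All three candidates classify the toy samples correctly, so a criterion based on classification error alone cannot distinguish between them; the margin can.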
Maximising the margin lowers the generalisation error and helps to fight overfitting. This is a very important property, but finding the separating hyperplane is not an easy task for data that are not linearly separable. Fortunately, the kernel method allows us to overcome this difficulty by creating non-linear combinations of the original features and projecting...