Support Vector Machine Classifier With Python Full Tutorial


Support-vector machines are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis.

An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on the side of the gap on which they fall.

The objective of the support vector machine algorithm is to find a hyperplane in N-dimensional space (where N is the number of features) that distinctly classifies the data points.

Basic terms:

  • Hyperplane: The decision boundary that separates the data classes. With two features it is a line; with N features it is an (N-1)-dimensional flat surface.
  • Boundary lines: The two lines parallel to the hyperplane, one on each side, that pass through the closest points of each class. The region between them is the margin; when the classes are linearly separable, no training point lies inside it.
  • Support vectors: The data points closest to the hyperplane, which lie on the boundary lines. Their distance to the hyperplane is the smallest of all points, and they alone determine where the hyperplane sits.
[Figure: SVM illustration. Image by javapoint]

How does a linear SVM work?

An SVM finds a suitable separating line (or hyperplane) between the data of two classes. It takes the data as input and outputs a line that separates the classes, if one exists. The simple linear SVM illustrates this:

Let us suppose we have a training dataset of n points of the form

(\vec{x_1}, y_1), \dots, (\vec{x_n}, y_n)

where each \vec{x_i} is a feature vector and each y_i is either 1 or -1, indicating the class to which the point \vec{x_i} belongs. Our goal is to find the maximum-margin hyperplane that separates the points \vec{x_i} for which y_i = 1 from the points for which y_i = -1, as shown in the figure below:

[Figure: maximum-margin hyperplane of an SVM. Image from Wikipedia]

Any hyperplane can be written as the set of points \vec{x} that satisfy:

\vec{w} \cdot \vec{x} - b = 0

where \vec{w} is the normal vector to the hyperplane. The parameter that determines the offset of the hyperplane from the origin along the normal vector \vec{w} is:

\frac{b}{\|\vec{w}\|}
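To make the hyperplane equation concrete, the following minimal sketch evaluates \vec{w} \cdot \vec{x} - b for a few points. The values of w and b and the sample points are made up for illustration; they are not taken from the tutorial's dataset.

```python
import numpy as np

# Hypothetical hyperplane parameters, chosen only for illustration
w = np.array([3.0, 4.0])   # normal vector to the hyperplane
b = 10.0                   # offset parameter

def decision_value(x):
    """Return w . x - b: zero on the hyperplane, positive on one side, negative on the other."""
    return np.dot(w, x) - b

# Offset of the hyperplane from the origin along w: b / ||w||
offset = b / np.linalg.norm(w)

points = np.array([[2.0, 1.0],   # lies on the hyperplane
                   [4.0, 2.0],   # lies on the positive side
                   [0.0, 0.0]])  # the origin, on the negative side

values = np.array([decision_value(x) for x in points])
distances = values / np.linalg.norm(w)  # signed distance of each point from the hyperplane

print("offset from origin:", offset)
print("decision values:", values)
print("signed distances:", distances)
```

The sign of the decision value tells which side of the hyperplane a point is on, and dividing by the length of w turns it into a geometric distance.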

We can select two parallel hyperplanes that separate the two classes of data so that the distance between them is as large as possible. The region bounded by these two hyperplanes is called the margin, and the maximum-margin hyperplane is the hyperplane that lies halfway between them. With a normalized or standardized dataset, these hyperplanes can be described by the equations:

\vec{w} \cdot \vec{x} - b = 1
[Anything on or above this boundary is of the class with label 1]

\vec{w} \cdot \vec{x} - b = -1
[Anything on or below this boundary is of the class with label -1]
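The distance between these two boundaries is worth working out, because it explains why the optimization problem further down minimizes the magnitude of \vec{w}. The short derivation below is a sketch; \vec{x}_0 and t are helper symbols introduced here and do not appear elsewhere in the tutorial.

```latex
% Take any point \vec{x}_0 on the boundary of class -1:
\vec{w} \cdot \vec{x}_0 - b = -1

% Move a distance t from \vec{x}_0 along the unit normal \vec{w} / \|\vec{w}\|:
\vec{w} \cdot \left( \vec{x}_0 + t \, \frac{\vec{w}}{\|\vec{w}\|} \right) - b = -1 + t \, \|\vec{w}\|

% The boundary of class 1 is reached when the right-hand side equals 1:
-1 + t \, \|\vec{w}\| = 1 \quad \Longrightarrow \quad t = \frac{2}{\|\vec{w}\|}
```

So the margin is 2 / \|\vec{w}\| wide, and making the margin as wide as possible is the same as making \|\vec{w}\| as small as possible.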

Here to prevent data points from falling into the margin, for each point i, we add the following constraint:

  • if y_i = 1 :
    \vec{w} \cdot \vec{x_i} - b \geq 1
  • if y_i = -1 :
    \vec{w} \cdot \vec{x_i} - b \leq -1

These constraints state that each data point must lie on the correct side of the margin. Multiplying each inequality by its label y_i combines the two cases into one:

y_i (\vec{w} \cdot \vec{x_i} - b) \geq 1, \quad \text{for all } 1 \leq i \leq n \qquad (1)

Putting this together gives the optimization problem: minimize \|\vec{w}\| subject to constraint (1).

The vector \vec{w} and the offset b that solve this problem determine our classifier: a new point \vec{x} is assigned the label given by the sign of \vec{w} \cdot \vec{x} - b.
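As a rough numerical check of this formulation, the sketch below fits a linear SVC to six made-up, linearly separable points and reads off w and b. It assumes that a very large C is an acceptable approximation of the hard-margin problem described here, and it relies on the fact that Sklearn's decision function is w . x + intercept_, so b in the tutorial's notation is -intercept_.

```python
import numpy as np
from sklearn.svm import SVC

# Toy, linearly separable data (made up for illustration)
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],   # class -1
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])  # class +1
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin problem (an assumption of this sketch)
clf = SVC(kernel="linear", C=1e6)
clf.fit(X, y)

# Sklearn computes w . x + intercept_, so b = -intercept_ in the notation used here
w = clf.coef_[0]
b = -clf.intercept_[0]

# The constraint y_i * (w . x_i - b) >= 1 should hold for every training point
constraint_values = y * (X @ w - b)

print("w =", w, "b =", b)
print("y_i (w . x_i - b) =", np.round(constraint_values, 3))
print("support vectors:\n", clf.support_vectors_)
print("distance between the boundaries, 2/||w|| =", 2 / np.linalg.norm(w))
```

The points whose constraint value is (numerically) equal to 1 are the support vectors; all other points have a value greater than 1.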

Implementation of Linear Support Vector Classification in Python

1. Import the necessary libraries/modules

A few essential Python libraries are needed: NumPy (for mathematical calculations), pandas (for data loading and preprocessing), and some modules of Sklearn (for model development and prediction). Let's import NumPy and pandas first; the Sklearn modules are imported in the steps where they are used:

#Import necessary libraries
import numpy as np
import pandas as pd

2. Import and Inspect the dataset

After importing the necessary libraries, the pandas function read_csv() is used to load the CSV file and store it as a pandas DataFrame object. Then, to inspect the dataset, the head() method of the DataFrame is used. This dataset consists of logs that tell which users purchased or did not purchase a particular product, given other features (Id, Gender, Age, Estimated Salary):

#Import and Inspect the dataset
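The loading code is not shown at this point, so the following is only a minimal sketch of the step. The file name is unknown, so an in-memory CSV stands in for it; the rows are made up, and the exact column names are assumptions based on the features listed above.

```python
import io
import pandas as pd

# Stand-in for the real CSV file: a few made-up rows with assumed column names
csv_text = """User ID,Gender,Age,EstimatedSalary,Purchased
1,Male,19,19000,0
2,Female,35,20000,0
3,Female,47,25000,1
4,Male,32,150000,1
"""

# With a real file, pass its path to pd.read_csv instead of the StringIO object
dataset = pd.read_csv(io.StringIO(csv_text))

# Inspect the first rows, the column types and the shape
print(dataset.head())
print(dataset.dtypes)
print(dataset.shape)
```

Checking dtypes and shape right after loading is a cheap way to catch a wrong separator or a missing header before the columns are sliced by position.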

3. Separate Dependent- Independent variables

After inspecting the dataset, the independent variables (X) and the dependent variable (y) are separated using the iloc indexer for slicing. Our goal is to predict the Purchased value given Age and Estimated Salary. So the features Age and Estimated Salary (X, columns 2 and 3) are the independent variables and Purchased (y, column 4) is the dependent variable:

#Separate Dependent and Independent variables
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
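Positional slicing depends on the column order of the file. As an alternative sketch, the same columns can be selected by name; the column names and rows below are assumptions made up for illustration, not the tutorial's actual file.

```python
import numpy as np
import pandas as pd

# Made-up rows; the column names are assumptions, not taken from the tutorial's file
dataset = pd.DataFrame({
    "User ID": [1, 2, 3, 4],
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 47, 32],
    "EstimatedSalary": [19000, 20000, 25000, 150000],
    "Purchased": [0, 0, 1, 1],
})

# Positional selection, as in the tutorial
X_by_position = dataset.iloc[:, [2, 3]].values
y_by_position = dataset.iloc[:, 4].values

# Label-based selection of the same columns
X_by_name = dataset[["Age", "EstimatedSalary"]].to_numpy()
y_by_name = dataset["Purchased"].to_numpy()

print(X_by_name.shape, y_by_name.shape)
```

Both selections give the same arrays here; the label-based form keeps working if columns are later reordered or added.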


4. Split the dataset into train-test sets and Feature Scale

After separating the independent variables (X) and the dependent variable (y), these values are split into train and test sets to train and evaluate the linear model. To do this, the train_test_split function of Sklearn is used, with the test set taking 25 percent of the available data. Here X_train and y_train are the train sets and X_test and y_test are the test sets. The data is then scaled using the StandardScaler class from Sklearn, which standardizes features by removing the mean and scaling to unit variance. The scaler is fitted on the train set only and then applied to the test set, so no information from the test set leaks into training:

# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
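To see what the split and the scaler do, the sketch below repeats the same calls on synthetic data and prints the resulting sizes and statistics. The random age-like and salary-like values are a stand-in invented for this example, not the tutorial's dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the (Age, Estimated Salary) features and the labels
rng = np.random.default_rng(0)
X = np.column_stack([rng.uniform(18, 60, size=400),          # age-like values
                     rng.uniform(15000, 150000, size=400)])  # salary-like values
y = rng.integers(0, 2, size=400)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

sc = StandardScaler()
X_train_scaled = sc.fit_transform(X_train)  # learn mean and std from the train set only
X_test_scaled = sc.transform(X_test)        # reuse the train-set statistics

print("train/test sizes:", X_train.shape[0], X_test.shape[0])
print("train mean per feature:", X_train_scaled.mean(axis=0))
print("train std per feature: ", X_train_scaled.std(axis=0))
print("test mean per feature: ", X_test_scaled.mean(axis=0))
```

The scaled train set has mean 0 and standard deviation 1 per feature by construction. The test set is only close to that, because it is transformed with the train-set statistics rather than its own.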

5. Fit SVM Classifier model to the dataset

After splitting the data into train and test sets, the Support Vector Classifier model is fitted to the train sets (i.e. X_train and y_train) using the SVC class from the Sklearn library, with the kernel set to 'linear' for a linear classifier:

# Fitting linear SVM to the Training set
from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_train, y_train)

6. Predict the test results

Finally, the model is used to predict the test data, and the predictions are compared with the actual values in a confusion matrix:

# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)
[[66  2]
 [ 8 24]]
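The confusion matrix can be turned into summary metrics with a little arithmetic. The sketch below uses the four counts reported above and assumes the labels are 0 and 1, with 1 treated as the positive class. In Sklearn's convention the rows are the actual classes and the columns are the predicted classes.

```python
import numpy as np

# Confusion matrix reported above; rows are actual classes, columns are predicted classes
cm = np.array([[66, 2],
               [8, 24]])

tn, fp = cm[0]   # actual class 0: correctly predicted 0, wrongly predicted 1
fn, tp = cm[1]   # actual class 1: wrongly predicted 0, correctly predicted 1

accuracy = (tp + tn) / cm.sum()   # share of all test points classified correctly
precision = tp / (tp + fp)        # of the points predicted as class 1, how many are class 1
recall = tp / (tp + fn)           # of the actual class 1 points, how many were found

print(f"accuracy  = {accuracy:.3f}")
print(f"precision = {precision:.3f}")
print(f"recall    = {recall:.3f}")
```

The model classifies 90 of the 100 test points correctly. Most of its errors are the 8 class 1 points predicted as class 0, which is why recall is lower than precision.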

In Conclusion

In this chapter, you got familiar with the Support Vector Machine Classifier along with its implementation in Python. Now head on to the next chapter in this course on Stochastic Gradient Descent Classifier.

