Support-vector machines (SVMs) are supervised learning models, with associated learning algorithms, that analyze data for classification and regression.
An SVM model is a representation of the examples as points in space, mapped so that the examples of the separate categories are divided by a clear gap that is as wide as possible. New examples are then mapped into that same space and predicted to belong to a category based on which side of the gap they fall on.
The objective of the support vector machine algorithm is to find a hyperplane in an N-dimensional space (where N is the number of features) that distinctly classifies the data points.
- Hyperplane: the decision surface that separates the data classes. In two dimensions it is simply a line.
- Boundary lines: two lines, one on each side of the hyperplane, that together define the margin. The support vectors lie on these boundary lines; all other points lie outside the margin.
- Support vectors: the data points closest to the hyperplane. Their distance from it is the minimum among all points, and they determine where the boundary lines sit.
How does a linear SVM work?
An SVM finds a suitable separating line (or hyperplane) between the data of two classes: it takes labelled data as input and outputs the line that separates those classes, if one exists. This can be explained with a simple linear SVM as follows:
Let us suppose we have a training dataset of n points of the form

(x₁, y₁), …, (xₙ, yₙ)

where each xᵢ is a feature vector and each yᵢ is either 1 or −1, indicating the class to which the point xᵢ belongs. Our concern is to find the maximum-margin hyperplane that divides the group of points for which yᵢ = 1 from the group for which yᵢ = −1, as shown in the figure below:
Any hyperplane can be written as the set of points x satisfying

w · x − b = 0

where w is the normal vector to the hyperplane, and the parameter b/||w|| determines the offset of the hyperplane from the origin along that normal vector.
We can select two parallel hyperplanes that separate the two classes of data so that the distance between them is as large as possible. The region bounded by these two hyperplanes is called the margin, and the maximum-margin hyperplane is the hyperplane that lies halfway between them. With a normalized or standardized dataset, these hyperplanes can be described by the equations:

w · x − b = 1 (everything on or above this boundary belongs to the class with label 1)

and

w · x − b = −1 (everything on or below this boundary belongs to the class with label −1)

Geometrically, the distance between these two hyperplanes is 2/||w||, so maximizing the margin amounts to minimizing ||w||.
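The distance between the two parallel boundary hyperplanes works out to 2/||w||. As a quick numeric check, here is a minimal sketch using a made-up weight vector (illustrative values only, not one learned from data):

```python
import numpy as np

# Illustrative weight vector for a 2-D linear classifier
# (made-up values, not learned from any data).
w = np.array([3.0, 4.0])

# The margin hyperplanes w·x − b = 1 and w·x − b = −1
# lie a distance of 2 / ||w|| apart.
margin_width = 2 / np.linalg.norm(w)
print(margin_width)  # 2 / 5 = 0.4
```

A larger ||w|| therefore means a narrower margin, which is why the optimization below minimizes ||w||.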
To prevent data points from falling into the margin, we add the following constraint for each point i:

- if yᵢ = 1:  w · xᵢ − b ≥ 1
- if yᵢ = −1:  w · xᵢ − b ≤ −1

These constraints state that each data point must lie on the correct side of the margin. They can also be rewritten as a single condition:

yᵢ (w · xᵢ − b) ≥ 1, for all 1 ≤ i ≤ n   (1)
We can put this together to get the optimization problem: minimize ||w|| subject to constraint (1) above. The vector w and the offset b that solve this problem determine our classifier, x ↦ sign(w · x − b).
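The pieces above can be sketched numerically. The snippet below uses a hand-picked illustrative solution (w, b) and a tiny made-up dataset (not a solution produced by any solver) to check constraint (1) and apply the resulting decision rule sign(w · x − b):

```python
import numpy as np

# Hand-picked illustrative solution (w, b) and a tiny made-up dataset.
w = np.array([1.0, 1.0])
b = 0.0
X = np.array([[2.0, 2.0],
              [-1.0, -2.0]])
y = np.array([1, -1])

# Constraint (1): every training point must satisfy y_i (w · x_i − b) >= 1.
margins = y * (X @ w - b)
print(margins)  # [4. 3.] -- both >= 1, so (w, b) is feasible

# Classifying a new point: which side of the hyperplane does it fall on?
x_new = np.array([0.5, 1.0])
print(np.sign(w @ x_new - b))  # 1.0
```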
Implementation of Linear Support Vector Classification in Python
1. Import the necessary libraries/modules
Some essential Python libraries are needed, namely NumPy (for mathematical calculations), Pandas (for data loading and preprocessing), and several modules of Sklearn (for model development and prediction). Let's import the general-purpose libraries first:
```python
# Import necessary libraries
import numpy as np
import pandas as pd
```
2. Import and Inspect the dataset
After importing the necessary libraries, the pandas function read_csv() is used to load the CSV file and store it as a pandas DataFrame object. The DataFrame's head() method is then used to inspect the first few rows. This dataset consists of logs that tell which users purchased a particular product, given other features (User ID, Gender, Age, Estimated Salary):
```python
# Import and inspect the dataset
dataset = pd.read_csv("master/Social_Network_Ads.csv")
dataset.head()
```
3. Separate Dependent and Independent variables
After inspecting the dataset, the independent variables (X) and the dependent variable (y) are separated using iloc-based slicing as shown below. Our concern is to predict the Purchased value given Estimated Salary and Age. So the features Estimated Salary and Age form the independent variables (X), and Purchased is the dependent variable (y):
```python
# Separate dependent and independent variables
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values
print("X:\n", X)
print("\ny:\n", y)
```
4. Split the dataset into train-test sets and Feature Scale
After separating the independent variables (X) and the dependent variable (y), these values are split into train and test sets so the model can be trained and then evaluated on unseen data. The train_test_split function from Sklearn is used, holding out 25 percent of the available data as the test set. Here X_train and y_train are the training sets, and X_test and y_test are the test sets. The features are also scaled using the StandardScaler class from Sklearn, which standardizes features by removing the mean and scaling to unit variance:
```python
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)

# Feature Scaling
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
```
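Note that fit_transform is called on the training set while only transform is called on the test set: the scaler's mean and variance are learned from the training data and reused for the test data, so the test set never influences the scaling. A minimal sketch of this behaviour on made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up one-feature data to show the fit_transform / transform split.
X_train_demo = np.array([[1.0], [3.0], [5.0]])  # training mean is 3.0
X_test_demo = np.array([[3.0]])

sc_demo = StandardScaler()
X_train_scaled = sc_demo.fit_transform(X_train_demo)  # learns mean and std
X_test_scaled = sc_demo.transform(X_test_demo)        # reuses those statistics

print(X_train_scaled.ravel())  # approximately [-1.225  0.  1.225]
print(X_test_scaled.ravel())   # [0.] -- 3.0 equals the training mean
```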
5. Fit the SVM Classifier model to the dataset
After splitting and scaling the data, the Support Vector Classifier model is fitted on the training sets (X_train and y_train) using the SVC class from Sklearn, specifying a linear kernel for the linear classifier:
```python
# Fitting linear SVM to the Training set
from sklearn.svm import SVC
classifier = SVC(kernel = 'linear', random_state = 0)
classifier.fit(X_train, y_train)
```
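With a linear kernel, the fitted SVC exposes the learned hyperplane through its coef_ and intercept_ attributes. Here is a self-contained sketch on a tiny made-up toy dataset (not the Social_Network_Ads data) showing what the model learns:

```python
import numpy as np
from sklearn.svm import SVC

# Tiny made-up, linearly separable toy set (not the Social_Network_Ads data).
X_toy = np.array([[0.0, 0.0], [1.0, 0.0], [3.0, 3.0], [4.0, 3.0]])
y_toy = np.array([0, 0, 1, 1])

clf = SVC(kernel='linear', random_state=0)
clf.fit(X_toy, y_toy)

# With a linear kernel, the learned hyperplane w·x + b = 0 is exposed via:
print(clf.coef_)       # w, the normal vector of the hyperplane
print(clf.intercept_)  # b, the offset term
print(clf.predict([[0.5, 0.5], [3.5, 2.5]]))  # [0 1]
```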
6. Predict the test results
Finally, the model is evaluated on the test data: the predictions are compared with the actual values and summarized in a confusion matrix:
```python
# Predicting the Test set results
y_pred = classifier.predict(X_test)

# Making the Confusion Matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)
```
OUTPUT:

```
[[66  2]
 [ 8 24]]
```
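In scikit-learn's confusion matrix the rows are the actual classes and the columns are the predicted classes, so the diagonal counts the correct predictions. Accuracy then follows directly:

```python
import numpy as np

# Confusion matrix from the output above: rows are actual classes,
# columns are predicted classes (scikit-learn's convention).
cm = np.array([[66, 2],
               [8, 24]])

# Diagonal entries are correct predictions; accuracy = correct / total.
accuracy = np.trace(cm) / cm.sum()
print(accuracy)  # (66 + 24) / 100 = 0.9
```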
In this chapter, you got familiar with the Support Vector Machine classifier and its implementation in Python. Now head on to the next chapter in this course, on the Stochastic Gradient Descent Classifier.