Kernel PCA | Machine Learning

Written by- Sharif Machine Learning Tutorials

Kernel Principal Component Analysis (Kernel PCA): Principal component analysis (PCA) is a popular tool for dimensionality reduction and feature extraction, but it is a linear technique, so it works best when the structure in the data is roughly linear, for example when the classes are linearly separable. If the dataset is not linearly separable, we can apply the Kernel PCA algorithm instead. It is similar to PCA, except that it first uses a kernel function (the kernel trick) to implicitly map the data into a higher-dimensional feature space, and then extracts the principal components there in the same way as PCA.
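
The idea above can be sketched in a few lines of NumPy. This is only a minimal illustration, assuming an RBF kernel; the function name rbf_kernel_pca, the toy ring data and gamma = 1.0 are made up for this example, and it is not how scikit-learn implements KernelPCA internally.

```python
import numpy as np

def rbf_kernel_pca(X, gamma, n_components):
    """Project the rows of X onto their leading RBF kernel principal components."""
    # Pairwise squared Euclidean distances between all samples
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values

    # RBF kernel matrix: k(x, y) = exp(-gamma * ||x - y||^2)
    K = np.exp(-gamma * sq_dists)

    # Center the kernel matrix, which centers the data in the implicit feature space
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

    # Eigendecomposition of the symmetric centered kernel matrix
    eigvals, eigvecs = np.linalg.eigh(K_centered)
    idx = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    eigvals, eigvecs = eigvals[idx], eigvecs[:, idx]

    # Coordinates of the training samples along each kernel principal component
    return eigvecs * np.sqrt(np.maximum(eigvals, 0.0))

# Toy data (hypothetical): two concentric rings, which no straight line can separate
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, size=200)
radii = np.where(np.arange(200) < 100, 1.0, 3.0)
X_demo = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])

Z_demo = rbf_kernel_pca(X_demo, gamma=1.0, n_components=2)
print(Z_demo.shape)  # (200, 2)
```

Note that the data is never mapped to the higher-dimensional space explicitly; only the kernel matrix of pairwise similarities is computed, which is what "kernel trick" refers to.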

Kernel PCA in Python: In this tutorial, we are going to implement Kernel PCA together with a Logistic Regression classifier on a nonlinear dataset. For this task, we will use the "Social_Network_Ads.csv" dataset. In this dataset, the relationship between the features and the dependent variable is non-linear, so we apply Kernel PCA to extract new features before fitting the linear classifier. Let's have a glimpse of that dataset.

                                                                  

You can download the whole dataset from here.

First of all, let's import the essential libraries.

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Importing the dataset

dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, [2, 3]].values
y = dataset.iloc[:, 4].values


Splitting the dataset into the Training set and Test set

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.25, random_state = 0)


Feature Scaling

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
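
As a rough sketch of what the scaling step does, the snippet below standardizes a tiny made-up dataset by hand. The arrays X_train_demo and X_test_demo are hypothetical values, not rows from Social_Network_Ads.csv. The point it illustrates is that the mean and standard deviation come from the training set only and are then reused for the test set, which is why the code above calls fit_transform on X_train but only transform on X_test.

```python
import numpy as np

# Hypothetical toy values (two features on very different scales)
X_train_demo = np.array([[20.0, 20000.0],
                         [30.0, 40000.0],
                         [40.0, 60000.0]])
X_test_demo = np.array([[30.0, 50000.0]])

# Statistics are estimated on the training data only
mean = X_train_demo.mean(axis=0)
std = X_train_demo.std(axis=0)  # population standard deviation (ddof=0), as StandardScaler uses

# The same training statistics are applied to both sets
X_train_scaled = (X_train_demo - mean) / std
X_test_scaled = (X_test_demo - mean) / std

print(X_train_scaled.mean(axis=0))  # each column is centered around zero
print(X_train_scaled.std(axis=0))   # each column has unit standard deviation
print(X_test_scaled)
```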


Applying Kernel PCA

from sklearn.decomposition import KernelPCA
kpca = KernelPCA(n_components = 2, kernel = 'rbf')
X_train = kpca.fit_transform(X_train)
X_test = kpca.transform(X_test)

Note: Here, the n_components parameter sets the number of principal components (extracted features) we want to keep, which is two in this case, and we choose the RBF (Radial Basis Function) kernel as our kernel function.
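
To get a feel for what this transformation can buy a linear classifier, here is a small sketch on synthetic data rather than on the tutorial's dataset. It assumes scikit-learn's make_circles generator, and the value gamma = 3 is an assumption picked for this toy data (the tutorial code above leaves gamma at its default). All variable names ending in _c or _demo are illustrative.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data: a small circle inside a large one (not linearly separable)
X_c, y_c = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    X_c, y_c, test_size=0.25, random_state=0)

# Baseline: Logistic Regression on the raw coordinates
raw_clf = LogisticRegression(random_state=0).fit(Xc_train, yc_train)
acc_raw = raw_clf.score(Xc_test, yc_test)

# Same classifier on two RBF kernel principal components
kpca_demo = KernelPCA(n_components=2, kernel='rbf', gamma=3)  # gamma is an assumed value
Zc_train = kpca_demo.fit_transform(Xc_train)
Zc_test = kpca_demo.transform(Xc_test)
kpca_clf = LogisticRegression(random_state=0).fit(Zc_train, yc_train)
acc_kpca = kpca_clf.score(Zc_test, yc_test)

print('Accuracy on raw features:       ', acc_raw)
print('Accuracy on Kernel PCA features:', acc_kpca)
```

On data like this, the classifier trained on the Kernel PCA features should score clearly higher than the one trained on the raw coordinates. How large the gap is depends on the kernel and on gamma, so it is worth trying a few values on your own data.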

Fitting Logistic Regression to the Training set

from sklearn.linear_model import LogisticRegression
classifier = LogisticRegression(random_state = 0)
classifier.fit(X_train, y_train)


Predicting the Test set results

y_pred = classifier.predict(X_test)


Making the Confusion Matrix

from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)


                                               


From the above confusion matrix, we can see that the model has an accuracy of 80%.
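
For reference, accuracy is read off a confusion matrix as the correct predictions (the diagonal) divided by all predictions. The matrix below is a hypothetical example with made-up counts, not the output of the code above; it only shows the arithmetic.

```python
import numpy as np

# Hypothetical confusion matrix: rows = actual class, columns = predicted class
cm_demo = np.array([[58, 10],
                    [10, 22]])

# Correct predictions sit on the diagonal
accuracy = np.trace(cm_demo) / cm_demo.sum()
print(accuracy)  # 0.8
```

With the real cm from the previous step, the same expression np.trace(cm) / cm.sum() gives the model's test accuracy.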

Now, let's visualize both the training and test set results.

Visualising the Training set results

from matplotlib.colors import ListedColormap
X_set, y_set = X_train, y_train
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                    np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
            alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
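for i, j in enumerate(np.unique(y_set)):
    # color= avoids matplotlib's ambiguity warning when a single RGBA tuple is passed as c=
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Logistic Regression (Training set)')
# After Kernel PCA the axes are the extracted components, not Age and Estimated Salary
plt.xlabel('PC1')
plt.ylabel('PC2')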
plt.legend()
plt.show()

The graph will look like the following:

                                                                   

                                                               

Visualising the Test set results

from matplotlib.colors import ListedColormap
X_set, y_set = X_test, y_test
X1, X2 = np.meshgrid(np.arange(start = X_set[:, 0].min() - 1, stop = X_set[:, 0].max() + 1, step = 0.01),
                    np.arange(start = X_set[:, 1].min() - 1, stop = X_set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, classifier.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
            alpha = 0.75, cmap = ListedColormap(('red', 'green')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
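for i, j in enumerate(np.unique(y_set)):
    # color= avoids matplotlib's ambiguity warning when a single RGBA tuple is passed as c=
    plt.scatter(X_set[y_set == j, 0], X_set[y_set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Logistic Regression (Test set)')
# After Kernel PCA the axes are the extracted components, not Age and Estimated Salary
plt.xlabel('PC1')
plt.ylabel('PC2')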
plt.legend()
plt.show()

The graph will look like the following:

                                                                       




© aionlinecourse.com All rights reserved.