Introduction to KNN

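KNN (K-Nearest Neighbors) is one of the simplest yet most powerful predictive modeling techniques.
This algorithm is also called a lazy learning algorithm, because it does no real training and simply stores the data until a prediction is needed.
In this algorithm, the prediction for a new data point is made by looking at its neighbors.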

Building KNN model

In this case we first plot the data set. When a new instance arrives, we calculate its distance to every data point, sort the list of distances in ascending order, and choose the first K entries from the sorted list; these are the K nearest neighbors. For a classification problem we take the mode of the neighbors' class labels, and for a regression problem we take the mean of the neighbors' target values.
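The steps above can be sketched in a few lines of plain Python. This is a minimal illustration and not a tuned implementation: the function name `knn_predict`, the `task` parameter, and the toy data are all made up for this example, and Euclidean distance is assumed as the distance measure.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line distance between two points of equal dimension
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_X, train_y, query, k, task="classification"):
    # 1. Distance from the new instance to every training point
    distances = [(euclidean(x, query), y) for x, y in zip(train_X, train_y)]
    # 2. Sort in ascending order of distance
    distances.sort(key=lambda pair: pair[0])
    # 3. Keep the targets of the first k entries (the k nearest neighbors)
    nearest = [y for _, y in distances[:k]]
    # 4. Mode of the labels for classification, mean of the values for regression
    if task == "classification":
        return Counter(nearest).most_common(1)[0][0]
    return sum(nearest) / len(nearest)


# Toy data (hypothetical): two well-separated groups of points
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.5), (6.5, 8.0)]
train_labels = ["red", "red", "red", "blue", "blue", "blue"]
train_values = [10.0, 12.0, 11.0, 30.0, 34.0, 32.0]

label = knn_predict(train_X, train_labels, (2.0, 2.0), k=3)
value = knn_predict(train_X, train_values, (2.0, 2.0), k=3, task="regression")
```

With this data the three nearest neighbors of (2.0, 2.0) all come from the first group, so the classification result is their shared label and the regression result is the average of their three values.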

Determining the value of K

Elbow method :- First we choose a range for K, where K lies between 1 and n, and
n = the number of data points in the data set.
After that we fit the KNN model for every value of K in the selected range.
Then we record the error produced by the model for every value of K (ideally measured on data that was held out from training) and plot it.
The plotted graph has the values of K on the x axis and the error on the y axis. Looking at the graph, we select the value of K with the minimum error, or the value where the curve bends into an elbow.
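As a rough sketch of the procedure, the code below computes the error for every K from 1 to n on a small held-out set and picks the K with the lowest error. The helper names (`knn_classify`, `error_rate`) and the toy data are assumptions made for this example; in practice the `errors` dictionary would be handed to a plotting library to draw the curve.

```python
import math
from collections import Counter


def knn_classify(train_X, train_y, query, k):
    # Rank training points by Euclidean distance to the query, then vote
    ranked = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def error_rate(train_X, train_y, val_X, val_y, k):
    # Fraction of held-out points that are classified incorrectly
    wrong = sum(knn_classify(train_X, train_y, x, k) != y for x, y in zip(val_X, val_y))
    return wrong / len(val_X)


# Toy data (hypothetical): two clusters plus one noisy "b" point near the "a" cluster
train_X = [(1, 1), (1, 2), (2, 1), (2, 2), (5, 5), (5, 6), (6, 5), (6, 6), (2.5, 2.5)]
train_y = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]
val_X = [(2.4, 2.4), (1.5, 1.5), (5.5, 5.5)]
val_y = ["a", "a", "b"]

n = len(train_X)
# Error for every K in the range 1..n (x axis: K, y axis: error)
errors = {k: error_rate(train_X, train_y, val_X, val_y, k) for k in range(1, n + 1)}
# Smallest K that reaches the minimum error
best_k = min(errors, key=errors.get)
```

On this data K = 1 is misled by the noisy point, and K = n always predicts the majority class, so the lowest error sits between those two extremes.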

How to Calculate the Distance :-

  • Manhattan Distance
Sum of absolute differences between the two points, across all dimensions
Manhattan distance is generally not the shortest distance between two points, because it measures a path that runs along the axes.
  • Euclidean Distance
It is the straight-line (shortest) distance between the two points: the square root of the sum of squared differences across all dimensions.
  • Minkowski Distance
It generalizes the two distances above through an order parameter, usually written p (not the same as the K in KNN).
p = 1 gives the Manhattan distance.
p = 2 gives the Euclidean distance.
  • Hamming Distance
It is used when the features are categorical.
It is the number of positions at which two strings of identical length differ.
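The four distance measures can be written as short Python functions. This is a small sketch for illustration; the function names and example points are chosen here and are not taken from any particular library.

```python
import math


def manhattan(a, b):
    # Sum of absolute differences across all dimensions
    return sum(abs(x - y) for x, y in zip(a, b))


def euclidean(a, b):
    # Square root of the sum of squared differences (straight-line distance)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def minkowski(a, b, p):
    # Generalized form: p = 1 is Manhattan, p = 2 is Euclidean
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def hamming(s, t):
    # Number of positions where two equal-length sequences differ
    if len(s) != len(t):
        raise ValueError("inputs must have the same length")
    return sum(c1 != c2 for c1, c2 in zip(s, t))


a, b = (1, 2), (4, 6)
d_manhattan = manhattan(a, b)            # |1-4| + |2-6| = 7
d_euclidean = euclidean(a, b)            # sqrt(9 + 16) = 5.0
d_hamming = hamming("karolin", "kathrin")  # differs at 3 positions
```

For the same pair of points the Manhattan distance (7) is larger than the Euclidean distance (5), which matches the remark that Manhattan distance is not the shortest path.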

Issues with Distance Based Algorithms :-

  • They take the distance between points into account
  • They fail when variables have different scales
Solution: scale all the features to the same scale
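To see why scaling matters, the sketch below applies min-max scaling (one common choice; standardization is another) to a tiny made-up table of (age, income) rows and compares the nearest neighbor of the first row before and after scaling. The helper names and the numbers are assumptions for this example only.

```python
import math


def min_max_scale(rows):
    # Rescale every column to the range [0, 1]
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]


def nearest_index(rows, query_index):
    # Index of the row closest (Euclidean) to rows[query_index], excluding itself
    others = [i for i in range(len(rows)) if i != query_index]
    return min(others, key=lambda i: math.dist(rows[i], rows[query_index]))


# Hypothetical (age, income) rows: income is on a much larger scale than age
rows = [(25, 50000), (26, 50800), (60, 50500), (40, 58000)]

nearest_raw = nearest_index(rows, 0)                    # income dominates the distance
nearest_scaled = nearest_index(min_max_scale(rows), 0)  # both features contribute
```

Before scaling, the 60-year-old is reported as the nearest neighbor of the 25-year-old because the income difference swamps the age difference. After scaling, the 26-year-old with a similar income becomes the nearest neighbor.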

