- statistics: obj = fitcknn (X, Y)
- statistics: obj = fitcknn (…, name, value)
 Fit a k-Nearest Neighbor classification model.
 obj = fitcknn (X, Y) returns a k-Nearest Neighbor
 classification model, obj, with X being the predictor data,
 and Y the class labels of observations in X.
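
A minimal usage sketch of this basic form (the `pkg load` line and the `fisheriris` sample dataset are assumptions about the environment, not part of this function's interface):

```octave
% Load the statistics package and a sample dataset (assumed available).
pkg load statistics
load fisheriris                  % meas: 150x4 numeric, species: 150x1 cell
obj = fitcknn (meas, species);   % fit a kNN model with default options
```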
 
- X must be an N×P numeric matrix of input data, where rows correspond to observations and columns correspond to features or variables. X is used to train the kNN model.
- Y is an N×1 matrix or cell matrix containing the class labels of the corresponding predictor data in X. Y can contain any type of categorical data. Y must have the same number of rows as X.

 obj = fitcknn (…, name, value) returns a
 k-Nearest Neighbor classification model with additional options specified by
 Name-Value pair arguments listed below.
 
| Name | Value |
|------|-------|
| "PredictorNames" | A cell array of character vectors specifying the predictor variable names. The variable names are assumed to be in the same order as they appear in the training data X. |
| "ResponseName" | A character vector specifying the name of the response variable. |
| "ClassNames" | A cell array of character vectors specifying the names of the classes in the training data Y. |
| "BreakTies" | Tie-breaking algorithm used by predict when multiple classes have the same smallest cost. By default, ties occur when multiple classes have the same number of nearest points among the k nearest neighbors. The available options are specified by the following character arrays: |
 
|  | "smallest" | This is the default and it favors the
 class with the smallest index among the tied groups, i.e. the one that
 appears first in the training labelled data. | 
|  | "nearest" | This favors the class with the nearest
 neighbor among the tied groups, i.e. the class with the closest member point
 according to the distance metric used. | 
|  | "nearest" | This randomly picks one class among the
 tied groups. | 
 
| "BucketSize" |  | The maximum number of data points in the
 leaf node of the Kd-tree and it must be a positive integer.  By default, it
 is 50. This argument is meaningful only when the selected search method is "kdtree". | 
| "Cost" |  | A  numeric matrix containing
 misclassification cost for the corresponding instances in X where
  is the number of unique categories in Y.  If an instance is
 correctly classified into its category the cost is calculated to be 1, If
 not then 0. cost matrix can be altered use obj.cost = somecost.
 default valuecost = ones(rows(X),numel(unique(Y))). | 
| "Prior" |  | A numeric vector specifying the prior
 probabilities for each class.  The order of the elements in Priorcorresponds to the order of the classes inClassNames. | 
| "NumNeighbors" |  | A positive integer value specifying
 the number of nearest neighbors to be found in the kNN search.  By default,
 it is 1. | 
| "Exponent" |  | A positive scalar (usually an integer)
 specifying the Minkowski distance exponent.  This argument is only valid when
 the selected distance metric is "minkowski".  By default it is 2. | 
| "Scale" |  | A nonnegative numeric vector specifying the
 scale parameters for the standardized Euclidean distance.  The vector length
 must be equal to the number of columns in X.  This argument is only
 valid when the selected distance metric is "seuclidean", in which
 case each coordinate of X is scaled by the corresponding element of"scale", as is each query point in Y.  By default, the scale
 parameter is the standard deviation of each coordinate in X.  If a
 variable in X is constant, i.e. zero variance, this value is forced
 to 1 to avoid division by zero.  This is the equivalent of this variable not
 being standardized. | 
| "Cov" |  | A square matrix with the same number of columns
 as X specifying the covariance matrix for computing the mahalanobis
 distance.  This must be a positive definite matrix matching.  This argument
 is only valid when the selected distance metric is "mahalanobis". | 
| "Distance" |  | is the distance metric used by knnsearchas specified below: | 
 
|  | "euclidean" | Euclidean distance. | 
|  | "seuclidean" | standardized Euclidean distance.  Each
 coordinate difference between the rows in X and the query matrix
 Y is scaled by dividing by the corresponding element of the standard
 deviation computed from X.  To specify a different scaling, use the "Scale"name-value argument. | 
|  | "cityblock" | City block distance. | 
|  | "chebychev" | Chebychev distance (maximum coordinate
 difference). | 
|  | "minkowski" | Minkowski distance.  The default exponent
 is 2.  To specify a different exponent, use the "P"name-value
 argument. | 
|  | "mahalanobis" | Mahalanobis distance, computed using a
 positive definite covariance matrix.  To change the value of the covariance
 matrix, use the "Cov"name-value argument. | 
|  | "cosine" | Cosine distance. | 
|  | "correlation" | One minus the sample linear correlation
 between observations (treated as sequences of values). | 
|  | "spearman" | One minus the sample Spearman’s rank
 correlation between observations (treated as sequences of values). | 
|  | "hamming" | Hamming distance, which is the percentage
 of coordinates that differ. | 
|  | "jaccard" | One minus the Jaccard coefficient, which is
 the percentage of nonzero coordinates that differ. | 
|  | @distfun | Custom distance function handle.  A distance
 function of the form function D2 = distfun (XI, YI),
 where XI is a  vector containing a single observation in
 -dimensional space, YI is an  matrix containing an
 arbitrary number of observations in the same -dimensional space, and
 D2 is an  vector of distances, where(D2k)is
 the distance between observations XI and(YIk,:). | 
 
| "DistanceWeight" |  | A distance weighting function,
 specified either as a function handle, which accepts a matrix of nonnegative
 distances and returns a matrix the same size containing nonnegative distance
 weights, or one of the following values: "equal", which corresponds
 to no weighting;"inverse", which corresponds to a weight equal to
 ;"squaredinverse", which corresponds to a weight
 equal to . | 
| "IncludeTies" |  | A boolean flag to indicate if the
 returned values should contain the indices that have same distance as the
  neighbor.  When false,knnsearchchooses the
 observation with the smallest index among the observations that have the same
 distance from a query point.  Whentrue,knnsearchincludes
 all nearest neighbors whose distances are equal to the  smallest
 distance in the output arguments. To specify , use the"K"name-value pair argument. | 
| "NSMethod" |  | is the nearest neighbor search method used
 by knnsearchas specified below. | 
 
|  | "kdtree" | Creates and uses a Kd-tree to find nearest
 neighbors. "kdtree"is the default value when the number of columns
 in X is less than or equal to 10, X is not sparse, and the
 distance metric is"euclidean","cityblock","manhattan","chebychev", or"minkowski".  Otherwise,
 the default value is"exhaustive".  This argument is only valid when
 the distance metric is one of the four aforementioned metrics. | 
|  | "exhaustive" | Uses the exhaustive search algorithm by
 computing the distance values from all the points in X to each point in
 Y. | 
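
The name-value options above can be combined freely; a hedged sketch (the data X, Y and the chosen option values are illustrative placeholders, not recommendations):

```octave
% Illustrative training data: 100 observations, 4 features, two classes.
X = [randn(50, 4); randn(50, 4) + 2];
Y = [repmat({"a"}, 50, 1); repmat({"b"}, 50, 1)];

% Fit a 5-NN model with Minkowski distance (exponent 3), inverse distance
% weighting, and ties broken by the nearest tied neighbor.
obj = fitcknn (X, Y, "NumNeighbors", 5, ...
                     "Distance", "minkowski", "Exponent", 3, ...
                     "DistanceWeight", "inverse", "BreakTies", "nearest");
```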
See also: ClassificationKNN, knnsearch, rangesearch, pdist2

Source Code: fitcknn