Create Naive Bayes classifier object by fitting training data
nb = NaiveBayes.fit(training, class)
nb = NaiveBayes.fit(..., 'param1',val1, 'param2',val2, ...)
nb = NaiveBayes.fit(training, class) builds a NaiveBayes classifier object nb. training is an N-by-D numeric matrix of training data. Rows of training correspond to observations; columns correspond to features. class is a grouping variable for training (see Grouped Data) taking K distinct levels. Each element of class specifies the class to which the corresponding row of training belongs. training and class must have the same number of rows.
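As a minimal sketch of the basic call, the following fits a classifier with the default settings. The synthetic data and the sizes (N = 100, D = 2, K = 2) are assumptions made up for illustration; only the NaiveBayes.fit call itself comes from the syntax above.

```matlab
% Hypothetical training data: 100 observations, 2 features, 2 classes.
training = [randn(50,2); randn(50,2) + 3];   % N-by-D numeric matrix
class    = [ones(50,1); 2*ones(50,1)];       % N-by-1 grouping variable, K = 2 levels

% training and class must have the same number of rows.
nb = NaiveBayes.fit(training, class);
```

Here class is a numeric vector, but any grouping variable type described in Grouped Data should work in its place.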
nb = NaiveBayes.fit(..., 'param1',val1, 'param2',val2, ...) specifies one or more of the following name/value pairs:
'Distribution' – a string or a 1-by-D cell vector of strings, specifying which distributions fit uses to model the data. If the value is a string, fit models all the features using one type of distribution. fit can also model different features using different types of distributions. If the value is a cell vector, its jth element specifies the distribution fit uses for the jth feature. The available types of distributions are:
| 'normal' (default) | Normal (Gaussian) distribution. |
| 'kernel' | Kernel smoothing density estimate. |
| 'mvmn' | Multivariate multinomial distribution for discrete data. fit assumes each individual feature follows a multinomial model within a class. The parameters for a feature include the probabilities of all possible values that the corresponding feature can take. |
| 'mn' | Multinomial distribution for classifying count-based data, such as data from the bag-of-tokens model. In the bag-of-tokens model, the value of the jth feature is the number of occurrences of the jth token in the observation, so it must be a non-negative integer. When 'mn' is used, fit treats each observation as multiple trials of a multinomial distribution, and each occurrence of a token as one trial. The number of categories (bins) in this multinomial model is the number of distinct tokens, i.e., the number of columns of training. |
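The two forms of the 'Distribution' value can be sketched as follows. The data are synthetic and the variable names are hypothetical; the first call passes a cell vector with one entry per feature, and the second passes a single string for count data.

```matlab
N = 120;
class = [ones(N/2,1); 2*ones(N/2,1)];        % K = 2 classes

% Three features of different kinds (made-up data).
x1 = randn(N,1) + class;                     % continuous, roughly Gaussian
x2 = rand(N,1) .* class;                     % continuous, non-Gaussian
x3 = ceil(3*rand(N,1));                      % discrete, values 1 to 3
training = [x1 x2 x3];

% Cell vector: the jth element names the distribution for the jth feature.
nbMixed = NaiveBayes.fit(training, class, ...
    'Distribution', {'normal', 'kernel', 'mvmn'});

% Single string: every feature uses the same model. Here each row of
% counts holds hypothetical token counts (non-negative integers).
counts = ceil(5*rand(N,4));
nbCounts = NaiveBayes.fit(counts, class, 'Distribution', 'mn');
```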
'Prior' – The prior probabilities for the classes, specified as one of the following:
| 'empirical' (default) | fit estimates the prior probabilities from the relative frequencies of the classes in training. |
| 'uniform' | The prior probabilities are equal for all classes. |
| vector | A numeric vector of length K specifying the prior probabilities in the class order of class. |
| structure | A structure S containing class levels and their prior probabilities. S must have two fields. |
If the prior probabilities don't sum to one, fit will normalize them.
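A short sketch of the 'Prior' options, using made-up data with K = 3 classes. The vector [2 1 1] deliberately does not sum to one; assuming normalization means dividing by the sum, it corresponds to priors of 0.5, 0.25, and 0.25.

```matlab
% Hypothetical data: 3 classes with 30 observations each, 2 features.
training = [randn(30,2); randn(30,2) + 2; randn(30,2) + 4];
class    = [ones(30,1); 2*ones(30,1); 3*ones(30,1)];

% Equal priors for all classes.
nbUniform = NaiveBayes.fit(training, class, 'Prior', 'uniform');

% Numeric vector of length K, given in the class order of class.
p = [2 1 1];
nbVector = NaiveBayes.fit(training, class, 'Prior', p);

% The normalized values these weights are assumed to correspond to.
pNormalized = p / sum(p);    % [0.5 0.25 0.25]
```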
'KSWidth' – The bandwidth of the kernel smoothing window. By default, fit selects a bandwidth automatically for each combination of feature and class, using a value that is optimal for a Gaussian distribution. You can specify the value as one of the following:
| scalar | Width for all features in all classes. |
| row vector | 1-by-D vector where the jth element is the bandwidth for the jth feature in all classes. |
| column vector | K-by-1 vector where the ith element specifies the bandwidth for all features in the ith class. K represents the number of class levels. |
| matrix | K-by-D matrix M where M(i,j) specifies the bandwidth for the jth feature in the ith class. |
| structure | A structure S containing class levels and their bandwidths. S must have two fields. |
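The scalar, row vector, and matrix forms of 'KSWidth' might be used as in the sketch below. The data and the bandwidth values are arbitrary assumptions chosen only to show the expected shapes (K = 2 classes, D = 3 features).

```matlab
% Hypothetical data: K = 2 classes, D = 3 features.
training = [randn(40,3); randn(40,3) + 2];
class    = [ones(40,1); 2*ones(40,1)];

% Scalar: one bandwidth for all features in all classes.
nb1 = NaiveBayes.fit(training, class, ...
    'Distribution', 'kernel', 'KSWidth', 0.5);

% Row vector (1-by-D): one bandwidth per feature, shared by all classes.
widthPerFeature = [0.3 0.5 0.8];
nb2 = NaiveBayes.fit(training, class, ...
    'Distribution', 'kernel', 'KSWidth', widthPerFeature);

% Matrix (K-by-D): M(i,j) is the bandwidth for feature j in class i.
M = [0.3 0.5 0.8; 0.4 0.6 1.0];
nb3 = NaiveBayes.fit(training, class, ...
    'Distribution', 'kernel', 'KSWidth', M);
```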
'KSSupport' – The regions where the density can be applied. It can be a string, a two-element vector as shown below, or a 1-by-D cell array of these values:
| 'unbounded' (default) | The density can extend over the whole real line. |
| 'positive' | The density is restricted to positive values. |
| [L,U] | A two-element vector specifying the finite lower bound L and upper bound U for the support of the density. |
'KSType' – The type of kernel smoother to use. It can be a string or a 1-by-D cell array of strings. Each string can be 'normal' (default), 'box', 'triangle', or 'epanechnikov'.
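The kernel options can be combined in one call, as in this sketch. The data are synthetic and constructed so that each feature lies inside the support requested for it; the particular supports and kernel types are illustrative choices, not recommendations.

```matlab
N = 80;
class = [ones(N/2,1); 2*ones(N/2,1)];

% Hypothetical features with different natural supports.
x1 = randn(N,1) + class;      % any real value
x2 = exp(randn(N,1));         % strictly positive
x3 = rand(N,1);               % strictly between 0 and 1
training = [x1 x2 x3];

% One support and one kernel type per feature (1-by-D cell arrays).
nb = NaiveBayes.fit(training, class, ...
    'Distribution', 'kernel', ...
    'KSSupport', {'unbounded', 'positive', [0 1]}, ...
    'KSType',    {'normal', 'epanechnikov', 'triangle'});
```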
See also: Naive Bayes Classification, Grouped Data
© 1984-2009 The MathWorks, Inc.