NaiveBayes Class
Namespace: Accord.MachineLearning.Bayes
[SerializableAttribute] public class NaiveBayes : NaiveBayes<GeneralDiscreteDistribution, int>
The NaiveBayes type exposes the following members.

Constructors

Name | Description
---|---
NaiveBayes(Int32, Int32) | Constructs a new Naïve Bayes Classifier.
NaiveBayes(Int32, Double, Int32) | Obsolete.


Properties

Name | Description
---|---
ClassCount | Obsolete. Gets the number of possible output classes.
Distributions | Gets the probability distributions for each class and input.
InputCount | Obsolete. Gets the number of inputs in the model.
NumberOfClasses | Gets the number of classes expected and recognized by the classifier. (Inherited from ClassifierBase<TInput, TClasses>.)
NumberOfInputs | Gets the number of inputs accepted by the model. (Inherited from TransformBase<TInput, TOutput>.)
NumberOfOutputs | Gets the number of outputs generated by the model. (Inherited from TransformBase<TInput, TOutput>.)
NumberOfSymbols | Gets the number of symbols for each input in the model.
Priors | Gets the prior beliefs for each class. (Inherited from Bayes<TDistribution, TInput>.)
SymbolCount | Obsolete. Gets the number of symbols for each input in the model.


Methods

Name | Description
---|---
Compute(Int32) | Obsolete. Computes the most likely class for a given instance.
Compute(Int32, Double, Double) | Obsolete. Computes the most likely class for a given instance.
Decide(TInput) | Computes class-label decisions for a given set of input vectors. (Inherited from ClassifierBase<TInput, TClasses>.)
Decide(TInput) | Computes a class-label decision for a given input. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Decide(TInput, TClasses) | Computes a class-label decision for a given input. (Inherited from ClassifierBase<TInput, TClasses>.)
Decide(TInput, Boolean) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Double) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Int32) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Double) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase<TInput>.)
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Estimate | Obsolete.
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetHashCode | Serves as the default hash function. (Inherited from Object.)
GetType | Gets the Type of the current instance. (Inherited from Object.)
Load(Stream) | Obsolete. Loads a machine from a stream.
Load(String) | Obsolete. Loads a machine from a file.
Load<TDistribution>(Stream) | Obsolete. Loads a machine from a stream.
Load<TDistribution>(String) | Obsolete. Loads a machine from a file.
LogLikelihood(TInput) | Computes the log-likelihood that the given input vector belongs to its most plausible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput) | Computes the log-likelihood that the given input vectors belong to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput, Int32) | Predicts a class label vector for the given input vector, returning the log-likelihood that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput, Double) | Computes the log-likelihood that the given input vectors belong to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput, Int32) | Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput, Int32) | Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput, Int32) | Predicts a class label for each input vector, returning the log-likelihood that each vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput, Int32) | Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from Bayes<TDistribution, TInput>.)
LogLikelihood(TInput, Int32, Double) | Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput, Int32, Double) | Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihood(TInput, Int32, Double) | Predicts a class label for each input vector, returning the log-likelihood that each vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput) | Computes the log-likelihood that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput) | Computes the log-likelihood that the given input vectors belong to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput, Double) | Computes the log-likelihood that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput, Int32) | Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput, Double) | Computes the log-likelihood that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput, Int32) | Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput, Int32) | Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput, Int32, Double) | Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput, Int32, Double) | Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
LogLikelihoods(TInput, Int32, Double) | Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
Normal(Int32, Int32) | Constructs a new Naïve Bayes Classifier.
Normal(Int32, Int32, NormalDistribution) | Constructs a new Naïve Bayes Classifier.
Normal(Int32, Int32, NormalDistribution) | Constructs a new Naïve Bayes Classifier.
Normal(Int32, Int32, NormalDistribution) | Constructs a new Naïve Bayes Classifier.
Normal(Int32, Int32, Double) | Constructs a new Naïve Bayes Classifier.
Normal(Int32, Int32, NormalDistribution, Double) | Constructs a new Naïve Bayes Classifier.
Probabilities(TInput) | Computes the probabilities that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probabilities(TInput) | Computes the probabilities that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probabilities(TInput, Double) | Computes the probabilities that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probabilities(TInput, Int32) | Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probabilities(TInput, Double) | Computes the probabilities that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probabilities(TInput, Int32) | Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probabilities(TInput, Int32, Double) | Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probabilities(TInput, Int32, Double) | Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput) | Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput) | Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Int32) | Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Int32) | Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Double) | Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Int32) | Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Int32) | Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Int32) | Predicts a class label for each input vector, returning the probability that each vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Int32, Double) | Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Int32, Double) | Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Probability(TInput, Int32, Double) | Predicts a class label for each input vector, returning the probability that each vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Save(Stream) | Obsolete. Saves the Naïve Bayes model to a stream.
Save(String) | Obsolete. Saves the Naïve Bayes model to a file.
Score(TInput) | Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier). (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput) | Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier). (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput, Int32) | Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Score(TInput, Int32) | Predicts a class label for the input vector, returning a numerical score measuring the strength of association of the input vector to its most strongly related class. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput, Double) | Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier). (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput, Int32) | Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput, Int32) | Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput, Int32) | Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput, Int32, Double) | Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput, Int32, Double) | Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Score(TInput, Int32, Double) | Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Scores(TInput) | Computes a numerical score measuring the association between the given input vector and each class. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Scores(TInput) | Computes a numerical score measuring the association between the given input vector and each class. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Scores(TInput, Double) | Computes a numerical score measuring the association between the given input vector and each class. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Scores(TInput, Int32) | Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Scores(TInput, Double) | Computes a numerical score measuring the association between the given input vector and each class. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Scores(TInput, Int32) | Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Scores(TInput, Int32, Double) | Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. (Inherited from MulticlassScoreClassifierBase<TInput>.)
Scores(TInput, Int32, Double) | Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. (Inherited from MulticlassScoreClassifierBase<TInput>.)
ToMulticlass | Views this instance as a multi-class generative classifier. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
ToMultilabel | Views this instance as a multi-label generative classifier, giving access to more advanced methods, such as the prediction of one-hot vectors. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
ToString | Returns a string that represents the current object. (Inherited from Object.)
Transform(TInput) | Applies the transformation to an input, producing an associated output. (Inherited from ClassifierBase<TInput, TClasses>.)
Transform(TInput) | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from TransformBase<TInput, TOutput>.)
Transform(TInput, TClasses) | Applies the transformation to an input, producing an associated output. (Inherited from ClassifierBase<TInput, TClasses>.)
Transform(TInput, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassLikelihoodClassifierBase<TInput>.)


Extension Methods

Name | Description
---|---
HasMethod | Checks whether an object implements a method with the given name. (Defined by ExtensionMethods.)
IsEqual | Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by Matrix.)
To(Type) | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
To<T> | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)

A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions. A more descriptive term for the underlying probability model would be "independent feature model".
In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature, given the class variable. In spite of their naive design and apparently over-simplified assumptions, naive Bayes classifiers have worked quite well in many complex real-world situations.
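Written out, the independence assumption leads to the usual decision rule sketched below. The notation (a class c and feature values x_1, ..., x_n) is introduced here for illustration only and does not correspond to identifiers in the framework.

```latex
\hat{c} \;=\; \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
P(c \mid x_1, \dots, x_n) \;=\;
\frac{P(c) \prod_{i=1}^{n} P(x_i \mid c)}
     {\sum_{c'} P(c') \prod_{i=1}^{n} P(x_i \mid c')}
```

The first expression picks the most probable class; the second normalizes the same products so that they sum to one across classes.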
This class implements a discrete (integer-valued) Naive Bayes classifier. There is also a special named constructor, Normal, for creating classifiers that assume a normal distribution for each variable. For classifiers based on arbitrary distributions, please see NaiveBayes<TDistribution>.
In this example, we will be using the famous Play Tennis example by Tom Mitchell (1998). In Mitchell's example, one would like to infer whether a person will play tennis based solely on four input variables. Those variables are all categorical, meaning that there is no order among their possible values (for instance, there is no order relationship between Sunny and Rain: neither is bigger or smaller than the other, they are simply distinct). Each row, or instance, presented below represents a day on which the person's behavior was recorded and annotated; together the rows form our set of observations for learning:
    DataTable data = new DataTable("Mitchell's Tennis Example");

    data.Columns.Add("Day", "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis");

    data.Rows.Add("D1", "Sunny", "Hot", "High", "Weak", "No");
    data.Rows.Add("D2", "Sunny", "Hot", "High", "Strong", "No");
    data.Rows.Add("D3", "Overcast", "Hot", "High", "Weak", "Yes");
    data.Rows.Add("D4", "Rain", "Mild", "High", "Weak", "Yes");
    data.Rows.Add("D5", "Rain", "Cool", "Normal", "Weak", "Yes");
    data.Rows.Add("D6", "Rain", "Cool", "Normal", "Strong", "No");
    data.Rows.Add("D7", "Overcast", "Cool", "Normal", "Strong", "Yes");
    data.Rows.Add("D8", "Sunny", "Mild", "High", "Weak", "No");
    data.Rows.Add("D9", "Sunny", "Cool", "Normal", "Weak", "Yes");
    data.Rows.Add("D10", "Rain", "Mild", "Normal", "Weak", "Yes");
    data.Rows.Add("D11", "Sunny", "Mild", "Normal", "Strong", "Yes");
    data.Rows.Add("D12", "Overcast", "Mild", "High", "Strong", "Yes");
    data.Rows.Add("D13", "Overcast", "Hot", "Normal", "Weak", "Yes");
    data.Rows.Add("D14", "Rain", "Mild", "High", "Strong", "No");
Note: The DataTable representation is not required; the NaiveBayes classifier can also be trained directly on integer arrays containing the integer codewords.
In order to estimate a discrete Naive Bayes, we will first convert this problem to a simpler representation. Since all variables are categorical, it does not matter whether they are represented as strings or as numbers, since both are just symbols for the events they represent. Because numbers are easier to work with than text strings, we will convert the problem to use a discrete alphabet through the use of a codebook.
A codebook effectively transforms each distinct possible value of a variable into an integer symbol. For example, “Sunny” could just as well be represented by the integer label 0, “Overcast” by 1, and “Rain” by 2, and the same goes for the other variables. So:
    // Create a new codification codebook to
    // convert strings into discrete symbols
    Codification codebook = new Codification(data,
        "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis");

    // Extract input and output pairs to train
    DataTable symbols = codebook.Apply(data);
    int[][] inputs = symbols.ToArray<int>("Outlook", "Temperature", "Humidity", "Wind");
    int[] outputs = symbols.ToArray<int>("PlayTennis");
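To make the codebook idea concrete, the sketch below builds such a mapping by hand with a plain dictionary. It does not use the framework's Codification class: the helper name `Encode` is hypothetical, and numbering symbols in order of first appearance is an assumption of this sketch rather than a documented guarantee of the framework. It is written as a C# 9 top-level program.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Accord.NET): assigns each distinct
// string an integer symbol, numbered in order of first appearance.
int[] Encode(string[] values, Dictionary<string, int> codebook)
{
    var symbols = new int[values.Length];
    for (int i = 0; i < values.Length; i++)
    {
        if (!codebook.TryGetValue(values[i], out int symbol))
        {
            symbol = codebook.Count;      // next unused integer label
            codebook[values[i]] = symbol;
        }
        symbols[i] = symbol;
    }
    return symbols;
}

// "Outlook" values for days D1 to D5 of the tennis data
string[] outlook = { "Sunny", "Sunny", "Overcast", "Rain", "Rain" };
var outlookCodes = new Dictionary<string, int>();
int[] encoded = Encode(outlook, outlookCodes);

Console.WriteLine(string.Join(", ", encoded)); // 0, 0, 1, 2, 2
```

Keeping one dictionary per column means the same integer can stand for different strings in different variables, which is harmless because each variable is modelled separately.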
Now that we have our learning input/output pairs, we should specify our Bayes model. We will build a model to predict the last column, entitled “PlayTennis”. For this, we will use “Outlook”, “Temperature”, “Humidity” and “Wind” as predictors (the variables on which we will base our decision). Since those are categorical, the model must know how many possible symbols each of those variables has.
    // Create a new Naive Bayes learning
    var learner = new NaiveBayesLearning();

    // Learn a Naive Bayes model from the examples
    NaiveBayes nb = learner.Learn(inputs, outputs);
Now that we have created and estimated our classifier, we can query the classifier for new input samples through the Decide method.
    // Consider we would like to know whether one should play tennis at a
    // sunny, cool, humid and windy day. Let us first encode this instance
    int[] instance = codebook.Translate("Sunny", "Cool", "High", "Strong");

    // Let us obtain the numeric output that represents the answer
    int c = nb.Decide(instance); // answer will be 0

    // Now let us convert the numeric output to an actual "Yes" or "No" answer
    string result = codebook.Translate("PlayTennis", c); // answer will be "No"

    // We can also extract the probabilities for each possible answer
    double[] probs = nb.Probabilities(instance); // { 0.795, 0.205 }
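The reported probabilities can be checked by hand. Assuming plain relative-frequency estimates with no smoothing (an assumption of this check, not a statement about the learner's defaults), the counts in the 14-day data for the instance x = (Sunny, Cool, High, Strong) give:

```latex
\begin{aligned}
P(\text{No}) \prod_i P(x_i \mid \text{No})
  &= \tfrac{5}{14} \cdot \tfrac{3}{5} \cdot \tfrac{1}{5} \cdot \tfrac{4}{5} \cdot \tfrac{3}{5}
   \approx 0.02057 \\
P(\text{Yes}) \prod_i P(x_i \mid \text{Yes})
  &= \tfrac{9}{14} \cdot \tfrac{2}{9} \cdot \tfrac{3}{9} \cdot \tfrac{3}{9} \cdot \tfrac{3}{9}
   \approx 0.00529 \\
P(\text{No} \mid x) &= \frac{0.02057}{0.02057 + 0.00529} \approx 0.795,
\qquad
P(\text{Yes} \mid x) \approx 0.205
\end{aligned}
```

"No" comes first in the probability vector because it is the first PlayTennis value encountered in the data and is therefore encoded as 0.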
Please note that, while the example uses a DataTable to show how data stored in tables can be loaded into the framework, it is not necessary at all to use DataTables in your own, final code. For example, please consider the same example shown above, but without DataTables:
    string[] columnNames = { "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis" };

    string[][] data =
    {
        new string[] { "Sunny", "Hot", "High", "Weak", "No" },
        new string[] { "Sunny", "Hot", "High", "Strong", "No" },
        new string[] { "Overcast", "Hot", "High", "Weak", "Yes" },
        new string[] { "Rain", "Mild", "High", "Weak", "Yes" },
        new string[] { "Rain", "Cool", "Normal", "Weak", "Yes" },
        new string[] { "Rain", "Cool", "Normal", "Strong", "No" },
        new string[] { "Overcast", "Cool", "Normal", "Strong", "Yes" },
        new string[] { "Sunny", "Mild", "High", "Weak", "No" },
        new string[] { "Sunny", "Cool", "Normal", "Weak", "Yes" },
        new string[] { "Rain", "Mild", "Normal", "Weak", "Yes" },
        new string[] { "Sunny", "Mild", "Normal", "Strong", "Yes" },
        new string[] { "Overcast", "Mild", "High", "Strong", "Yes" },
        new string[] { "Overcast", "Hot", "Normal", "Weak", "Yes" },
        new string[] { "Rain", "Mild", "High", "Strong", "No" },
    };

    // Create a new codification codebook to
    // convert strings into discrete symbols
    Codification codebook = new Codification(columnNames, data);

    // Extract input and output pairs to train
    int[][] symbols = codebook.Transform(data);
    int[][] inputs = symbols.Get(null, 0, -1); // Gets all rows, columns 0 up to (but not including) the last
    int[] outputs = symbols.GetColumn(-1);     // Gets only the last column

    // Create a new Naive Bayes learning
    var learner = new NaiveBayesLearning();

    NaiveBayes nb = learner.Learn(inputs, outputs);

    // Consider we would like to know whether one should play tennis at a
    // sunny, cool, humid and windy day. Let us first encode this instance
    int[] instance = codebook.Translate("Sunny", "Cool", "High", "Strong");

    // Let us obtain the numeric output that represents the answer
    int c = nb.Decide(instance); // answer will be 0

    // Now let us convert the numeric output to an actual "Yes" or "No" answer
    string result = codebook.Translate("PlayTennis", c); // answer will be "No"

    // We can also extract the probabilities for each possible answer
    double[] probs = nb.Probabilities(instance); // { 0.795, 0.205 }
In this second example, we will be creating a simple multi-class classification problem using integer vectors and learning a discrete Naive Bayes on those vectors.
    // Let's say we have the following data to be classified
    // into three possible classes. Those are the samples:
    //
    int[][] inputs =
    {
        //          input            output
        new int[] { 0, 1, 1, 0 }, //  0
        new int[] { 0, 1, 0, 0 }, //  0
        new int[] { 0, 0, 1, 0 }, //  0
        new int[] { 0, 1, 1, 0 }, //  0
        new int[] { 0, 1, 0, 0 }, //  0
        new int[] { 1, 0, 0, 0 }, //  1
        new int[] { 1, 0, 0, 0 }, //  1
        new int[] { 1, 0, 0, 1 }, //  1
        new int[] { 0, 0, 0, 1 }, //  1
        new int[] { 0, 0, 0, 1 }, //  1
        new int[] { 1, 1, 1, 1 }, //  2
        new int[] { 1, 0, 1, 1 }, //  2
        new int[] { 1, 1, 0, 1 }, //  2
        new int[] { 0, 1, 1, 1 }, //  2
        new int[] { 1, 1, 1, 1 }, //  2
    };

    int[] outputs = // those are the class labels
    {
        0, 0, 0, 0, 0,
        1, 1, 1, 1, 1,
        2, 2, 2, 2, 2,
    };

    // Let us create a learning algorithm
    var learner = new NaiveBayesLearning();

    // and teach a model on the data examples
    NaiveBayes nb = learner.Learn(inputs, outputs);

    // Now, let's test the model output for the first input sample:
    int answer = nb.Decide(new int[] { 0, 1, 1, 0 }); // should be 0
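For comparison, the following self-contained sketch estimates the same kind of model by plain frequency counting, without the framework, and prints the class it selects for the first training sample (labelled 0 in the data). It is an illustration only: the add-one (Laplace) smoothing and the local function name `Decide` are assumptions of this sketch, not a description of how NaiveBayesLearning is implemented. It is written as a C# 9 top-level program.

```csharp
using System;
using System.Linq;

int[][] inputs =
{
    new[] { 0, 1, 1, 0 }, new[] { 0, 1, 0, 0 }, new[] { 0, 0, 1, 0 },
    new[] { 0, 1, 1, 0 }, new[] { 0, 1, 0, 0 },                       // class 0
    new[] { 1, 0, 0, 0 }, new[] { 1, 0, 0, 0 }, new[] { 1, 0, 0, 1 },
    new[] { 0, 0, 0, 1 }, new[] { 0, 0, 0, 1 },                       // class 1
    new[] { 1, 1, 1, 1 }, new[] { 1, 0, 1, 1 }, new[] { 1, 1, 0, 1 },
    new[] { 0, 1, 1, 1 }, new[] { 1, 1, 1, 1 },                       // class 2
};
int[] outputs = { 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2 };

const int classes = 3, symbols = 2; // every input in this data set is binary

// Hypothetical helper: returns the class with the highest score
// log P(c) + sum_j log P(x_j | c), using add-one smoothed frequencies.
int Decide(int[] x)
{
    int best = -1;
    double bestScore = double.NegativeInfinity;
    for (int c = 0; c < classes; c++)
    {
        // Training rows that carry the label c
        int[][] rows = inputs.Where((row, i) => outputs[i] == c).ToArray();
        double score = Math.Log((double)rows.Length / inputs.Length);
        for (int j = 0; j < x.Length; j++)
        {
            int matches = rows.Count(row => row[j] == x[j]);
            score += Math.Log((matches + 1.0) / (rows.Length + symbols));
        }
        if (score > bestScore) { bestScore = score; best = c; }
    }
    return best;
}

Console.WriteLine(Decide(new[] { 0, 1, 1, 0 })); // 0
```

Working in log space avoids multiplying many small probabilities together, and the smoothing keeps a symbol that never occurs within a class from forcing that class's score to negative infinity.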
As with all other learning algorithms in the framework, it is also possible to obtain a better measure of the performance of the Naive Bayes algorithm using cross-validation, as shown in the example below:
    // Ensure we have reproducible results
    Accord.Math.Random.Generator.Seed = 0;

    // Let's say we have the following data to be classified
    // into three possible classes. Those are the samples:
    //
    int[][] inputs =
    {
        //          input            output
        new int[] { 0, 1, 1, 0 }, //  0
        new int[] { 0, 1, 0, 0 }, //  0
        new int[] { 0, 0, 1, 0 }, //  0
        new int[] { 0, 1, 1, 0 }, //  0
        new int[] { 0, 1, 0, 0 }, //  0
        new int[] { 1, 0, 0, 0 }, //  1
        new int[] { 1, 0, 0, 0 }, //  1
        new int[] { 1, 0, 0, 1 }, //  1
        new int[] { 0, 0, 0, 1 }, //  1
        new int[] { 0, 0, 0, 1 }, //  1
        new int[] { 1, 1, 1, 1 }, //  2
        new int[] { 1, 0, 1, 1 }, //  2
        new int[] { 1, 1, 0, 1 }, //  2
        new int[] { 0, 1, 1, 1 }, //  2
        new int[] { 1, 1, 1, 1 }, //  2
    };

    int[] outputs = // those are the class labels
    {
        0, 0, 0, 0, 0,
        1, 1, 1, 1, 1,
        2, 2, 2, 2, 2,
    };

    // Let's say we want to measure the cross-validation
    // performance of Naive Bayes on the above data set:
    var cv = CrossValidation.Create(

        k: 10, // We will be using 10-fold cross validation

        // First we define the learning algorithm:
        learner: (p) => new NaiveBayesLearning(),

        // Now we have to specify how the n.b. performance should be measured:
        loss: (actual, expected, p) => new ZeroOneLoss(expected).Loss(actual),

        // This function can be used to perform any special
        // operations before the actual learning is done, but
        // here we will just leave it as simple as it can be:
        fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

        // Finally, we have to pass the input and output data
        // that will be used in cross-validation.
        x: inputs, y: outputs
    );

    // After the cross-validation object has been created,
    // we can call its .Learn method with the input and
    // output data that will be partitioned into the folds:
    var result = cv.Learn(inputs, outputs);

    // We can grab some information about the problem:
    int numberOfSamples = result.NumberOfSamples; // should be 15
    int numberOfInputs = result.NumberOfInputs;   // should be 4
    int numberOfOutputs = result.NumberOfOutputs; // should be 3

    double trainingError = result.Training.Mean;     // should be 0
    double validationError = result.Validation.Mean; // should be 0.15 (+/- var. 0.11388888888888887)

    // If desired, compute an aggregate confusion matrix for the validation sets:
    GeneralConfusionMatrix gcm = result.ToConfusionMatrix(inputs, outputs);
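To show what k-fold partitioning and the zero-one loss amount to, the sketch below splits sample indices into folds and computes a misclassification rate by hand. It does not reproduce the framework's CrossValidation class: the round-robin fold assignment, the choice of k = 5, and the helper names `Folds` and `ZeroOne` are assumptions made for illustration. The example above seeds a random generator, which suggests the framework's own fold assignment is randomized, so its folds will generally differ from these. It is written as a C# 9 top-level program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: assigns sample i to fold (i % k) and yields, for
// each fold, the training indices and the held-out validation indices.
IEnumerable<(int[] train, int[] validation)> Folds(int samples, int k)
{
    int[] all = Enumerable.Range(0, samples).ToArray();
    for (int fold = 0; fold < k; fold++)
    {
        yield return (all.Where(i => i % k != fold).ToArray(),
                      all.Where(i => i % k == fold).ToArray());
    }
}

// Zero-one loss: the fraction of predictions that differ from the expected labels.
double ZeroOne(int[] expected, int[] actual) =>
    expected.Zip(actual, (e, a) => e == a ? 0.0 : 1.0).Average();

// 15 samples in 5 folds: each fold trains on 12 samples and validates on 3
foreach (var (train, validation) in Folds(samples: 15, k: 5))
    Console.WriteLine($"train: {train.Length}, validation: {validation.Length}");

// One wrong prediction out of four gives a loss of 0.25
double loss = ZeroOne(new[] { 0, 1, 2, 2 }, new[] { 0, 1, 1, 2 });
Console.WriteLine(loss);
```

In a full cross-validation run, a model would be trained on each `train` index set and the loss averaged over the corresponding `validation` sets, which is the role of the validation mean reported above.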