NaiveBayes Class

Naïve Bayes Classifier.
Inheritance Hierarchy
System.Object
  Accord.MachineLearning.TransformBase<Int32[], Int32>
    Accord.MachineLearning.ClassifierBase<Int32[], Int32>
      Accord.MachineLearning.MulticlassClassifierBase<Int32[]>
        Accord.MachineLearning.MulticlassScoreClassifierBase<Int32[]>
          Accord.MachineLearning.MulticlassLikelihoodClassifierBase<Int32[]>
            Accord.MachineLearning.Bayes.Bayes<Independent<GeneralDiscreteDistribution, Int32>, Int32[]>
              Accord.MachineLearning.Bayes.NaiveBayes<GeneralDiscreteDistribution, Int32>
                Accord.MachineLearning.Bayes.NaiveBayes

Namespace:  Accord.MachineLearning.Bayes
Assembly:  Accord.MachineLearning (in Accord.MachineLearning.dll) Version: 3.8.0
Syntax
[SerializableAttribute]
public class NaiveBayes : NaiveBayes<GeneralDiscreteDistribution, int>

The NaiveBayes type exposes the following members.

Constructors
Properties
  Name    Description
Public property ClassCount Obsolete.
Gets the number of possible output classes.
Public property Distributions
Gets the probability distributions for each class and input.
Public property InputCount Obsolete.
Gets the number of inputs in the model.
Public property NumberOfClasses
Gets the number of classes expected and recognized by the classifier.
(Inherited from ClassifierBase<TInput, TClasses>.)
Public property NumberOfInputs
Gets the number of inputs accepted by the model.
(Inherited from TransformBase<TInput, TOutput>.)
Public property NumberOfOutputs
Gets the number of outputs generated by the model.
(Inherited from TransformBase<TInput, TOutput>.)
Public property NumberOfSymbols
Gets the number of symbols for each input in the model.
Public property Priors
Gets the prior beliefs for each class.
(Inherited from Bayes<TDistribution, TInput>.)
Public property SymbolCount Obsolete.
Gets the number of symbols for each input in the model.
Methods
  Name    Description
Public method Compute(Int32) Obsolete.
Computes the most likely class for a given instance.
Public method Compute(Int32, Double, Double) Obsolete.
Computes the most likely class for a given instance.
Public method Decide(TInput)
Computes class-label decisions for a given set of input vectors.
(Inherited from ClassifierBase<TInput, TClasses>.)
Public method Decide(TInput)
Computes a class-label decision for a given input.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Decide(TInput, TClasses)
Computes a class-label decision for a given input.
(Inherited from ClassifierBase<TInput, TClasses>.)
Public method Decide(TInput, Boolean)
Computes class-label decisions for the given input.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Decide(TInput, Double)
Computes class-label decisions for the given input.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Decide(TInput, Int32)
Computes class-label decisions for the given input.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Decide(TInput, Double)
Computes a class-label decision for a given input.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Equals
Determines whether the specified object is equal to the current object.
(Inherited from Object.)
Public method Estimate Obsolete.
Obsolete.
Protected method Finalize
Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection.
(Inherited from Object.)
Public method GetHashCode
Serves as the default hash function.
(Inherited from Object.)
Public method GetType
Gets the Type of the current instance.
(Inherited from Object.)
Public static method Load(Stream) Obsolete.
Loads a machine from a stream.
Public static method Load(String) Obsolete.
Loads a machine from a file.
Public static method Load<TDistribution>(Stream) Obsolete.
Loads a machine from a stream.
Public static method Load<TDistribution>(String) Obsolete.
Loads a machine from a file.
Public method LogLikelihood(TInput)
Computes the log-likelihood that the given input vector belongs to its most plausible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput)
Computes the log-likelihood that the given input vectors belong to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput, Int32)
Predicts a class label vector for the given input vector, returning the log-likelihood that the input vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput, Double)
Computes the log-likelihood that the given input vectors belong to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput, Int32)
Computes the log-likelihood that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput, Int32)
Computes the log-likelihood that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput, Int32)
Predicts a class label for each input vector, returning the log-likelihood that each vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput, Int32)
Computes the log-likelihood that the given input vector belongs to the specified classIndex.
(Inherited from Bayes<TDistribution, TInput>.)
Public method LogLikelihood(TInput, Int32, Double)
Computes the log-likelihood that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput, Int32, Double)
Computes the log-likelihood that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihood(TInput, Int32, Double)
Predicts a class label for each input vector, returning the log-likelihood that each vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput)
Computes the log-likelihood that the given input vector belongs to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput)
Computes the log-likelihood that the given input vectors belong to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput, Double)
Computes the log-likelihood that the given input vector belongs to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput, Int32)
Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput, Double)
Computes the log-likelihood that the given input vector belongs to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput, Int32)
Computes the log-likelihood that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput, Int32)
Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput, Int32, Double)
Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput, Int32, Double)
Computes the log-likelihood that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method LogLikelihoods(TInput, Int32, Double)
Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Protected method MemberwiseClone
Creates a shallow copy of the current Object.
(Inherited from Object.)
Public static method Normal(Int32, Int32)
Constructs a new Naïve Bayes Classifier.
Public static method Normal(Int32, Int32, NormalDistribution)
Constructs a new Naïve Bayes Classifier.
Public static method Normal(Int32, Int32, NormalDistribution)
Constructs a new Naïve Bayes Classifier.
Public static method Normal(Int32, Int32, NormalDistribution)
Constructs a new Naïve Bayes Classifier.
Public static method Normal(Int32, Int32, Double)
Constructs a new Naïve Bayes Classifier.
Public static method Normal(Int32, Int32, NormalDistribution, Double)
Constructs a new Naïve Bayes Classifier.
Public method Probabilities(TInput)
Computes the probabilities that the given input vector belongs to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probabilities(TInput)
Computes the probabilities that the given input vector belongs to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probabilities(TInput, Double)
Computes the probabilities that the given input vector belongs to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probabilities(TInput, Int32)
Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probabilities(TInput, Double)
Computes the probabilities that the given input vector belongs to each of the possible classes.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probabilities(TInput, Int32)
Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probabilities(TInput, Int32, Double)
Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probabilities(TInput, Int32, Double)
Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput)
Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput)
Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Int32)
Computes the probability that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Int32)
Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Double)
Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Int32)
Computes the probability that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Int32)
Computes the probability that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Int32)
Predicts a class label for each input vector, returning the probability that each vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Int32, Double)
Computes the probability that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Int32, Double)
Computes the probability that the given input vector belongs to the specified classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Probability(TInput, Int32, Double)
Predicts a class label for each input vector, returning the probability that each vector belongs to its predicted class.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Save(Stream) Obsolete.
Saves the Naïve Bayes model to a stream.
Public method Save(String) Obsolete.
Saves the Naïve Bayes model to a file.
Public method Score(TInput)
Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier).
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput)
Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier).
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput, Int32)
Computes a numerical score measuring the association between the given input vector and a given classIndex.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Score(TInput, Int32)
Predicts a class label for the input vector, returning a numerical score measuring the strength of association of the input vector to its most strongly related class.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput, Double)
Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier).
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput, Int32)
Computes a numerical score measuring the association between the given input vector and a given classIndex.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput, Int32)
Computes a numerical score measuring the association between the given input vector and a given classIndex.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput, Int32)
Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput, Int32, Double)
Computes a numerical score measuring the association between the given input vector and a given classIndex.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput, Int32, Double)
Computes a numerical score measuring the association between the given input vector and a given classIndex.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Score(TInput, Int32, Double)
Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Scores(TInput)
Computes a numerical score measuring the association between the given input vector and each class.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Scores(TInput)
Computes a numerical score measuring the association between the given input vector and each class.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Scores(TInput, Double)
Computes a numerical score measuring the association between the given input vector and each class.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Scores(TInput, Int32)
Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Scores(TInput, Double)
Computes a numerical score measuring the association between the given input vector and each class.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Scores(TInput, Int32)
Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Scores(TInput, Int32, Double)
Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method Scores(TInput, Int32, Double)
Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes.
(Inherited from MulticlassScoreClassifierBase<TInput>.)
Public method ToMulticlass
Views this instance as a multi-class generative classifier.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method ToMultilabel
Views this instance as a multi-label generative classifier, giving access to more advanced methods, such as the prediction of one-hot vectors.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method ToString
Returns a string that represents the current object.
(Inherited from Object.)
Public method Transform(TInput)
Applies the transformation to an input, producing an associated output.
(Inherited from ClassifierBase<TInput, TClasses>.)
Public method Transform(TInput)
Applies the transformation to a set of input vectors, producing an associated set of output vectors.
(Inherited from TransformBase<TInput, TOutput>.)
Public method Transform(TInput, TClasses)
Applies the transformation to an input, producing an associated output.
(Inherited from ClassifierBase<TInput, TClasses>.)
Public method Transform(TInput, Boolean)
Applies the transformation to an input, producing an associated output.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Transform(TInput, Int32)
Applies the transformation to an input, producing an associated output.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Transform(TInput, Boolean)
Applies the transformation to an input, producing an associated output.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Transform(TInput, Double)
Applies the transformation to an input, producing an associated output.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Transform(TInput, Int32)
Applies the transformation to an input, producing an associated output.
(Inherited from MulticlassClassifierBase<TInput>.)
Public method Transform(TInput, Double)
Applies the transformation to an input, producing an associated output.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Public method Transform(TInput, Double)
Applies the transformation to an input, producing an associated output.
(Inherited from MulticlassLikelihoodClassifierBase<TInput>.)
Extension Methods
  Name    Description
Public extension method HasMethod
Checks whether an object implements a method with the given name.
(Defined by ExtensionMethods.)
Public extension method IsEqual
Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices.
(Defined by Matrix.)
Public extension method To(Type) Overloaded.
Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime.
(Defined by ExtensionMethods.)
Public extension method To<T> Overloaded.
Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime.
(Defined by ExtensionMethods.)
Remarks

A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions. A more descriptive term for the underlying probability model would be "independent feature model".

In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature, given the class variable. In spite of their naive design and apparently over-simplified assumptions, naive Bayes classifiers have worked quite well in many complex real-world situations.
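
The independence assumption described above can be written compactly. As a sketch, using notation introduced here only for illustration ($c$ for a class and $x_1, \dots, x_n$ for the feature values of one instance), the posterior computed by a naive Bayes classifier factorizes as:

```latex
P(c \mid x_1, \dots, x_n)
  = \frac{P(c) \prod_{i=1}^{n} P(x_i \mid c)}
         {\sum_{c'} P(c') \prod_{i=1}^{n} P(x_i \mid c')},
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

The denominator is the same for every class, so it can be dropped when only the most likely class $\hat{c}$ is needed.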

This class implements a discrete (integer-valued) Naive Bayes classifier. There is also a special named constructor to create classifiers assuming normal distributions for each variable. For arbitrary distribution classifiers, please see NaiveBayes<TDistribution>.
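
To illustrate what "assuming normal distributions for each variable" means, the following is a minimal, library-independent sketch, not the framework's implementation. The class and method names (GaussianNaiveBayesSketch, LogNormal, LogScore) and the example numbers are hypothetical; the sketch only shows how a per-class score can be built from univariate normal densities under the independence assumption.

```csharp
using System;

public static class GaussianNaiveBayesSketch
{
    // Log-density of a univariate normal distribution evaluated at x.
    public static double LogNormal(double x, double mean, double variance)
    {
        double d = x - mean;
        return -0.5 * Math.Log(2 * Math.PI * variance) - d * d / (2 * variance);
    }

    // Unnormalized log-posterior of one class: the log prior plus the sum of
    // the per-feature log-densities (summing logs is the independence assumption).
    public static double LogScore(double[] x, double prior, double[] means, double[] variances)
    {
        double score = Math.Log(prior);
        for (int j = 0; j < x.Length; j++)
            score += LogNormal(x[j], means[j], variances[j]);
        return score;
    }

    public static void Main()
    {
        double[] x = { 1.0 };

        // Two hypothetical classes with equal priors and unit variance
        double a = LogScore(x, 0.5, new[] { 0.0 }, new[] { 1.0 });
        double b = LogScore(x, 0.5, new[] { 1.0 }, new[] { 1.0 });

        // The class whose mean is closer to x receives the higher score
        Console.WriteLine(b > a ? "B" : "A"); // prints "B"
    }
}
```

Working in log space avoids numerical underflow when many per-feature densities are multiplied together.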

References:

  • Wikipedia contributors. "Naive Bayes classifier." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 16 Dec. 2011. Web. 5 Jan. 2012.

Examples

In this example, we will use the famous Play Tennis example by Tom Mitchell (1998). In Mitchell's example, one would like to infer whether a person will play tennis based solely on four input variables. Those variables are all categorical, meaning that there is no order between their possible values (e.g. there is no order relationship between Sunny and Rain; neither is bigger or smaller than the other, they are just distinct). The rows, or instances, presented below represent days on which the person's behavior was recorded and annotated; together they form our set of observations for learning:

DataTable data = new DataTable("Mitchell's Tennis Example");

data.Columns.Add("Day", "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis");

data.Rows.Add("D1", "Sunny", "Hot", "High", "Weak", "No");
data.Rows.Add("D2", "Sunny", "Hot", "High", "Strong", "No");
data.Rows.Add("D3", "Overcast", "Hot", "High", "Weak", "Yes");
data.Rows.Add("D4", "Rain", "Mild", "High", "Weak", "Yes");
data.Rows.Add("D5", "Rain", "Cool", "Normal", "Weak", "Yes");
data.Rows.Add("D6", "Rain", "Cool", "Normal", "Strong", "No");
data.Rows.Add("D7", "Overcast", "Cool", "Normal", "Strong", "Yes");
data.Rows.Add("D8", "Sunny", "Mild", "High", "Weak", "No");
data.Rows.Add("D9", "Sunny", "Cool", "Normal", "Weak", "Yes");
data.Rows.Add("D10", "Rain", "Mild", "Normal", "Weak", "Yes");
data.Rows.Add("D11", "Sunny", "Mild", "Normal", "Strong", "Yes");
data.Rows.Add("D12", "Overcast", "Mild", "High", "Strong", "Yes");
data.Rows.Add("D13", "Overcast", "Hot", "Normal", "Weak", "Yes");
data.Rows.Add("D14", "Rain", "Mild", "High", "Strong", "No");

Note: The DataTable representation is not required; the NaiveBayes could also be trained directly on integer arrays containing the integer codewords.

In order to estimate a discrete Naive Bayes, we will first convert this problem to a simpler representation. Since all variables are categorical, it does not matter whether they are represented as strings or as numbers, since both are just symbols for the events they represent. Because numbers are easier to work with than text strings, we will convert the problem to use a discrete alphabet through the use of a codebook.

A codebook effectively transforms each distinct possible value of a variable into an integer symbol. For example, “Sunny” could just as well be represented by the integer label 0, “Overcast” by 1, and “Rain” by 2, and the same goes for the other variables. So:

// Create a new codification codebook to
// convert strings into discrete symbols
Codification codebook = new Codification(data,
    "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis");

// Extract input and output pairs to train
DataTable symbols = codebook.Apply(data);
int[][] inputs = symbols.ToArray<int>("Outlook", "Temperature", "Humidity", "Wind");
int[] outputs = symbols.ToArray<int>("PlayTennis");
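
To make the idea of a codebook concrete, here is a minimal, library-independent sketch of the mapping for a single column. It is not the Codification class itself: the names CodebookSketch and BuildCodebook are hypothetical, and assigning codes in order of first appearance is an assumption of this sketch (chosen because it reproduces the labels mentioned above).

```csharp
using System;
using System.Collections.Generic;

public static class CodebookSketch
{
    // Assigns the codes 0, 1, 2, ... to the distinct values of one column,
    // in order of first appearance (an assumption of this sketch).
    public static Dictionary<string, int> BuildCodebook(IEnumerable<string> column)
    {
        var codes = new Dictionary<string, int>();
        foreach (string value in column)
        {
            if (!codes.ContainsKey(value))
                codes[value] = codes.Count; // next unused integer
        }
        return codes;
    }

    public static void Main()
    {
        // First five "Outlook" values from the table above
        string[] outlook = { "Sunny", "Sunny", "Overcast", "Rain", "Rain" };
        Dictionary<string, int> codes = BuildCodebook(outlook);

        Console.WriteLine(codes["Sunny"]);    // 0
        Console.WriteLine(codes["Overcast"]); // 1
        Console.WriteLine(codes["Rain"]);     // 2
    }
}
```

Inverting the dictionary gives the reverse translation, from an integer symbol back to its original string.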

Now that we have our learning input/output pairs, we should specify our Bayes model. We will build a model to predict the last column, entitled “PlayTennis”. For this, we will use “Outlook”, “Temperature”, “Humidity” and “Wind” as predictors (the variables on which we will base our decision). Since those are categorical, the model needs to know the number of possible symbols for each of those variables.

// Create a new Naive Bayes learning algorithm
var learner = new NaiveBayesLearning();

// Learn a Naive Bayes model from the examples
NaiveBayes nb = learner.Learn(inputs, outputs);

Now that we have created and estimated our classifier, we can query the classifier for new input samples through the Decide method.

// Suppose we would like to know whether one should play tennis on a
// sunny, cool, humid and windy day. Let us first encode this instance
int[] instance = codebook.Translate("Sunny", "Cool", "High", "Strong");

// Let us obtain the numeric output that represents the answer
int c = nb.Decide(instance); // answer will be 0

// Now let us convert the numeric output to an actual "Yes" or "No" answer
string result = codebook.Translate("PlayTennis", c); // answer will be "No"

// We can also extract the probabilities for each possible answer
double[] probs = nb.Probabilities(instance); // { 0.795, 0.205 }
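
The probabilities above can be checked by hand. The following library-independent sketch estimates the priors and per-feature conditional probabilities as plain relative frequencies (no smoothing) and normalizes the result. The names NaiveBayesByHand and Posterior are hypothetical, and the integer codes are an assumption: they follow the order of first appearance in the table, which is consistent with "No" being class 0 in the example.

```csharp
using System;
using System.Linq;

public static class NaiveBayesByHand
{
    // Estimates P(class | instance) from integer-coded training data using
    // relative frequencies, then normalizes the scores over all classes.
    public static double[] Posterior(int[][] inputs, int[] outputs, int numClasses, int[] instance)
    {
        var scores = new double[numClasses];
        for (int c = 0; c < numClasses; c++)
        {
            // Training rows that belong to class c
            int[][] rows = inputs.Where((row, i) => outputs[i] == c).ToArray();

            // Prior: fraction of training rows labelled c
            double score = (double)rows.Length / inputs.Length;

            // Likelihood: product over features of P(feature value | c)
            for (int j = 0; j < instance.Length; j++)
                score *= (double)rows.Count(row => row[j] == instance[j]) / rows.Length;

            scores[c] = score;
        }

        double total = scores.Sum();
        return scores.Select(s => s / total).ToArray();
    }

    public static void Main()
    {
        // Assumed codes: Outlook Sunny=0, Overcast=1, Rain=2;
        // Temperature Hot=0, Mild=1, Cool=2; Humidity High=0, Normal=1;
        // Wind Weak=0, Strong=1; PlayTennis No=0, Yes=1
        int[][] inputs =
        {
            new[] { 0, 0, 0, 0 }, new[] { 0, 0, 0, 1 }, new[] { 1, 0, 0, 0 },
            new[] { 2, 1, 0, 0 }, new[] { 2, 2, 1, 0 }, new[] { 2, 2, 1, 1 },
            new[] { 1, 2, 1, 1 }, new[] { 0, 1, 0, 0 }, new[] { 0, 2, 1, 0 },
            new[] { 2, 1, 1, 0 }, new[] { 0, 1, 1, 1 }, new[] { 1, 1, 0, 1 },
            new[] { 1, 0, 1, 0 }, new[] { 2, 1, 0, 1 },
        };
        int[] outputs = { 0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1, 1, 0 };

        // Sunny, Cool, High, Strong
        double[] p = Posterior(inputs, outputs, 2, new[] { 0, 2, 0, 1 });

        // Approximately 0.795 for "No" and 0.205 for "Yes"
        Console.WriteLine(Math.Round(p[0], 3));
        Console.WriteLine(Math.Round(p[1], 3));
    }
}
```

Because this sketch applies no smoothing, a class with no training rows, or an instance whose score is zero for every class, would lead to a division by zero; a practical implementation would need to guard against both cases.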

Please note that, while the example uses a DataTable to show how data stored in tables can be loaded into the framework, it is not at all necessary to use DataTables in your own code. For example, consider the same example shown above, but without DataTables:

string[] columnNames = { "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis" };

string[][] data =
{
    new string[] { "Sunny", "Hot", "High", "Weak", "No" },
    new string[] { "Sunny", "Hot", "High", "Strong", "No" },
    new string[] { "Overcast", "Hot", "High", "Weak", "Yes" },
    new string[] { "Rain", "Mild", "High", "Weak", "Yes" },
    new string[] { "Rain", "Cool", "Normal", "Weak", "Yes" },
    new string[] { "Rain", "Cool", "Normal", "Strong", "No" },
    new string[] { "Overcast", "Cool", "Normal", "Strong", "Yes" },
    new string[] { "Sunny", "Mild", "High", "Weak", "No" },
    new string[] { "Sunny", "Cool", "Normal", "Weak", "Yes" },
    new string[] { "Rain", "Mild", "Normal", "Weak", "Yes" },
    new string[] { "Sunny", "Mild", "Normal", "Strong", "Yes" },
    new string[] { "Overcast", "Mild", "High", "Strong", "Yes" },
    new string[] { "Overcast", "Hot", "Normal", "Weak", "Yes" },
    new string[] { "Rain", "Mild", "High", "Strong", "No" },
};

// Create a new codification codebook to
// convert strings into discrete symbols
Codification codebook = new Codification(columnNames, data);

// Extract input and output pairs to train
int[][] symbols = codebook.Transform(data);
int[][] inputs = symbols.Get(null, 0, -1); // Gets all rows, with columns 0 up to (but not including) the last
int[] outputs = symbols.GetColumn(-1);     // Gets only the last column

// Create a new Naive Bayes learning algorithm
var learner = new NaiveBayesLearning();

NaiveBayes nb = learner.Learn(inputs, outputs);

// Suppose we would like to know whether one should play tennis on a
// sunny, cool, humid and windy day. Let us first encode this instance
int[] instance = codebook.Translate("Sunny", "Cool", "High", "Strong");

// Let us obtain the numeric output that represents the answer
int c = nb.Decide(instance); // answer will be 0

// Now let us convert the numeric output to an actual "Yes" or "No" answer
string result = codebook.Translate("PlayTennis", c); // answer will be "No"

// We can also extract the probabilities for each possible answer
double[] probs = nb.Probabilities(instance); // { 0.795, 0.205 }

In this second example, we will be creating a simple multi-class classification problem using integer vectors and learning a discrete Naive Bayes on those vectors.

// Let's say we have the following data to be classified
// into three possible classes. Those are the samples:
// 
int[][] inputs =
{
    //               input      output
    new int[] { 0, 1, 1, 0 }, //  0 
    new int[] { 0, 1, 0, 0 }, //  0
    new int[] { 0, 0, 1, 0 }, //  0
    new int[] { 0, 1, 1, 0 }, //  0
    new int[] { 0, 1, 0, 0 }, //  0
    new int[] { 1, 0, 0, 0 }, //  1
    new int[] { 1, 0, 0, 0 }, //  1
    new int[] { 1, 0, 0, 1 }, //  1
    new int[] { 0, 0, 0, 1 }, //  1
    new int[] { 0, 0, 0, 1 }, //  1
    new int[] { 1, 1, 1, 1 }, //  2
    new int[] { 1, 0, 1, 1 }, //  2
    new int[] { 1, 1, 0, 1 }, //  2
    new int[] { 0, 1, 1, 1 }, //  2
    new int[] { 1, 1, 1, 1 }, //  2
};

int[] outputs = // those are the class labels
{
    0, 0, 0, 0, 0,
    1, 1, 1, 1, 1,
    2, 2, 2, 2, 2,
};

// Let us create a learning algorithm
var learner = new NaiveBayesLearning();

// and teach a model on the data examples
NaiveBayes nb = learner.Learn(inputs, outputs);

// Now, let's test the model output for the first input sample:
int answer = nb.Decide(new int[] { 0, 1, 1, 0 }); // should be 0

As with all other learning algorithms in the framework, it is also possible to obtain a better measure of the performance of the Naive Bayes algorithm using cross-validation, as shown in the example below:

// Ensure we have reproducible results
Accord.Math.Random.Generator.Seed = 0;

// Let's say we have the following data to be classified
// into three possible classes. Those are the samples:
// 
int[][] inputs =
{
    //               input      output
    new int[] { 0, 1, 1, 0 }, //  0 
    new int[] { 0, 1, 0, 0 }, //  0
    new int[] { 0, 0, 1, 0 }, //  0
    new int[] { 0, 1, 1, 0 }, //  0
    new int[] { 0, 1, 0, 0 }, //  0
    new int[] { 1, 0, 0, 0 }, //  1
    new int[] { 1, 0, 0, 0 }, //  1
    new int[] { 1, 0, 0, 1 }, //  1
    new int[] { 0, 0, 0, 1 }, //  1
    new int[] { 0, 0, 0, 1 }, //  1
    new int[] { 1, 1, 1, 1 }, //  2
    new int[] { 1, 0, 1, 1 }, //  2
    new int[] { 1, 1, 0, 1 }, //  2
    new int[] { 0, 1, 1, 1 }, //  2
    new int[] { 1, 1, 1, 1 }, //  2
};

int[] outputs = // those are the class labels
{
    0, 0, 0, 0, 0,
    1, 1, 1, 1, 1,
    2, 2, 2, 2, 2,
};

// Let's say we want to measure the cross-validation 
// performance of Naive Bayes on the above data set:
var cv = CrossValidation.Create(

    k: 10, // We will be using 10-fold cross validation

    // First we define the learning algorithm:
    learner: (p) => new NaiveBayesLearning(),

    // Now we have to specify how the n.b. performance should be measured:
    loss: (actual, expected, p) => new ZeroOneLoss(expected).Loss(actual),

    // This function can be used to perform any special
    // operations before the actual learning is done, but
    // here we will just leave it as simple as it can be:
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Finally, we have to pass the input and output data
    // that will be used in cross-validation. 
    x: inputs, y: outputs
);

// After the cross-validation object has been created,
// we can call its .Learn method with the input and 
// output data that will be partitioned into the folds:
var result = cv.Learn(inputs, outputs);

// We can grab some information about the problem:
int numberOfSamples = result.NumberOfSamples; // should be 15
int numberOfInputs = result.NumberOfInputs;   // should be 4
int numberOfOutputs = result.NumberOfOutputs; // should be 3

double trainingError = result.Training.Mean; // should be 0
double validationError = result.Validation.Mean; // should be 0.15 (+/- var. 0.11388888888888887)

// If desired, compute an aggregate confusion matrix for the validation sets:
GeneralConfusionMatrix gcm = result.ToConfusionMatrix(inputs, outputs);
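
For readers who want to see what k-fold cross-validation with a zero-one loss does conceptually, here is a minimal, library-independent sketch. It is not how the CrossValidation class is implemented: the names KFoldSketch, AssignFolds, ZeroOne and CrossValidate are hypothetical, the fold assignment by seeded shuffle is an assumption, and the toy "learner" in Main only exists to exercise the loop.

```csharp
using System;
using System.Linq;

public static class KFoldSketch
{
    // Assigns each of n samples to one of k folds after a seeded shuffle.
    // Requires k <= n so that no fold is empty.
    public static int[] AssignFolds(int n, int k, int seed)
    {
        var rng = new Random(seed);
        int[] order = Enumerable.Range(0, n).OrderBy(_ => rng.Next()).ToArray();
        var fold = new int[n];
        for (int i = 0; i < n; i++)
            fold[order[i]] = i % k;
        return fold;
    }

    // Zero-one loss: fraction of predictions that differ from the expected labels.
    public static double ZeroOne(int[] expected, int[] actual)
    {
        int errors = 0;
        for (int i = 0; i < expected.Length; i++)
            if (expected[i] != actual[i]) errors++;
        return (double)errors / expected.Length;
    }

    // Mean validation loss over k folds. The "train" function receives the
    // training inputs and labels and returns a prediction function.
    public static double CrossValidate(int[][] x, int[] y, int k, int seed,
        Func<int[][], int[], Func<int[], int>> train)
    {
        int[] fold = AssignFolds(x.Length, k, seed);
        double sum = 0;
        for (int f = 0; f < k; f++)
        {
            // Samples outside fold f are used for training, fold f for validation
            int[] trainIdx = Enumerable.Range(0, x.Length).Where(i => fold[i] != f).ToArray();
            int[] testIdx = Enumerable.Range(0, x.Length).Where(i => fold[i] == f).ToArray();

            Func<int[], int> model = train(
                trainIdx.Select(i => x[i]).ToArray(),
                trainIdx.Select(i => y[i]).ToArray());

            int[] predicted = testIdx.Select(i => model(x[i])).ToArray();
            sum += ZeroOne(testIdx.Select(i => y[i]).ToArray(), predicted);
        }
        return sum / k;
    }

    public static void Main()
    {
        // Toy data in which the label equals the single feature
        int[][] x = Enumerable.Range(0, 15).Select(i => new[] { i % 2 }).ToArray();
        int[] y = x.Select(row => row[0]).ToArray();

        // Toy "learner" that ignores the training data and echoes the feature
        double loss = CrossValidate(x, y, 5, 0, (tx, ty) => (row => row[0]));
        Console.WriteLine(loss); // 0, because the toy model is always right
    }
}
```

With only 15 samples and 10 folds, as in the example above, some validation folds contain a single sample, which is one reason the per-fold losses can vary widely.
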
See Also