NaiveBayes Class

Naïve Bayes Classifier.

Inheritance Hierarchy

Namespace: Accord.MachineLearning.Bayes
Assembly: Accord.MachineLearning (in Accord.MachineLearning.dll) Version: 3.8.0

Syntax

[SerializableAttribute]
public class NaiveBayes : NaiveBayes<GeneralDiscreteDistribution, int>

<SerializableAttribute>
Public Class NaiveBayes
	Inherits NaiveBayes(Of GeneralDiscreteDistribution, Integer)

The NaiveBayes type exposes the following members.

Constructors

	Name	Description
	NaiveBayes(Int32, Int32)	Constructs a new Naïve Bayes Classifier.
	NaiveBayes(Int32, Double, Int32)	Obsolete.

Properties

	Name	Description
	ClassCount	Obsolete. Gets the number of possible output classes.
	Distributions	Gets the probability distributions for each class and input.
	InputCount	Obsolete. Gets the number of inputs in the model.
	NumberOfClasses	Gets the number of classes expected and recognized by the classifier. (Inherited from ClassifierBase<TInput, TClasses>.)
	NumberOfInputs	Gets the number of inputs accepted by the model. (Inherited from TransformBase<TInput, TOutput>.)
	NumberOfOutputs	Gets the number of outputs generated by the model. (Inherited from TransformBase<TInput, TOutput>.)
	NumberOfSymbols	Gets the number of symbols for each input in the model.
	Priors	Gets the prior beliefs for each class. (Inherited from Bayes<TDistribution, TInput>.)
	SymbolCount	Obsolete. Gets the number of symbols for each input in the model.

Methods

	Name	Description
	Compute(Int32)	Obsolete. Computes the most likely class for a given instance.
	Compute(Int32, Double, Double)	Obsolete. Computes the most likely class for a given instance.
	Decide(TInput)	Computes class-label decisions for a given set of input vectors. (Inherited from ClassifierBaseTInput, TClasses.)
	Decide(TInput)	Computes a class-label decision for a given input. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Decide(TInput, TClasses)	Computes a class-label decision for a given input. (Inherited from ClassifierBaseTInput, TClasses.)
	Decide(TInput, Boolean)	Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBaseTInput.)
	Decide(TInput, Double)	Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBaseTInput.)
	Decide(TInput, Int32)	Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBaseTInput.)
	Decide(TInput, Double)	Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBaseTInput.)
	Equals	Determines whether the specified object is equal to the current object. (Inherited from Object.)
	Estimate	Obsolete.
	Finalize	Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
	GetHashCode	Serves as the default hash function. (Inherited from Object.)
	GetType	Gets the Type of the current instance. (Inherited from Object.)
	Load(Stream)	Obsolete. Loads a machine from a stream.
	Load(String)	Obsolete. Loads a machine from a file.
	Load<TDistribution>(Stream)	Obsolete. Loads a machine from a stream.
	Load<TDistribution>(String)	Obsolete. Loads a machine from a file.
	LogLikelihood(TInput)	Computes the log-likelihood that the given input vector belongs to its most plausible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput)	Computes the log-likelihood that the given input vectors belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput, Int32)	Predicts a class label vector for the given input vector, returning the log-likelihood that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput, Double)	Computes the log-likelihood that the given input vectors belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput, Int32)	Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput, Int32)	Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput, Int32)	Predicts a class label for each input vector, returning the log-likelihood that each vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput, Int32)	Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from BayesTDistribution, TInput.)
	LogLikelihood(TInput, Int32, Double)	Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput, Int32, Double)	Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihood(TInput, Int32, Double)	Predicts a class label for each input vector, returning the log-likelihood that each vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput)	Computes the log-likelihood that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput)	Computes the log-likelihood that the given input vectors belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput, Double)	Computes the log-likelihood that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput, Int32)	Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput, Double)	Computes the log-likelihood that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput, Int32)	Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput, Int32)	Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput, Int32, Double)	Predicts a class label vector for the given input vector, returning the log-likelihoods of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput, Int32, Double)	Computes the log-likelihood that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	LogLikelihoods(TInput, Int32, Double)	Predicts a class label vector for each input vector, returning the log-likelihoods of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	MemberwiseClone	Creates a shallow copy of the current Object. (Inherited from Object.)
	Normal(Int32, Int32)	Constructs a new Naïve Bayes Classifier.
	Normal(Int32, Int32, NormalDistribution)	Constructs a new Naïve Bayes Classifier.
	Normal(Int32, Int32, NormalDistribution)	Constructs a new Naïve Bayes Classifier.
	Normal(Int32, Int32, NormalDistribution)	Constructs a new Naïve Bayes Classifier.
	Normal(Int32, Int32, Double)	Constructs a new Naïve Bayes Classifier.
	Normal(Int32, Int32, NormalDistribution, Double)	Constructs a new Naïve Bayes Classifier.
	Probabilities(TInput)	Computes the probabilities that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probabilities(TInput)	Computes the probabilities that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probabilities(TInput, Double)	Computes the probabilities that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probabilities(TInput, Int32)	Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probabilities(TInput, Double)	Computes the probabilities that the given input vector belongs to each of the possible classes. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probabilities(TInput, Int32)	Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probabilities(TInput, Int32, Double)	Predicts a class label vector for the given input vector, returning the probabilities of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probabilities(TInput, Int32, Double)	Predicts a class label vector for each input vector, returning the probabilities of the input vector belonging to each possible class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput)	Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput)	Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Int32)	Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Int32)	Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Double)	Predicts a class label for the given input vector, returning the probability that the input vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Int32)	Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Int32)	Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Int32)	Predicts a class label for each input vector, returning the probability that each vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Int32, Double)	Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Int32, Double)	Computes the probability that the given input vector belongs to the specified classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Probability(TInput, Int32, Double)	Predicts a class label for each input vector, returning the probability that each vector belongs to its predicted class. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Save(Stream)	Obsolete. Saves the Naïve Bayes model to a stream.
	Save(String)	Obsolete. Saves the Naïve Bayes model to a file.
	Score(TInput)	Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier). (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput)	Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier). (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput, Int32)	Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Score(TInput, Int32)	Predicts a class label for the input vector, returning a numerical score measuring the strength of association of the input vector to its most strongly related class. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput, Double)	Computes a numerical score measuring the association between the given input vector and its most strongly associated class (as predicted by the classifier). (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput, Int32)	Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput, Int32)	Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput, Int32)	Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput, Int32, Double)	Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput, Int32, Double)	Computes a numerical score measuring the association between the given input vector and a given classIndex. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Score(TInput, Int32, Double)	Predicts a class label for each input vector, returning a numerical score measuring the strength of association of the input vector to the most strongly related class. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Scores(TInput)	Computes a numerical score measuring the association between the given input vector and each class. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Scores(TInput)	Computes a numerical score measuring the association between the given input vector and each class. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Scores(TInput, Double)	Computes a numerical score measuring the association between the given input vector and each class. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Scores(TInput, Int32)	Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Scores(TInput, Double)	Computes a numerical score measuring the association between the given input vector and each class. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Scores(TInput, Int32)	Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Scores(TInput, Int32, Double)	Predicts a class label vector for the given input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. (Inherited from MulticlassScoreClassifierBaseTInput.)
	Scores(TInput, Int32, Double)	Predicts a class label vector for each input vector, returning a numerical score measuring the strength of association of the input vector to each of the possible classes. (Inherited from MulticlassScoreClassifierBaseTInput.)
	ToMulticlass	Views this instance as a multi-class generative classifier. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	ToMultilabel	Views this instance as a multi-label generative classifier, giving access to more advanced methods, such as the prediction of one-hot vectors. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	ToString	Returns a string that represents the current object. (Inherited from Object.)
	Transform(TInput)	Applies the transformation to an input, producing an associated output. (Inherited from ClassifierBaseTInput, TClasses.)
	Transform(TInput)	Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from TransformBaseTInput, TOutput.)
	Transform(TInput, TClasses)	Applies the transformation to an input, producing an associated output. (Inherited from ClassifierBaseTInput, TClasses.)
	Transform(TInput, Boolean)	Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBaseTInput.)
	Transform(TInput, Int32)	Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBaseTInput.)
	Transform(TInput, Boolean)	Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBaseTInput.)
	Transform(TInput, Double)	Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBaseTInput.)
	Transform(TInput, Int32)	Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBaseTInput.)
	Transform(TInput, Double)	Applies the transformation to an input, producing an associated output. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)
	Transform(TInput, Double)	Applies the transformation to an input, producing an associated output. (Inherited from MulticlassLikelihoodClassifierBaseTInput.)

Extension Methods

	Name	Description
	HasMethod	Checks whether an object implements a method with the given name. (Defined by ExtensionMethods.)
	IsEqual	Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by Matrix.)
	To(Type)	Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
	To<T>	Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)

Remarks

A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions. A more descriptive term for the underlying probability model would be "independent feature model".

In simple terms, a naive Bayes classifier assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature, given the class variable. In spite of their naive design and apparently over-simplified assumptions, naive Bayes classifiers have worked quite well in many complex real-world situations.
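
The independence assumption described above can be written compactly. The notation is introduced here for illustration only: c is a class, x = (x_1, ..., x_n) is an input with n features, and ĉ is the predicted class.

```latex
P(c \mid x_1, \dots, x_n) \;\propto\; P(c) \prod_{i=1}^{n} P(x_i \mid c),
\qquad
\hat{c} \;=\; \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(x_i \mid c)
```

Each factor P(x_i | c) is estimated separately per feature, which is what makes the model cheap to train even when the joint distribution of the features would be hard to estimate.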

This class implements a discrete (integer-valued) Naive Bayes classifier. There is also a special named constructor to create classifiers assuming normal distributions for each variable. For arbitrary distribution classifiers, please see NaiveBayes<TDistribution>.

References:

Wikipedia contributors. "Naive Bayes classifier." Wikipedia, The Free Encyclopedia. Wikipedia, The Free Encyclopedia, 16 Dec. 2011. Web. 5 Jan. 2012.

Examples

In this example, we will be using the famous Play Tennis example by Tom Mitchell (1998). In Mitchell's example, one would like to infer whether a person would play tennis based solely on four input variables. Those variables are all categorical, meaning that there is no order between the possible values of a variable (e.g. there is no order relationship between Sunny and Rain; neither is bigger or smaller than the other, they are just distinct). The rows, or instances, presented below represent days on which the behavior of the person was recorded and annotated, and together they form our set of observations for learning:

DataTable data = new DataTable("Mitchell's Tennis Example");

data.Columns.Add("Day", "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis");

data.Rows.Add("D1", "Sunny", "Hot", "High", "Weak", "No");
data.Rows.Add("D2", "Sunny", "Hot", "High", "Strong", "No");
data.Rows.Add("D3", "Overcast", "Hot", "High", "Weak", "Yes");
data.Rows.Add("D4", "Rain", "Mild", "High", "Weak", "Yes");
data.Rows.Add("D5", "Rain", "Cool", "Normal", "Weak", "Yes");
data.Rows.Add("D6", "Rain", "Cool", "Normal", "Strong", "No");
data.Rows.Add("D7", "Overcast", "Cool", "Normal", "Strong", "Yes");
data.Rows.Add("D8", "Sunny", "Mild", "High", "Weak", "No");
data.Rows.Add("D9", "Sunny", "Cool", "Normal", "Weak", "Yes");
data.Rows.Add("D10", "Rain", "Mild", "Normal", "Weak", "Yes");
data.Rows.Add("D11", "Sunny", "Mild", "Normal", "Strong", "Yes");
data.Rows.Add("D12", "Overcast", "Mild", "High", "Strong", "Yes");
data.Rows.Add("D13", "Overcast", "Hot", "Normal", "Weak", "Yes");
data.Rows.Add("D14", "Rain", "Mild", "High", "Strong", "No");

Note: The DataTable representation is not required; the NaiveBayes classifier can also be trained directly on integer arrays containing the integer codewords.

In order to estimate a discrete Naive Bayes, we will first convert this problem to a simpler representation. Since all variables are categorical, it does not matter whether they are represented as strings or as numbers, since both are just symbols for the event they represent. Since numbers are easier to work with than text strings, we will convert the problem to use a discrete alphabet through the use of a codebook.

A codebook effectively transforms each distinct possible value of a variable into an integer symbol. For example, “Sunny” could as well be represented by the integer label 0, “Overcast” by 1, “Rain” by 2, and the same goes for the other variables. So:
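
To make the idea of a codebook concrete, here is a minimal sketch of such a mapping for a single variable. `SimpleCodebook` is a hypothetical helper written for this illustration, not the framework's Codification class, and it assumes that symbols are assigned in order of first appearance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not Accord's Codification class): assigns
// integer symbols to distinct strings in order of first appearance.
public sealed class SimpleCodebook
{
    private readonly Dictionary<string, int> toSymbol = new Dictionary<string, int>();
    private readonly List<string> toLabel = new List<string>();

    // Returns the symbol for a label, creating a new symbol if it is unseen.
    public int Encode(string label)
    {
        int symbol;
        if (!toSymbol.TryGetValue(label, out symbol))
        {
            symbol = toLabel.Count;
            toSymbol[label] = symbol;
            toLabel.Add(label);
        }
        return symbol;
    }

    // Returns the original label for a previously assigned symbol.
    public string Decode(int symbol)
    {
        return toLabel[symbol];
    }

    // Number of distinct values seen so far for this variable.
    public int NumberOfSymbols
    {
        get { return toLabel.Count; }
    }
}

public static class CodebookDemo
{
    public static void Main()
    {
        var outlook = new SimpleCodebook();
        foreach (string value in new[] { "Sunny", "Sunny", "Overcast", "Rain" })
            Console.WriteLine(value + " -> " + outlook.Encode(value));
        // Sunny -> 0, Sunny -> 0, Overcast -> 1, Rain -> 2

        Console.WriteLine(outlook.Decode(2)); // Rain
    }
}
```

A full codebook would keep one such mapping per column, so that the same integer can mean different things in different variables.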

// Create a new codification codebook to
// convert strings into discrete symbols
Codification codebook = new Codification(data,
    "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis");

// Extract input and output pairs to train
DataTable symbols = codebook.Apply(data);
int[][] inputs = symbols.ToArray<int>("Outlook", "Temperature", "Humidity", "Wind");
int[] outputs = symbols.ToArray<int>("PlayTennis");

Now that we have our learning input/output pairs, we should specify our Bayes model. We will be trying to build a model to predict the last column, entitled “PlayTennis”. For this, we will be using “Outlook”, “Temperature”, “Humidity” and “Wind” as predictors (the variables which we will use for our decision). Since those are categorical, we must specify, at the moment of creation of our Bayes model, the number of possible symbols for each of those variables.

// Create a new Naive Bayes learning
var learner = new NaiveBayesLearning();

// Learn a Naive Bayes model from the examples
NaiveBayes nb = learner.Learn(inputs, outputs);

Now that we have created and estimated our classifier, we can query the classifier for new input samples through the Decide method.

// Suppose we would like to know whether one should play tennis on a
// sunny, cool, humid and windy day. Let us first encode this instance
int[] instance = codebook.Translate("Sunny", "Cool", "High", "Strong");

// Let us obtain the numeric output that represents the answer
int c = nb.Decide(instance); // answer will be 0

// Now let us convert the numeric output to an actual "Yes" or "No" answer
string result = codebook.Translate("PlayTennis", c); // answer will be "No"

// We can also extract the probabilities for each possible answer
double[] probs = nb.Probabilities(instance); // { 0.795, 0.205 }
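
The probabilities in the last comment can be checked by hand from the 14 rows of the table. The computation below uses plain relative frequencies with no smoothing, which is an assumption made for this illustration; the library's estimates may differ slightly if it regularizes the counts. For the instance x = (Sunny, Cool, High, Strong):

```latex
\begin{aligned}
P(\text{No}) \prod_i P(x_i \mid \text{No})
  &= \tfrac{5}{14} \cdot \tfrac{3}{5} \cdot \tfrac{1}{5} \cdot \tfrac{4}{5} \cdot \tfrac{3}{5} \approx 0.0206 \\
P(\text{Yes}) \prod_i P(x_i \mid \text{Yes})
  &= \tfrac{9}{14} \cdot \tfrac{2}{9} \cdot \tfrac{3}{9} \cdot \tfrac{3}{9} \cdot \tfrac{3}{9} \approx 0.0053 \\
P(\text{No} \mid x) &= \frac{0.0206}{0.0206 + 0.0053} \approx 0.795,
\qquad
P(\text{Yes} \mid x) \approx 0.205
\end{aligned}
```

Here 5 of the 14 days are labeled "No", and among those 3 are Sunny, 1 is Cool, 4 have High humidity and 3 have Strong wind; the "Yes" factors are counted the same way over the 9 "Yes" days.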

Please note that, while the example uses a DataTable to exemplify how data stored into tables can be loaded in the framework, it is not necessary at all to use DataTables in your own, final code. For example, please consider the same example shown above, but without DataTables:

string[] columnNames = { "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis" };

string[][] data =
{
    new string[] { "Sunny", "Hot", "High", "Weak", "No" },
    new string[] { "Sunny", "Hot", "High", "Strong", "No" },
    new string[] { "Overcast", "Hot", "High", "Weak", "Yes" },
    new string[] { "Rain", "Mild", "High", "Weak", "Yes" },
    new string[] { "Rain", "Cool", "Normal", "Weak", "Yes" },
    new string[] { "Rain", "Cool", "Normal", "Strong", "No" },
    new string[] { "Overcast", "Cool", "Normal", "Strong", "Yes" },
    new string[] { "Sunny", "Mild", "High", "Weak", "No" },
    new string[] { "Sunny", "Cool", "Normal", "Weak", "Yes" },
    new string[] {  "Rain", "Mild", "Normal", "Weak", "Yes" },
    new string[] {  "Sunny", "Mild", "Normal", "Strong", "Yes" },
    new string[] {  "Overcast", "Mild", "High", "Strong", "Yes" },
    new string[] {  "Overcast", "Hot", "Normal", "Weak", "Yes" },
    new string[] {  "Rain", "Mild", "High", "Strong", "No" },
};

// Create a new codification codebook to
// convert strings into discrete symbols
Codification codebook = new Codification(columnNames, data);

// Extract input and output pairs to train
int[][] symbols = codebook.Transform(data);
int[][] inputs = symbols.Get(null, 0, -1); // Gets all rows, with columns from 0 up to (but not including) the last
int[] outputs = symbols.GetColumn(-1);     // Gets only the last column

// Create a new Naive Bayes learning
var learner = new NaiveBayesLearning();

NaiveBayes nb = learner.Learn(inputs, outputs);

// Suppose we would like to know whether one should play tennis on a
// sunny, cool, humid and windy day. Let us first encode this instance
int[] instance = codebook.Translate("Sunny", "Cool", "High", "Strong");

// Let us obtain the numeric output that represents the answer
int c = nb.Decide(instance); // answer will be 0

// Now let us convert the numeric output to an actual "Yes" or "No" answer
string result = codebook.Translate("PlayTennis", c); // answer will be "No"

// We can also extract the probabilities for each possible answer
double[] probs = nb.Probabilities(instance); // { 0.795, 0.205 }

In this second example, we will be creating a simple multi-class classification problem using integer vectors and learning a discrete Naive Bayes on those vectors.

// Let's say we have the following data to be classified
// into three possible classes. Those are the samples:
// 
int[][] inputs =
{
    //               input      output
    new int[] { 0, 1, 1, 0 }, //  0 
    new int[] { 0, 1, 0, 0 }, //  0
    new int[] { 0, 0, 1, 0 }, //  0
    new int[] { 0, 1, 1, 0 }, //  0
    new int[] { 0, 1, 0, 0 }, //  0
    new int[] { 1, 0, 0, 0 }, //  1
    new int[] { 1, 0, 0, 0 }, //  1
    new int[] { 1, 0, 0, 1 }, //  1
    new int[] { 0, 0, 0, 1 }, //  1
    new int[] { 0, 0, 0, 1 }, //  1
    new int[] { 1, 1, 1, 1 }, //  2
    new int[] { 1, 0, 1, 1 }, //  2
    new int[] { 1, 1, 0, 1 }, //  2
    new int[] { 0, 1, 1, 1 }, //  2
    new int[] { 1, 1, 1, 1 }, //  2
};

int[] outputs = // those are the class labels
{
    0, 0, 0, 0, 0,
    1, 1, 1, 1, 1,
    2, 2, 2, 2, 2,
};

// Let us create a learning algorithm
var learner = new NaiveBayesLearning();

// and teach a model on the data examples
NaiveBayes nb = learner.Learn(inputs, outputs);

// Now, let's test the model output for the first input sample:
int answer = nb.Decide(new int[] { 0, 1, 1, 0 }); // should be 0
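
To see what the classifier is estimating on this data set, the following sketch computes the unnormalized class scores (prior times per-feature likelihoods) directly from counts. `NaiveBayesByHand` and its members are hypothetical names used only for this illustration; the code does not call the framework and applies no smoothing, so the library's values may differ slightly.

```csharp
using System;

// Hypothetical from-scratch estimate, independent of the framework.
public static class NaiveBayesByHand
{
    public static readonly int[][] SampleInputs =
    {
        new[] { 0, 1, 1, 0 }, new[] { 0, 1, 0, 0 }, new[] { 0, 0, 1, 0 },
        new[] { 0, 1, 1, 0 }, new[] { 0, 1, 0, 0 }, // class 0
        new[] { 1, 0, 0, 0 }, new[] { 1, 0, 0, 0 }, new[] { 1, 0, 0, 1 },
        new[] { 0, 0, 0, 1 }, new[] { 0, 0, 0, 1 }, // class 1
        new[] { 1, 1, 1, 1 }, new[] { 1, 0, 1, 1 }, new[] { 1, 1, 0, 1 },
        new[] { 0, 1, 1, 1 }, new[] { 1, 1, 1, 1 }, // class 2
    };

    public static readonly int[] SampleOutputs =
    {
        0, 0, 0, 0, 0,
        1, 1, 1, 1, 1,
        2, 2, 2, 2, 2,
    };

    // Returns P(class) * prod_i P(x_i | class) for each class, estimated
    // from raw counts. Assumes every class has at least one sample.
    public static double[] Scores(int[][] inputs, int[] outputs, int classes, int[] x)
    {
        var scores = new double[classes];
        for (int c = 0; c < classes; c++)
        {
            int classCount = 0;
            var matches = new int[x.Length];
            for (int n = 0; n < inputs.Length; n++)
            {
                if (outputs[n] != c) continue;
                classCount++;
                for (int i = 0; i < x.Length; i++)
                    if (inputs[n][i] == x[i]) matches[i]++;
            }

            double score = (double)classCount / inputs.Length; // prior
            for (int i = 0; i < x.Length; i++)
                score *= (double)matches[i] / classCount;      // likelihoods
            scores[c] = score;
        }
        return scores;
    }

    public static void Main()
    {
        double[] s = Scores(SampleInputs, SampleOutputs, 3, new[] { 0, 1, 1, 0 });
        Console.WriteLine(string.Join(", ", s));
    }
}
```

For the sample { 0, 1, 1, 0 }, class 0 is the only class with a non-zero score: no class 1 sample has a 1 in the second position, and no class 2 sample has a 0 in the last position.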

Like all other learning algorithms in the framework, it is also possible to obtain a better measure of the performance of the Naive Bayes algorithm using cross-validation, as shown in the example below:

// Ensure we have reproducible results
Accord.Math.Random.Generator.Seed = 0;

// Let's say we have the following data to be classified
// into three possible classes. Those are the samples:
// 
int[][] inputs =
{
    //               input      output
    new int[] { 0, 1, 1, 0 }, //  0 
    new int[] { 0, 1, 0, 0 }, //  0
    new int[] { 0, 0, 1, 0 }, //  0
    new int[] { 0, 1, 1, 0 }, //  0
    new int[] { 0, 1, 0, 0 }, //  0
    new int[] { 1, 0, 0, 0 }, //  1
    new int[] { 1, 0, 0, 0 }, //  1
    new int[] { 1, 0, 0, 1 }, //  1
    new int[] { 0, 0, 0, 1 }, //  1
    new int[] { 0, 0, 0, 1 }, //  1
    new int[] { 1, 1, 1, 1 }, //  2
    new int[] { 1, 0, 1, 1 }, //  2
    new int[] { 1, 1, 0, 1 }, //  2
    new int[] { 0, 1, 1, 1 }, //  2
    new int[] { 1, 1, 1, 1 }, //  2
};

int[] outputs = // those are the class labels
{
    0, 0, 0, 0, 0,
    1, 1, 1, 1, 1,
    2, 2, 2, 2, 2,
};

// Let's say we want to measure the cross-validation 
// performance of Naive Bayes on the above data set:
var cv = CrossValidation.Create(

    k: 10, // We will be using 10-fold cross validation

    // First we define the learning algorithm:
    learner: (p) => new NaiveBayesLearning(),

    // Now we have to specify how the n.b. performance should be measured:
    loss: (actual, expected, p) => new ZeroOneLoss(expected).Loss(actual),

    // This function can be used to perform any special
    // operations before the actual learning is done, but
    // here we will just leave it as simple as it can be:
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Finally, we have to pass the input and output data
    // that will be used in cross-validation. 
    x: inputs, y: outputs
);

// After the cross-validation object has been created,
// we can call its .Learn method with the input and 
// output data that will be partitioned into the folds:
var result = cv.Learn(inputs, outputs);

// We can grab some information about the problem:
int numberOfSamples = result.NumberOfSamples; // should be 15
int numberOfInputs = result.NumberOfInputs;   // should be 4
int numberOfOutputs = result.NumberOfOutputs; // should be 3

double trainingError = result.Training.Mean; // should be 0
double validationError = result.Validation.Mean; // should be 0.15 (+/- var. 0.11388888888888887)

// If desired, compute an aggregate confusion matrix for the validation sets:
GeneralConfusionMatrix gcm = result.ToConfusionMatrix(inputs, outputs);
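
As a rough illustration of what the cross-validation above measures, the sketch below shows an averaged zero-one loss (the fraction of misclassified samples) and a simple assignment of samples to folds. `CrossValidationSketch` and its methods are hypothetical names for this illustration; the round-robin split is an assumption and is not how the framework partitions data.

```csharp
using System;

// Hypothetical helpers, independent of the framework's CrossValidation class.
public static class CrossValidationSketch
{
    // Fraction of predictions that differ from the expected labels.
    public static double MeanZeroOneLoss(int[] expected, int[] actual)
    {
        if (expected.Length != actual.Length)
            throw new ArgumentException("Both arrays must have the same length.");

        int errors = 0;
        for (int i = 0; i < expected.Length; i++)
            if (expected[i] != actual[i]) errors++;
        return (double)errors / expected.Length;
    }

    // Assigns each of n samples to one of k folds in round-robin order.
    // A real splitter would normally shuffle the samples first.
    public static int[] FoldIndices(int n, int k)
    {
        var folds = new int[n];
        for (int i = 0; i < n; i++)
            folds[i] = i % k;
        return folds;
    }

    public static void Main()
    {
        int[] expected = { 0, 0, 1, 2 };
        int[] predicted = { 0, 1, 1, 2 };

        // One error out of four samples, i.e. a loss of 0.25
        Console.WriteLine(MeanZeroOneLoss(expected, predicted));

        // With 15 samples and 10 folds, each fold holds only 1 or 2 samples
        Console.WriteLine(string.Join(" ", FoldIndices(15, 10)));
    }
}
```

With so few samples per fold, a single misclassification changes a fold's error by 0.5 or 1.0, which is one reason the validation error on a data set this small has a large variance.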

Reference

Accord.MachineLearning.Bayes Namespace

Accord.MachineLearning.Bayes.NaiveBayesLearning

Accord.MachineLearning.Bayes.NaiveBayes<TDistribution>

Accord.MachineLearning.Bayes.NaiveBayes<TDistribution, TInput>