DecisionTree Class
Namespace: Accord.MachineLearning.DecisionTrees
The DecisionTree type exposes the following members.
Constructors

Name | Description
---|---
DecisionTree | Creates a new DecisionTree to process the given inputs and the given number of possible classes.
Properties

Name | Description
---|---
Attributes | Gets the collection of attributes processed by this tree.
InputCount | Obsolete. Please use the NumberOfInputs property.
NumberOfClasses | Gets the number of classes expected and recognized by the classifier. (Inherited from ClassifierBase<TInput, TClasses>.)
NumberOfInputs | Gets the number of inputs accepted by the model. (Inherited from TransformBase<TInput, TOutput>.)
NumberOfOutputs | Gets the number of outputs generated by the model. (Inherited from TransformBase<TInput, TOutput>.)
OutputClasses | Obsolete. Please use the NumberOfOutputs property instead.
Root | Gets or sets the root node for this tree.
Methods

Name | Description
---|---
Compute(Double) | Obsolete. Please use the Decide() method instead.
Compute(Int32) | Obsolete. Please use the Decide() method instead.
Compute(Int32) | Obsolete. Please use the Decide() method instead.
Compute(Double, DecisionNode) | Obsolete. Please use the Decide() method instead.
Decide(TInput) | Computes class-label decisions for a given set of input vectors. (Inherited from ClassifierBase<TInput, TClasses>.)
Decide(Int32) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Single) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Single) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Double) | Computes the tree decision for a given input. (Overrides ClassifierBase<TInput, TClasses>.Decide(TInput).)
Decide(Int32) | Computes the tree decision for a given input. (Overrides MulticlassClassifierBase.Decide(Int32).)
Decide(Nullable<Int32>) | Computes the tree decision for a given input.
Decide(Nullable<Int32>) | Computes the tree decision for a given input.
Decide(TInput, TClasses) | Computes a class-label decision for a given input. (Inherited from ClassifierBase<TInput, TClasses>.)
Decide(Int32, Boolean) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase.)
Decide(Int32, Double) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase.)
Decide(Int32, Int32) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase.)
Decide(Int32, Boolean) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Int32, Double) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Int32, Double) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Int32, Int32) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Int32, Int32) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Single, Boolean) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase.)
Decide(Single, Double) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase.)
Decide(Single, Int32) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase.)
Decide(Single, Boolean) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Single, Double) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Single, Double) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Single, Int32) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(Single, Int32) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase.)
Decide(TInput, Boolean) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Double) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Int32) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Double) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(Double, DecisionNode) | Computes the tree decision for a given input.
Decide(Nullable<Int32>, DecisionNode) | Computes the tree decision for a given input.
Decide(Nullable<Int32>, Int32) | Computes the tree decision for a given input.
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetEnumerator | Returns an enumerator that iterates through the tree.
GetHashCode | Serves as the default hash function. (Inherited from Object.)
GetHeight | Computes the height of the tree, defined as the greatest distance (in links) between the tree's root node and its leaves.
GetType | Gets the Type of the current instance. (Inherited from Object.)
Load(Stream) | Obsolete. Please use Load<T>(Stream, SerializerCompression).
Load(String) | Obsolete. Please use Load<T>(String).
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
Save(Stream) | Obsolete. Please use Save<T>(T, Stream, SerializerCompression) (or use it as an extension method).
Save(String) | Obsolete. Please use Save<T>(T, String) (or use it as an extension method).
ToAssembly(String, String) | Creates a .NET assembly (.dll) containing a static class of the given name implementing the decision tree. The class will contain a single static Compute method implementing the tree.
ToAssembly(String, String, String) | Creates a .NET assembly (.dll) containing a static class of the given name implementing the decision tree. The class will contain a single static Compute method implementing the tree.
ToCode(String) | Generates a C# class implementing the decision tree.
ToCode(TextWriter, String) | Generates a C# class implementing the decision tree.
ToExpression | Creates an Expression Tree representation of this decision tree, which can in turn be compiled into code.
ToMultilabel | Views this instance as a multi-label classifier, giving access to more advanced methods, such as the prediction of one-hot vectors. (Inherited from MulticlassClassifierBase<TInput>.)
ToRules | Transforms the tree into a set of decision rules.
ToString | Returns a string that represents the current object. (Inherited from Object.)
Transform(TInput) | Applies the transformation to an input, producing an associated output. (Inherited from ClassifierBase<TInput, TClasses>.)
Transform(Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(TInput) | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from TransformBase<TInput, TOutput>.)
Transform(TInput, TClasses) | Applies the transformation to an input, producing an associated output. (Inherited from ClassifierBase<TInput, TClasses>.)
Transform(Int32, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Int32, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Int32, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Int32, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Int32, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Int32, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Int32, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(Single, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase.)
Transform(TInput, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Traverse(DecisionTreeTraversalMethod) | Traverses the tree using a tree traversal method. Can be iterated with a foreach loop.
Traverse(DecisionTreeTraversalMethod, DecisionNode) | Traverses a subtree using a tree traversal method. Can be iterated with a foreach loop.
Extension Methods

Name | Description
---|---
HasMethod | Checks whether an object implements a method with the given name. (Defined by ExtensionMethods.)
IsEqual | Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by Matrix.)
SetEquals<DecisionNode> | Compares two enumerables for set equality. Two enumerables are set equal if they contain the same elements, but not necessarily in the same order. (Defined by Matrix.)
To(Type) | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
To<T> | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
Remarks

Represents a decision tree which can be compiled to code at run-time. For sample usage and examples of learning, please see the documentation pages for the ID3 and C4.5 learning algorithms.
It is also possible to create random forests using the random forest learning algorithm.
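As a brief illustration of that possibility, the sketch below trains a small random forest with the RandomForestLearning class. This is a minimal, hedged example rather than part of the original page: the data and the NumberOfTrees value are made up for illustration only.

```csharp
// A minimal sketch of random forest learning; the data below is illustrative only.
double[][] inputs =
{
    new double[] { 0, 0 },
    new double[] { 0, 1 },
    new double[] { 1, 0 },
    new double[] { 1, 1 },
};

int[] outputs = { 0, 1, 1, 0 };

// Create a random forest teacher (NumberOfTrees chosen arbitrarily here):
var teacher = new RandomForestLearning()
{
    NumberOfTrees = 10
};

// Learn a forest of decision trees from the data:
RandomForest forest = teacher.Learn(inputs, outputs);

// The forest can be queried just like a single tree:
int[] predicted = forest.Decide(inputs);
```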
This example shows the simplest way to induce a decision tree with discrete variables.
```csharp
// In this example, we will learn a decision tree directly from integer
// matrices that define the inputs and outputs of our learning problem.

int[][] inputs =
{
    new int[] { 0, 0 },
    new int[] { 0, 1 },
    new int[] { 1, 0 },
    new int[] { 1, 1 },
};

int[] outputs = // xor between inputs[0] and inputs[1]
{
    0, 1, 1, 0
};

// Create an ID3 learning algorithm
ID3Learning teacher = new ID3Learning();

// Learn a decision tree for the XOR problem
var tree = teacher.Learn(inputs, outputs);

// Compute the error in the learning
double error = new ZeroOneLoss(outputs).Loss(tree.Decide(inputs));

// The tree can now be queried for new examples:
int[] predicted = tree.Decide(inputs); // should be { 0, 1, 1, 0 }
```
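Once a tree has been learned as above, its structure can be inspected through the members listed earlier (GetHeight, GetEnumerator, ToRules). The following is a minimal sketch building on the tree variable from the previous example; the DecisionNode member names IsLeaf and the use of plain ToString on the rule set are assumptions, not taken from this page.

```csharp
// A minimal sketch (assumptions noted above): inspecting the tree learned in the previous example.

// Height of the learned tree (greatest distance, in links, from the root to a leaf):
int height = tree.GetHeight();

// The tree is enumerable, so its nodes can be visited with a plain foreach:
int leaves = 0;
foreach (DecisionNode node in tree)
{
    if (node.IsLeaf) // assumption: leaf nodes are flagged by an IsLeaf member
        leaves++;    // count terminal decisions
}

// The tree can also be flattened into a set of decision rules:
DecisionSet rules = tree.ToRules();
string text = rules.ToString(); // human-readable form of the rules (assumption)
```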
This example shows a common textbook example, and how to induce a decision tree using a codebook to convert string (text) variables into discrete symbols.
```csharp
// In this example, we will be using the famous Play Tennis example by Tom Mitchell (1998).
// In Mitchell's example, one would like to infer if a person would play tennis or not
// based solely on four input variables. Those variables are all categorical, meaning that
// there is no order between the possible values for the variable (i.e. there is no order
// relationship between Sunny and Rain; one is not bigger nor smaller than the other, they are
// just distinct). Moreover, the rows, or instances, presented below represent days on which the
// behavior of the person has been registered and annotated, pretty much building our set of
// observation instances for learning:

// Note: this example uses DataTables to represent the input data, but this is not required.
DataTable data = new DataTable("Mitchell's Tennis Example");

data.Columns.Add("Day", "Outlook", "Temperature", "Humidity", "Wind", "PlayTennis");

data.Rows.Add("D1", "Sunny", "Hot", "High", "Weak", "No");
data.Rows.Add("D2", "Sunny", "Hot", "High", "Strong", "No");
data.Rows.Add("D3", "Overcast", "Hot", "High", "Weak", "Yes");
data.Rows.Add("D4", "Rain", "Mild", "High", "Weak", "Yes");
data.Rows.Add("D5", "Rain", "Cool", "Normal", "Weak", "Yes");
data.Rows.Add("D6", "Rain", "Cool", "Normal", "Strong", "No");
data.Rows.Add("D7", "Overcast", "Cool", "Normal", "Strong", "Yes");
data.Rows.Add("D8", "Sunny", "Mild", "High", "Weak", "No");
data.Rows.Add("D9", "Sunny", "Cool", "Normal", "Weak", "Yes");
data.Rows.Add("D10", "Rain", "Mild", "Normal", "Weak", "Yes");
data.Rows.Add("D11", "Sunny", "Mild", "Normal", "Strong", "Yes");
data.Rows.Add("D12", "Overcast", "Mild", "High", "Strong", "Yes");
data.Rows.Add("D13", "Overcast", "Hot", "Normal", "Weak", "Yes");
data.Rows.Add("D14", "Rain", "Mild", "High", "Strong", "No");

// In order to learn a decision tree, we will first convert this problem to a simpler
// representation. Since all variables are categories, it does not matter if they are represented
// as strings or numbers, since both are just symbols for the event they represent. Since numbers
// are more easily representable than text strings, we will convert the problem to a discrete
// alphabet through the use of an Accord.Statistics.Filters.Codification codebook.

// A codebook effectively transforms any distinct possible value for a variable into an integer
// symbol. For example, "Sunny" could as well be represented by the integer label 0, "Overcast"
// by 1, "Rain" by 2, and the same goes for the other variables. So:

// Create a new codification codebook to
// convert strings into integer symbols
var codebook = new Codification(data);

// Translate our training data into integer symbols using our codebook:
DataTable symbols = codebook.Apply(data);
int[][] inputs = symbols.ToArray<int>("Outlook", "Temperature", "Humidity", "Wind");
int[] outputs = symbols.ToArray<int>("PlayTennis");

// For this task, in which we have only categorical variables, the simplest choice
// to induce a decision tree is to use the ID3 algorithm by Quinlan. Let's do it:

// Create a teacher ID3 algorithm
var id3learning = new ID3Learning()
{
    // Now that we already have our learning input/output pairs, we should specify our
    // decision tree. We will be trying to build a tree to predict the last column, entitled
    // "PlayTennis". For this, we will be using the "Outlook", "Temperature", "Humidity" and
    // "Wind" columns as predictors (variables which we will use for our decision). Since those
    // are categorical, we must specify, at the moment of creation of our tree, the
    // characteristics of each of those variables. So:

    new DecisionVariable("Outlook",     3), // 3 possible values (Sunny, overcast, rain)
    new DecisionVariable("Temperature", 3), // 3 possible values (Hot, mild, cool)
    new DecisionVariable("Humidity",    2), // 2 possible values (High, normal)
    new DecisionVariable("Wind",        2)  // 2 possible values (Weak, strong)

    // Note: It is also possible to create a DecisionVariable[] from a codebook:
    // DecisionVariable[] attributes = DecisionVariable.FromCodebook(codebook);
};

// Learn the training instances!
DecisionTree tree = id3learning.Learn(inputs, outputs);

// Compute the training error when predicting training instances
double error = new ZeroOneLoss(outputs).Loss(tree.Decide(inputs));

// The tree can now be queried for new examples through
// its Decide method. For example, we can create a query
int[] query = codebook.Transform(new[,]
{
    { "Outlook",     "Sunny"  },
    { "Temperature", "Hot"    },
    { "Humidity",    "High"   },
    { "Wind",        "Strong" }
});

// And then predict the label using
int predicted = tree.Decide(query); // result will be 0

// We can translate it back to strings using
string answer = codebook.Revert("PlayTennis", predicted); // answer will be: "No"
```
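After learning, the tree can also be exported or persisted using the members listed above. The snippet below is a hedged sketch, not taken from the original page: it assumes ToCode(String) returns the generated C# source as a string, relies on Accord.IO.Serializer for saving and loading, and uses the purely illustrative file name "tree.bin".

```csharp
// A hedged sketch of exporting and persisting the tree learned above.

// Generate a C# class implementing the decision tree
// (assumption: ToCode(String) returns the generated source as a string):
string code = tree.ToCode("PlayTennisTree");

// Persist the learned tree to disk and load it back using Accord.IO.Serializer
// (the file name "tree.bin" is only illustrative):
Accord.IO.Serializer.Save(tree, "tree.bin");
DecisionTree loaded = Accord.IO.Serializer.Load<DecisionTree>("tree.bin");
```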
For more examples with discrete variables, please see the ID3Learning documentation page.
This example shows the simplest way to induce a decision tree with continuous variables.
```csharp
// In this example, we will process the famous Fisher's Iris dataset, in
// which the task is to classify whether the features of an Iris flower
// belong to an Iris setosa, an Iris versicolor, or an Iris virginica:
//
//  - https://en.wikipedia.org/wiki/Iris_flower_data_set
//

// First, let's load the dataset into an array of text that we can process
string[][] text = Resources.iris_data.Split(new[] { "\r\n" },
    StringSplitOptions.RemoveEmptyEntries).Apply(x => x.Split(','));

// The first four columns contain the flower features
double[][] inputs = text.GetColumns(0, 1, 2, 3).To<double[][]>();

// The last column contains the expected flower type
string[] labels = text.GetColumn(4);

// Since the labels are represented as text, the first step is to convert
// those text labels into integer class labels, so we can process them
// more easily. For this, we will create a codebook to encode class labels:
//
var codebook = new Codification("Output", labels);

// With the codebook, we can convert the labels:
int[] outputs = codebook.Translate("Output", labels);

// And we can use C4.5 for learning:
C45Learning teacher = new C45Learning();

// Finally, induce the tree from the data:
var tree = teacher.Learn(inputs, outputs);

// To get the estimated class labels, we can use
int[] predicted = tree.Decide(inputs);

// The classification error (0.0266) can be computed as
double error = new ZeroOneLoss(outputs).Loss(predicted);

// Moreover, we may decide to convert our tree to a set of rules:
DecisionSet rules = tree.ToRules();

// And using the codebook, we can inspect the tree reasoning:
string ruleText = rules.ToString(codebook, "Output",
    System.Globalization.CultureInfo.InvariantCulture);

// The output is:
string expected = @"Iris-setosa =: (2 <= 2.45)
Iris-versicolor =: (2 > 2.45) && (3 <= 1.75) && (0 <= 7.05) && (1 <= 2.85)
Iris-versicolor =: (2 > 2.45) && (3 <= 1.75) && (0 <= 7.05) && (1 > 2.85)
Iris-versicolor =: (2 > 2.45) && (3 > 1.75) && (0 <= 5.95) && (1 > 3.05)
Iris-virginica =: (2 > 2.45) && (3 <= 1.75) && (0 > 7.05)
Iris-virginica =: (2 > 2.45) && (3 > 1.75) && (0 > 5.95)
Iris-virginica =: (2 > 2.45) && (3 > 1.75) && (0 <= 5.95) && (1 <= 3.05)
";
```
For more examples with continuous variables, please see the C45Learning documentation page.
The next example shows how to estimate the true performance of a decision tree model using cross-validation:
```csharp
// Ensure we have reproducible results
Accord.Math.Random.Generator.Seed = 0;

// Get some data to be learned. We will be using the Wisconsin
// (Diagnostic) Breast Cancer dataset, where the goal is to determine
// whether the characteristics extracted from a breast cancer exam
// correspond to a malignant or benign type of cancer:
var data = new WisconsinDiagnosticBreastCancer();
double[][] input = data.Features; // 569 samples, 30-dimensional features
int[] output = data.ClassLabels;  // 569 samples, 2 different class labels

// Let's say we want to measure the cross-validation performance of
// a decision tree with a maximum tree height of 5 and where variables
// are able to join the decision path at most 2 times during evaluation:
var cv = CrossValidation.Create(

    k: 10, // We will be using 10-fold cross validation

    learner: (p) => new C45Learning() // here we create the learning algorithm
    {
        Join = 2,
        MaxHeight = 5
    },

    // Now we have to specify how the tree performance should be measured:
    loss: (actual, expected, p) => new ZeroOneLoss(expected).Loss(actual),

    // This function can be used to perform any special
    // operations before the actual learning is done, but
    // here we will just leave it as simple as it can be:
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Finally, we have to pass the input and output data
    // that will be used in cross-validation.
    x: input, y: output
);

// After the cross-validation object has been created,
// we can call its .Learn method with the input and
// output data that will be partitioned into the folds:
var result = cv.Learn(input, output);

// We can grab some information about the problem:
int numberOfSamples = result.NumberOfSamples; // should be 569
int numberOfInputs = result.NumberOfInputs;   // should be 30
int numberOfOutputs = result.NumberOfOutputs; // should be 2

double trainingError = result.Training.Mean;     // should be 0.017771153143274855
double validationError = result.Validation.Mean; // should be 0.0755952380952381

// If desired, compute an aggregate confusion matrix for the validation sets:
GeneralConfusionMatrix gcm = result.ToConfusionMatrix(input, output);
double accuracy = gcm.Accuracy; // result should be 0.92442882249560632
```