Discretization<TInput, TOutput> Class
Namespace: Accord.Statistics.Filters
[SerializableAttribute]
public class Discretization<TInput, TOutput> : BaseFilter<Discretization<TInput, TOutput>.Options, Discretization<TInput, TOutput>>, IAutoConfigurableFilter, IFilter, ITransform<TInput[], TOutput[]>, ICovariantTransform<TInput[], TOutput[]>, ITransform, IUnsupervisedLearning<Discretization<TInput, TOutput>, TInput[], TOutput[]>
The Discretization<TInput, TOutput> type exposes the following members.

Constructors

Name | Description
---|---
Discretization<TInput, TOutput>() | Creates a new Discretization filter.
Discretization<TInput, TOutput>(DataTable) | Creates a new Discretization filter.
Discretization<TInput, TOutput>(String) | Creates a new Discretization filter.
Discretization<TInput, TOutput>(DataTable, String) | Creates a new Discretization filter.
Discretization<TInput, TOutput>(String, Object) | Creates a new Discretization filter.
Discretization<TInput, TOutput>(String, TInput) | Creates a new Discretization filter.


Properties

Name | Description
---|---
Active | Gets or sets whether this filter is active. An inactive filter passes the input table through as output, unchanged. (Inherited from BaseFilter<TOptions, TFilter>.)
Columns | Gets the collection of filter options. (Inherited from BaseFilter<TOptions, TFilter>.)
Item[Int32] | Gets options associated with a given variable (data column). (Inherited from BaseFilter<TOptions, TFilter>.)
Item[String] | Gets options associated with a given variable (data column). (Inherited from BaseFilter<TOptions, TFilter>.)
NumberOfInputs | Gets the number of inputs accepted by the model. (Inherited from BaseFilter<TOptions, TFilter>.)
NumberOfOutputs | Gets the number of outputs generated by the model.
Token | Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. (Inherited from BaseFilter<TOptions, TFilter>.)


Methods

Name | Description
---|---
Add(TOptions) | Adds a new column options definition to the collection. (Inherited from BaseFilter<TOptions, TFilter>.)
Add(String, Expression<Func<TInput, Boolean>>, Expression<Func<TInput, TOutput>>) | Adds the specified matching rule to a column.
Add(String, Expression<Func<TInput, Boolean>>, TOutput) | Adds the specified matching rule to a column.
Apply(DataTable) | Applies the filter to a DataTable. (Inherited from BaseFilter<TOptions, TFilter>.)
Apply(DataTable, String) | Applies the filter to a DataTable. (Inherited from BaseFilter<TOptions, TFilter>.)
Detect | Auto-detects the filter options by analyzing a given DataTable.
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetEnumerator | Returns an enumerator that iterates through the collection. (Inherited from BaseFilter<TOptions, TFilter>.)
GetHashCode | Serves as the default hash function. (Inherited from Object.)
GetType | Gets the Type of the current instance. (Inherited from Object.)
Learn(DataTable, Double) | Learns a model that can map the given inputs to the desired outputs.
Learn(TInput, Double) | Learns a model that can map the given inputs to the desired outputs.
Learn(TInput, Double) | Learns a model that can map the given inputs to the desired outputs.
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
OnAddingOptions | Called when a new column options definition is being added. Can be used to validate or modify these options beforehand. (Inherited from BaseFilter<TOptions, TFilter>.)
ProcessFilter | Processes the current filter. (Overrides BaseFilter<TOptions, TFilter>.ProcessFilter(DataTable).)
ToString | Returns a string that represents the current object. (Inherited from Object.)
Transform(Object) | Translates an array of values into their codeword representation, assuming values are given in the original order of columns.
Transform(Object) | Translates an array of values into their codeword representation, assuming values are given in the original order of columns.
Transform(TInput) | Translates an array of values into their codeword representation, assuming values are given in the original order of columns.
Transform(TInput) | Translates a value of the given variables into their codeword representation.
Transform(DataRow, String) | Translates an array of values into their codeword representation, assuming values are given in the original order of columns.
Transform(String, TInput) | Translates a value of a given variable into its codeword representation.
Transform(String, TInput) | Translates a value of the given variables into their codeword representation.
Transform(String, TInput) | Translates a value of the given variables into their codeword representation.
Transform(String, TInput) | Translates a value of the given variables into their codeword representation.
Transform(TInput, TOutput) | Applies the transformation to a set of input vectors, producing an associated set of output vectors.


Extension Methods

Name | Description
---|---
HasMethod | Checks whether an object implements a method with the given name. (Defined by ExtensionMethods.)
IsEqual | Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by Matrix.)
To(Type) | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
To<T> | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)

The discretization filter can be used to convert any range of values into another representation. For example, let's say we have a dataset where a column represents percentages using floating point numbers, but we would like to discretize those numbers into more descriptive labels:

    // Let's say we have some data representing the probability
    // of seeing some particular species of birds across the U.S.:
    string[] names = { "State", "Bird", "Percentage" };

    object[][] data =
    {
        new object[] { "Kansas", "Crow", 0.1 },
        new object[] { "Ohio", "Pardal", 0.5 },
        new object[] { "Hawaii", "Penguim", 0.7 }
    };

    // Create a new discretization filter for the given dataset:
    var discretization = new Discretization<double, string>(names)
    {
        { "Percentage", x => x >= 0.00 && x < 0.25, "lowest" },
        { "Percentage", x => x >= 0.25 && x < 0.50, "low" },
        { "Percentage", x => x >= 0.50 && x < 0.75, "medium" },
        { "Percentage", x => x >= 0.75 && x < 1.00, "likely" },
    };

    // Convert the data using the above conversion rules:
    object[][] output = discretization.Transform(data);

    // The output should be:
    object[][] expected =
    {
        new object[] { "Kansas", "Crow", "lowest" },
        new object[] { "Ohio", "Pardal", "medium" },
        new object[] { "Hawaii", "Penguim", "medium" }
    };

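The rules can also be applied to a single value rather than a whole dataset. The following is a minimal sketch, assuming the `Transform(String, TInput)` overload listed in the methods table returns the `TOutput` label of the matching rule; the `DiscretizationSketch` class and its `LabelFor` helper are hypothetical names introduced only for illustration and are not part of Accord.NET. With the rules below, 0.3 falls in the range from 0.25 (inclusive) to 0.50 (exclusive), so the label should be "low".

```csharp
using System;
using Accord.Statistics.Filters;

public static class DiscretizationSketch
{
    // Hypothetical helper (not part of Accord.NET): builds a filter with
    // the same rules as the example above and labels a single value.
    public static string LabelFor(double percentage)
    {
        var discretization = new Discretization<double, string>()
        {
            { "Percentage", x => x >= 0.00 && x < 0.25, "lowest" },
            { "Percentage", x => x >= 0.25 && x < 0.50, "low" },
            { "Percentage", x => x >= 0.50 && x < 0.75, "medium" },
            { "Percentage", x => x >= 0.75 && x < 1.00, "likely" },
        };

        // Assumption: this resolves to Transform(String, TInput), which
        // translates one value of the named column into its label.
        return discretization.Transform("Percentage", percentage);
    }

    public static void Main()
    {
        Console.WriteLine(LabelFor(0.3));
    }
}
```

The sketch only passes values covered by a rule; what the filter does with a value that matches no rule is not described on this page, so it is worth checking before relying on it.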
The discretization filter can also process DataTable objects, in the same way as the Codification filter. The two filters can be combined to prepare datasets for classification, as shown in the example below:

    // In this example, we will be using a modified version of the famous Play Tennis
    // example by Tom Mitchell (1998), where some values have been replaced by missing
    // values. We will use NaN double values to represent values missing from the data.

    // Note: this example uses DataTables to represent the input data,
    // but this is not required. The same could be performed using plain
    // double[][] matrices and vectors instead.
    DataTable data = new DataTable("Tennis Example with Missing Values");

    data.Columns.Add("Day", typeof(string));
    data.Columns.Add("Outlook", typeof(string));
    data.Columns.Add("Temperature", typeof(int));
    data.Columns.Add("Humidity", typeof(string));
    data.Columns.Add("Wind", typeof(string));
    data.Columns.Add("PlayTennis", typeof(string));

    data.Rows.Add("D1", "Sunny", 35, "High", "Weak", "No");
    data.Rows.Add("D2", null, 32, "High", "Strong", "No");
    data.Rows.Add("D3", null, null, "High", null, "Yes");
    data.Rows.Add("D4", "Rain", 25, "High", "Weak", "Yes");
    data.Rows.Add("D5", "Rain", 16, null, "Weak", "Yes");
    data.Rows.Add("D6", "Rain", 12, "Normal", "Strong", "No");
    data.Rows.Add("D7", "Overcast", 18, "Normal", "Strong", "Yes");
    data.Rows.Add("D8", null, 27, "High", null, "No");
    data.Rows.Add("D9", null, 17, "Normal", "Weak", "Yes");
    data.Rows.Add("D10", null, null, "Normal", null, "Yes");
    data.Rows.Add("D11", null, 23, "Normal", null, "Yes");
    data.Rows.Add("D12", "Overcast", 25, null, "Strong", "Yes");
    data.Rows.Add("D13", "Overcast", 33, null, "Weak", "Yes");
    data.Rows.Add("D14", "Rain", 24, "High", "Strong", "No");

    string[] inputNames = new[] { "Outlook", "Temperature", "Humidity", "Wind" };

    // Create a new discretization codebook to convert
    // the numbers above into discrete, string labels:
    var discretization = new Discretization<double, string>()
    {
        { "Temperature", x => x >= 30 && x < 50, "Hot" },
        { "Temperature", x => x >= 20 && x < 30, "Mild" },
        { "Temperature", x => x >= 00 && x < 20, "Cool" },
    };

    // Use the discretization to convert all the data
    DataTable discrete = discretization.Apply(data);

    // Create a new codification codebook to convert
    // the strings above into numeric, integer labels:
    var codebook = new Codification()
    {
        DefaultMissingValueReplacement = Double.NaN
    };

    // Use the codebook to convert all the data
    DataTable symbols = codebook.Apply(discrete);

    // Grab the training input and output instances:
    double[][] inputs = symbols.ToJagged(inputNames);
    int[] outputs = symbols.ToArray<int>("PlayTennis");

    // Create a new learning algorithm
    var teacher = new C45Learning()
    {
        Attributes = DecisionVariable.FromCodebook(codebook, inputNames)
    };

    // Use the learning algorithm to induce a new tree:
    DecisionTree tree = teacher.Learn(inputs, outputs);

    // To get the estimated class labels, we can use
    int[] predicted = tree.Decide(inputs);

    // The classification error (~0.214) can be computed as
    double error = new ZeroOneLoss(outputs).Loss(predicted);

    // Moreover, we may decide to convert our tree to a set of rules:
    DecisionSet rules = tree.ToRules();

    // And using the codebook, we can inspect the tree reasoning:
    string ruleText = rules.ToString(codebook, "PlayTennis",
        System.Globalization.CultureInfo.InvariantCulture);

    // The output should be:
    string expected = @"No =: (Outlook == Sunny)
    No =: (Outlook == Rain) && (Wind == Strong)
    Yes =: (Outlook == Overcast)
    Yes =: (Outlook == Rain) && (Wind == Weak)
    ";