
Discretization<TInput, TOutput> Class

Value discretization preprocessing filter.
Inheritance Hierarchy
  Accord.Statistics.Filters.BaseFilter<Discretization<TInput, TOutput>.Options, Discretization<TInput, TOutput>>
    Accord.Statistics.Filters.Discretization<TInput, TOutput>

Namespace:  Accord.Statistics.Filters
Assembly:  Accord.Statistics (in Accord.Statistics.dll) Version: 3.8.0
public class Discretization<TInput, TOutput> : BaseFilter<Discretization<TInput, TOutput>.Options, Discretization<TInput, TOutput>>, 
	IAutoConfigurableFilter, IFilter, ITransform<TInput[], TOutput[]>, ICovariantTransform<TInput[], TOutput[]>, 
	ITransform, IUnsupervisedLearning<Discretization<TInput, TOutput>, TInput[], TOutput[]>

Type Parameters
TInput
TOutput

The Discretization<TInput, TOutput> type exposes the following members.

Constructors
Public method Discretization<TInput, TOutput>()
Creates a new Discretization Filter.
Public method Discretization<TInput, TOutput>(DataTable)
Creates a new Discretization Filter.
Public method Discretization<TInput, TOutput>(String)
Creates a new Discretization Filter.
Public method Discretization<TInput, TOutput>(DataTable, String)
Creates a new Discretization Filter.
Public method Discretization<TInput, TOutput>(String, Object)
Creates a new Discretization Filter.
Public method Discretization<TInput, TOutput>(String, TInput)
Creates a new Discretization Filter.
Properties
Public property Active
Gets or sets whether this filter is active. An inactive filter passes the input table through unchanged.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property Columns
Gets the collection of filter options.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property Item[Int32]
Gets options associated with a given variable (data column).
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property Item[String]
Gets options associated with a given variable (data column).
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property NumberOfInputs
Gets the number of inputs accepted by the model.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property NumberOfOutputs
Gets the number of outputs generated by the model.
Public property Token
Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running.
(Inherited from BaseFilter<TOptions, TFilter>.)
Methods
Public method Add(TOptions)
Add a new column options definition to the collection.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public method Add(String, Expression<Func<TInput, Boolean>>, Expression<Func<TInput, TOutput>>)
Adds the specified matching rule to a column.
Public method Add(String, Expression<Func<TInput, Boolean>>, TOutput)
Adds the specified matching rule to a column.
Public method Apply(DataTable)
Applies the Filter to a DataTable.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public method Apply(DataTable, String)
Applies the Filter to a DataTable.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public method Detect
Auto detects the filter options by analyzing a given DataTable.
Public method Equals
Determines whether the specified object is equal to the current object.
(Inherited from Object.)
Protected method Finalize
Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection.
(Inherited from Object.)
Public method GetEnumerator
Returns an enumerator that iterates through the collection.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public method GetHashCode
Serves as the default hash function.
(Inherited from Object.)
Public method GetType
Gets the Type of the current instance.
(Inherited from Object.)
Public method Learn(DataTable, Double)
Learns a model that can map the given inputs to the desired outputs.
Public method Learn(TInput, Double)
Learns a model that can map the given inputs to the desired outputs.
Public method Learn(TInput, Double)
Learns a model that can map the given inputs to the desired outputs.
Protected method MemberwiseClone
Creates a shallow copy of the current Object.
(Inherited from Object.)
Protected method OnAddingOptions
Called when a new column options definition is being added. Can be used to validate or modify these options beforehand.
(Inherited from BaseFilter<TOptions, TFilter>.)
Protected method ProcessFilter
Processes the current filter.
(Overrides BaseFilter<TOptions, TFilter>.ProcessFilter(DataTable).)
Public method ToString
Returns a string that represents the current object.
(Inherited from Object.)
Public method Transform(Object)
Translates an array of values into their codeword representation, assuming values are given in original order of columns.
Public method Transform(Object)
Translates an array of values into their codeword representation, assuming values are given in original order of columns.
Public method Transform(TInput)
Translates an array of values into their codeword representation, assuming values are given in original order of columns.
Public method Transform(TInput)
Translates a value of the given variables into their codeword representation.
Public method Transform(DataRow, String)
Translates an array of values into their codeword representation, assuming values are given in original order of columns.
Public method Transform(String, TInput)
Translates a value of a given variable into its codeword representation.
Public method Transform(String, TInput)
Translates a value of the given variables into their codeword representation.
Public method Transform(String, TInput)
Translates a value of the given variables into their codeword representation.
Public method Transform(String, TInput)
Translates a value of the given variables into their codeword representation.
Public method Transform(TInput, TOutput)
Applies the transformation to a set of input vectors, producing an associated set of output vectors.
Extension Methods
Public Extension Method HasMethod
Checks whether an object implements a method with the given name.
(Defined by ExtensionMethods.)
Public Extension Method IsEqual
Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices.
(Defined by Matrix.)
Public Extension Method To(Type) Overloaded.
Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime.
(Defined by ExtensionMethods.)
Public Extension Method To<T>() Overloaded.
Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime.
(Defined by ExtensionMethods.)
Remarks
This filter converts ranges of values into a different representation according to a set of rules. The examples below show how it can be used in practice.

Examples
The discretization filter can convert any range of values into another representation. For example, suppose a dataset has a column that stores percentages as floating-point numbers, and we would like to discretize those numbers into more descriptive labels:

using Accord.Statistics.Filters;

// Let's say we have some data representing the probability
// of seeing some particular species of birds across the U.S.:
string[] names = { "State", "Bird", "Percentage" };

object[][] data =
{
    new object[] { "Kansas", "Crow", 0.1 },
    new object[] { "Ohio", "Pardal", 0.5 },
    new object[] { "Hawaii", "Penguin", 0.7 }
};

// Create a new discretization filter for the given dataset.
// Each entry is a rule: column name, matching condition, output label.
var discretization = new Discretization<double, string>(names)
{
    { "Percentage", x => x >= 0.00 && x < 0.25, "lowest" },
    { "Percentage", x => x >= 0.25 && x < 0.50, "low" },
    { "Percentage", x => x >= 0.50 && x < 0.75, "medium" },
    { "Percentage", x => x >= 0.75 && x < 1.00, "likely" },
};

// Convert the data using the above conversion rules:
object[][] output = discretization.Transform(data);

// The output should be:
object[][] expected =
{
    new object[] { "Kansas", "Crow", "lowest" },
    new object[] { "Ohio", "Pardal", "medium" },
    new object[] { "Hawaii", "Penguin", "medium" }
};
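
To make the rule semantics concrete, here is a minimal sketch in plain C#, independent of Accord.NET, of applying an ordered list of (condition, label) rules to a column of values. The `Rule` and `RuleSketch` types, the `Discretize` helper, and the choice to return the first matching label (or `null` when nothing matches) are assumptions made for illustration; they are not the library's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical rule type: a matching condition paired with an output label.
public sealed class Rule
{
    public Func<double, bool> Match { get; }
    public string Label { get; }

    public Rule(Func<double, bool> match, string label)
    {
        Match = match;
        Label = label;
    }
}

public static class RuleSketch
{
    // The same four ranges used for the "Percentage" column above.
    public static readonly List<Rule> PercentageRules = new List<Rule>
    {
        new Rule(x => x >= 0.00 && x < 0.25, "lowest"),
        new Rule(x => x >= 0.25 && x < 0.50, "low"),
        new Rule(x => x >= 0.50 && x < 0.75, "medium"),
        new Rule(x => x >= 0.75 && x < 1.00, "likely"),
    };

    // Returns the label of the first rule whose condition matches the value,
    // or the fallback (null by default) when no rule matches. This policy
    // is an assumption of the sketch.
    public static string Discretize(double value, IEnumerable<Rule> rules, string fallback = null)
    {
        foreach (Rule rule in rules)
        {
            if (rule.Match(value))
                return rule.Label;
        }
        return fallback;
    }

    public static void Main()
    {
        double[] percentages = { 0.1, 0.5, 0.7 };

        string[] labels = percentages
            .Select(p => Discretize(p, PercentageRules))
            .ToArray();

        Console.WriteLine(string.Join(", ", labels)); // lowest, medium, medium
    }
}
```

Because the ranges are half-open, a boundary value such as 0.5 falls into exactly one range ("medium"), which is why Ohio and Hawaii receive the same label.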

Like the Codification filter, the discretization filter can also process DataTable objects. The two filters can be combined to prepare datasets for classification, as shown in the example below:

// Namespaces used by this example:
using System.Data;
using System.Globalization;
using Accord.MachineLearning.DecisionTrees;
using Accord.MachineLearning.DecisionTrees.Learning;
using Accord.MachineLearning.DecisionTrees.Rules;
using Accord.Math;
using Accord.Math.Optimization.Losses;
using Accord.Statistics.Filters;

// In this example, we will be using a modified version of the famous Play Tennis 
// example by Tom Mitchell (1998), where some values have been replaced by missing 
// values. We will use NaN double values to represent values missing from the data.

// Note: this example uses DataTables to represent the input data, 
// but this is not required. The same could be performed using plain
// double[][] matrices and vectors instead.
DataTable data = new DataTable("Tennis Example with Missing Values");

data.Columns.Add("Day", typeof(string));
data.Columns.Add("Outlook", typeof(string));
data.Columns.Add("Temperature", typeof(int));
data.Columns.Add("Humidity", typeof(string));
data.Columns.Add("Wind", typeof(string));
data.Columns.Add("PlayTennis", typeof(string));

data.Rows.Add("D1", "Sunny", 35, "High", "Weak", "No");
data.Rows.Add("D2", null, 32, "High", "Strong", "No");
data.Rows.Add("D3", null, null, "High", null, "Yes");
data.Rows.Add("D4", "Rain", 25, "High", "Weak", "Yes");
data.Rows.Add("D5", "Rain", 16, null, "Weak", "Yes");
data.Rows.Add("D6", "Rain", 12, "Normal", "Strong", "No");
data.Rows.Add("D7", "Overcast", 18, "Normal", "Strong", "Yes");
data.Rows.Add("D8", null, 27, "High", null, "No");
data.Rows.Add("D9", null, 17, "Normal", "Weak", "Yes");
data.Rows.Add("D10", null, null, "Normal", null, "Yes");
data.Rows.Add("D11", null, 23, "Normal", null, "Yes");
data.Rows.Add("D12", "Overcast", 25, null, "Strong", "Yes");
data.Rows.Add("D13", "Overcast", 33, null, "Weak", "Yes");
data.Rows.Add("D14", "Rain", 24, "High", "Strong", "No");

string[] inputNames = new[] { "Outlook", "Temperature", "Humidity", "Wind" };

// Create a new discretization codebook to convert 
// the numbers above into discrete, string labels:
var discretization = new Discretization<double, string>()
{
    { "Temperature", x => x >= 30 && x < 50, "Hot" },
    { "Temperature", x => x >= 20 && x < 30, "Mild" },
    { "Temperature", x => x >= 00 && x < 20, "Cool" },
};

// Use the discretization to convert all the data
DataTable discrete = discretization.Apply(data);

// Create a new codification codebook to convert 
// the strings above into numeric, integer labels:
var codebook = new Codification()
{
    DefaultMissingValueReplacement = Double.NaN
};

// Use the codebook to convert all the data
DataTable symbols = codebook.Apply(discrete);

// Grab the training input and output instances:
double[][] inputs = symbols.ToJagged(inputNames);
int[] outputs = symbols.ToArray<int>("PlayTennis");

// Create a new learning algorithm
var teacher = new C45Learning()
{
    Attributes = DecisionVariable.FromCodebook(codebook, inputNames)
};

// Use the learning algorithm to induce a new tree:
DecisionTree tree = teacher.Learn(inputs, outputs);

// To get the estimated class labels, we can use
int[] predicted = tree.Decide(inputs);

// The classification error (~0.214) can be computed as 
double error = new ZeroOneLoss(outputs).Loss(predicted);

// Moreover, we may decide to convert our tree to a set of rules:
DecisionSet rules = tree.ToRules();

// And using the codebook, we can inspect the tree reasoning:
string ruleText = rules.ToString(codebook, "PlayTennis",
    CultureInfo.InvariantCulture);

// The output should be:
string expected = @"No =: (Outlook == Sunny)
No =: (Outlook == Rain) && (Wind == Strong)
Yes =: (Outlook == Overcast)
Yes =: (Outlook == Rain) && (Wind == Weak)
";
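
The two preprocessing steps in the example above (numeric ranges to string labels, then string labels to integer symbols) can be sketched in plain C# without Accord.NET. This is only an illustration under stated assumptions: `PipelineSketch`, `TemperatureLabel`, and `Codify` are hypothetical names, and assigning integer symbols in order of first appearance is an assumption of this sketch rather than a statement about how Codification numbers its symbols.

```csharp
using System;
using System.Collections.Generic;

public static class PipelineSketch
{
    // Hypothetical stand-in for the three "Temperature" rules above.
    public static string TemperatureLabel(double t)
    {
        if (t >= 30 && t < 50) return "Hot";
        if (t >= 20 && t < 30) return "Mild";
        if (t >= 0 && t < 20) return "Cool";
        return null; // no rule matched
    }

    // Hypothetical codebook step: each distinct label receives the next
    // unused integer, in order of first appearance (an assumption).
    public static int[] Codify(IEnumerable<string> labels, Dictionary<string, int> codebook)
    {
        var codes = new List<int>();
        foreach (string label in labels)
        {
            if (!codebook.TryGetValue(label, out int code))
            {
                code = codebook.Count;
                codebook[label] = code;
            }
            codes.Add(code);
        }
        return codes.ToArray();
    }

    public static void Main()
    {
        double[] temperatures = { 35, 32, 25, 16, 12 };

        // Step 1: discretize numbers into string labels.
        var labels = new List<string>();
        foreach (double t in temperatures)
            labels.Add(TemperatureLabel(t));

        // Step 2: codify string labels into integer symbols.
        var codebook = new Dictionary<string, int>();
        int[] codes = Codify(labels, codebook);

        Console.WriteLine(string.Join(", ", labels)); // Hot, Hot, Mild, Cool, Cool
        Console.WriteLine(string.Join(", ", codes));  // 0, 0, 1, 2, 2
    }
}
```

Keeping the dictionary around after training matters for the same reason the example passes `codebook` to `rules.ToString`: it is what maps integer symbols back to readable labels.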
See Also