Discretization<TInput, TOutput> Class

Value discretization preprocessing filter.

Inheritance Hierarchy

System.Object
  Accord.Statistics.Filters.BaseFilter<Discretization<TInput, TOutput>.Options, Discretization<TInput, TOutput>>
    Accord.Statistics.Filters.Discretization<TInput, TOutput>

Namespace: Accord.Statistics.Filters
Assembly: Accord.Statistics (in Accord.Statistics.dll) Version: 3.8.0

Syntax

[SerializableAttribute]
public class Discretization<TInput, TOutput> : BaseFilter<Discretization<TInput, TOutput>.Options, Discretization<TInput, TOutput>>,
	IAutoConfigurableFilter, IFilter, ITransform<TInput[], TOutput[]>, ICovariantTransform<TInput[], TOutput[]>, 
	ITransform, IUnsupervisedLearning<Discretization<TInput, TOutput>, TInput[], TOutput[]>

<SerializableAttribute>
Public Class Discretization(Of TInput, TOutput)
	Inherits BaseFilter(Of Discretization(Of TInput, TOutput).Options, Discretization(Of TInput, TOutput))
	Implements IAutoConfigurableFilter, IFilter, ITransform(Of TInput(), TOutput()), 
	ICovariantTransform(Of TInput(), TOutput()), ITransform, IUnsupervisedLearning(Of Discretization(Of TInput, TOutput), TInput(), TOutput())

Type Parameters

TInput
TOutput

The Discretization<TInput, TOutput> type exposes the following members.

Constructors

	Name	Description
	Discretization<TInput, TOutput>()	Creates a new Discretization Filter.
	Discretization<TInput, TOutput>(DataTable)	Creates a new Discretization Filter.
	Discretization<TInput, TOutput>(String)	Creates a new Discretization Filter.
	Discretization<TInput, TOutput>(DataTable, String)	Creates a new Discretization Filter.
	Discretization<TInput, TOutput>(String, Object)	Creates a new Discretization Filter.
	Discretization<TInput, TOutput>(String, TInput)	Creates a new Discretization Filter.

Properties

	Name	Description
	Active	Gets or sets whether this filter is active. An inactive filter passes the input table through as output, unchanged. (Inherited from BaseFilter<TOptions, TFilter>.)
	Columns	Gets the collection of filter options. (Inherited from BaseFilter<TOptions, TFilter>.)
	Item[Int32]	Gets options associated with a given variable (data column). (Inherited from BaseFilter<TOptions, TFilter>.)
	Item[String]	Gets options associated with a given variable (data column). (Inherited from BaseFilter<TOptions, TFilter>.)
	NumberOfInputs	Gets the number of inputs accepted by the model. (Inherited from BaseFilter<TOptions, TFilter>.)
	NumberOfOutputs	Gets the number of outputs generated by the model.
	Token	Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. (Inherited from BaseFilter<TOptions, TFilter>.)

Methods

	Name	Description
	Add(TOptions)	Adds a new column options definition to the collection. (Inherited from BaseFilter<TOptions, TFilter>.)
	Add(String, Expression<Func<TInput, Boolean>>, Expression<Func<TInput, TOutput>>)	Adds the specified matching rule to a column.
	Add(String, Expression<Func<TInput, Boolean>>, TOutput)	Adds the specified matching rule to a column.
	Apply(DataTable)	Applies the Filter to a DataTable. (Inherited from BaseFilter<TOptions, TFilter>.)
	Apply(DataTable, String)	Applies the Filter to a DataTable. (Inherited from BaseFilter<TOptions, TFilter>.)
	Detect	Auto detects the filter options by analyzing a given DataTable.
	Equals	Determines whether the specified object is equal to the current object. (Inherited from Object.)
	Finalize	Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
	GetEnumerator	Returns an enumerator that iterates through the collection. (Inherited from BaseFilter<TOptions, TFilter>.)
	GetHashCode	Serves as the default hash function. (Inherited from Object.)
	GetType	Gets the Type of the current instance. (Inherited from Object.)
	Learn(DataTable, Double)	Learns a model that can map the given inputs to the desired outputs.
	Learn(TInput, Double)	Learns a model that can map the given inputs to the desired outputs.
	Learn(TInput, Double)	Learns a model that can map the given inputs to the desired outputs.
	MemberwiseClone	Creates a shallow copy of the current Object. (Inherited from Object.)
	OnAddingOptions	Called when a new column options definition is being added. Can be used to validate or modify these options beforehand. (Inherited from BaseFilter<TOptions, TFilter>.)
	ProcessFilter	Processes the current filter. (Overrides BaseFilter<TOptions, TFilter>.ProcessFilter(DataTable).)
	ToString	Returns a string that represents the current object. (Inherited from Object.)
	Transform(Object)	Translates an array of values into their codeword representation, assuming values are given in original order of columns.
	Transform(Object)	Translates an array of values into their codeword representation, assuming values are given in original order of columns.
	Transform(TInput)	Translates an array of values into their codeword representation, assuming values are given in original order of columns.
	Transform(TInput)	Translates a value of the given variables into their codeword representation.
	Transform(DataRow, String)	Translates an array of values into their codeword representation, assuming values are given in original order of columns.
	Transform(String, TInput)	Translates a value of a given variable into its codeword representation.
	Transform(String, TInput)	Translates a value of the given variables into their codeword representation.
	Transform(String, TInput)	Translates a value of the given variables into their codeword representation.
	Transform(String, TInput)	Translates a value of the given variables into their codeword representation.
	Transform(TInput, TOutput)	Applies the transformation to a set of input vectors, producing an associated set of output vectors.

Extension Methods

	Name	Description
	HasMethod	Checks whether an object implements a method with the given name. (Defined by ExtensionMethods.)
	IsEqual	Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by Matrix.)
	To(Type)	Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
	To<T>	Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)

Remarks

This filter converts ranges of values into a different representation according to a set of rules. The examples below show how it can be used in practice.
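
To make the idea of rule-based discretization concrete, here is a minimal, library-independent sketch. The names RuleDiscretizer, Discretize and the fallback parameter are hypothetical and introduced only for illustration. The sketch assumes that the first matching rule wins, and it does not claim to mirror how the Discretization class is implemented internally.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Accord.NET.
public static class RuleDiscretizer
{
    // Returns the label of the first rule whose predicate matches the value,
    // or the supplied fallback when no rule matches (an assumption of this sketch).
    public static TOutput Discretize<TInput, TOutput>(
        TInput value,
        IReadOnlyList<(Func<TInput, bool> Matches, TOutput Label)> rules,
        TOutput fallback)
    {
        foreach (var rule in rules)
        {
            if (rule.Matches(value))
                return rule.Label;
        }

        return fallback;
    }
}

public static class Program
{
    public static void Main()
    {
        // Half-open ranges [lower, upper) so that no value matches two rules.
        var rules = new List<(Func<double, bool> Matches, string Label)>
        {
            (x => x >= 0.00 && x < 0.25, "lowest"),
            (x => x >= 0.25 && x < 0.50, "low"),
            (x => x >= 0.50 && x < 0.75, "medium"),
            (x => x >= 0.75 && x < 1.00, "likely"),
        };

        foreach (double p in new[] { 0.1, 0.5, 0.7, 1.0 })
        {
            string label = RuleDiscretizer.Discretize<double, string>(p, rules, "unmatched");
            Console.WriteLine($"{p} -> {label}");
        }
    }
}
```

With these rules, 0.1 maps to "lowest", 0.5 and 0.7 both map to "medium", and 1.0 matches no range and receives the fallback label. Using half-open ranges keeps the rules mutually exclusive, so the order in which they are listed does not change the result.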

Examples

The discretization filter can be used to convert any range of values into another representation. For example, let's say we have a dataset where a column represents percentages using floating point numbers, but we would like to discretize those numbers into more descriptive labels:

using Accord.Statistics.Filters;

// Let's say we have some data representing the probability
// of seeing some particular species of birds across the U.S.:
string[] names = { "State", "Bird", "Percentage" };

object[][] data =
{
    new object[] { "Kansas", "Crow", 0.1 },
    new object[] { "Ohio", "Pardal", 0.5 },
    new object[] { "Hawaii", "Penguim", 0.7 }
};

// Create a new discretization filter for the given dataset:
var discretization = new Discretization<double, string>(names)
{
    { "Percentage", x => x >= 0.00 && x < 0.25, "lowest" },
    { "Percentage", x => x >= 0.25 && x < 0.50, "low" },
    { "Percentage", x => x >= 0.50 && x < 0.75, "medium" },
    { "Percentage", x => x >= 0.75 && x < 1.00, "likely" },
};

// Convert the data using the above conversion rules:
object[][] output = discretization.Transform(data);

// The output should be:
object[][] expected =
{
    new object[] { "Kansas", "Crow", "lowest" },
    new object[] { "Ohio", "Pardal", "medium" },
    new object[] { "Hawaii", "Penguim", "medium" }
};
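
The collection-initializer syntax used above is compiled by C# into calls to the filter's Add method, so rules can also be registered one at a time. The sketch below does that with the two rule overloads listed in the Methods table and then queries a single value through Transform(String, TInput). The class name DiscretizationAddExample and the "certain" label are made up for illustration, and the single-value overload is assumed to return the matched label directly.

```csharp
using System;
using Accord.Statistics.Filters;

// Hypothetical example class; requires a reference to Accord.Statistics.
public static class DiscretizationAddExample
{
    public static void Main()
    {
        var discretization = new Discretization<double, string>();

        // Rules with a constant output:
        // Add(String, Expression<Func<TInput, Boolean>>, TOutput)
        discretization.Add("Percentage", x => x >= 0.00 && x < 0.25, "lowest");
        discretization.Add("Percentage", x => x >= 0.25 && x < 0.50, "low");
        discretization.Add("Percentage", x => x >= 0.50 && x < 0.75, "medium");
        discretization.Add("Percentage", x => x >= 0.75 && x < 1.00, "likely");

        // Rule whose output is computed by an expression:
        // Add(String, Expression<Func<TInput, Boolean>>, Expression<Func<TInput, TOutput>>)
        discretization.Add("Percentage", x => x >= 1.00, x => "certain");

        // Translate a single value of the "Percentage" column
        // (assumption: this overload returns the label as TOutput).
        var label = discretization.Transform("Percentage", 0.1);
        Console.WriteLine(label);
    }
}
```

Registering rules through explicit Add calls can be convenient when the ranges are not known at compile time, for example when they are read from configuration.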

The discretization filter can also process a DataTable, just like the Codification filter. The two filters can be combined to prepare datasets for classification, as shown in the example below:

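            // Namespaces used by this example:
            using System.Data;
            using Accord.MachineLearning.DecisionTrees;
            using Accord.MachineLearning.DecisionTrees.Learning;
            using Accord.MachineLearning.DecisionTrees.Rules;
            using Accord.Math;
            using Accord.Math.Optimization.Losses;
            using Accord.Statistics.Filters;

            // In this example, we will be using a modified version of the famous Play Tennis 
            // example by Tom Mitchell (1998), where some values have been replaced by missing 
            // values. We will use NaN double values to represent values missing from the data.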

            // Note: this example uses DataTables to represent the input data, 
            // but this is not required. The same could be performed using plain
            // double[][] matrices and vectors instead.
            DataTable data = new DataTable("Tennis Example with Missing Values");

            data.Columns.Add("Day", typeof(string));
            data.Columns.Add("Outlook", typeof(string));
            data.Columns.Add("Temperature", typeof(int));
            data.Columns.Add("Humidity", typeof(string));
            data.Columns.Add("Wind", typeof(string));
            data.Columns.Add("PlayTennis", typeof(string));

            data.Rows.Add("D1", "Sunny", 35, "High", "Weak", "No");
            data.Rows.Add("D2", null, 32, "High", "Strong", "No");
            data.Rows.Add("D3", null, null, "High", null, "Yes");
            data.Rows.Add("D4", "Rain", 25, "High", "Weak", "Yes");
            data.Rows.Add("D5", "Rain", 16, null, "Weak", "Yes");
            data.Rows.Add("D6", "Rain", 12, "Normal", "Strong", "No");
            data.Rows.Add("D7", "Overcast", 18, "Normal", "Strong", "Yes");
            data.Rows.Add("D8", null, 27, "High", null, "No");
            data.Rows.Add("D9", null, 17, "Normal", "Weak", "Yes");
            data.Rows.Add("D10", null, null, "Normal", null, "Yes");
            data.Rows.Add("D11", null, 23, "Normal", null, "Yes");
            data.Rows.Add("D12", "Overcast", 25, null, "Strong", "Yes");
            data.Rows.Add("D13", "Overcast", 33, null, "Weak", "Yes");
            data.Rows.Add("D14", "Rain", 24, "High", "Strong", "No");

            string[] inputNames = new[] { "Outlook", "Temperature", "Humidity", "Wind" };

            // Create a new discretization codebook to convert 
            // the numbers above into discrete, string labels:
            var discretization = new Discretization<double, string>()
            {
                { "Temperature", x => x >= 30 && x < 50, "Hot" },
                { "Temperature", x => x >= 20 && x < 30, "Mild" },
                { "Temperature", x => x >= 00 && x < 20, "Cool" },
            };

            // Use the discretization to convert all the data
            DataTable discrete = discretization.Apply(data);

            // Create a new codification codebook to convert 
            // the strings above into numeric, integer labels:
            var codebook = new Codification()
            {
                DefaultMissingValueReplacement = Double.NaN
            };

            // Use the codebook to convert all the data
            DataTable symbols = codebook.Apply(discrete);

            // Grab the training input and output instances:
            double[][] inputs = symbols.ToJagged(inputNames);
            int[] outputs = symbols.ToArray<int>("PlayTennis");

            // Create a new learning algorithm
            var teacher = new C45Learning()
            {
                Attributes = DecisionVariable.FromCodebook(codebook, inputNames)
            };

            // Use the learning algorithm to induce a new tree:
            DecisionTree tree = teacher.Learn(inputs, outputs);

            // To get the estimated class labels, we can use
            int[] predicted = tree.Decide(inputs);

            // The classification error (~0.214) can be computed as 
            double error = new ZeroOneLoss(outputs).Loss(predicted);

            // Moreover, we may decide to convert our tree to a set of rules:
            DecisionSet rules = tree.ToRules();

            // And using the codebook, we can inspect the tree reasoning:
            string ruleText = rules.ToString(codebook, "PlayTennis",
                System.Globalization.CultureInfo.InvariantCulture);

            // The output should be:
            string expected = @"No =: (Outlook == Sunny)
No =: (Outlook == Rain) && (Wind == Strong)
Yes =: (Outlook == Overcast)
Yes =: (Outlook == Rain) && (Wind == Weak)
";

Reference

Accord.Statistics.Filters Namespace

Accord.Statistics.Filters.Codification
