Codification<T> Class

Codification Filter class.
Inheritance Hierarchy
System.Object
  Accord.Statistics.Filters.BaseFilter<Codification<T>.Options, Codification<T>>
    Accord.Statistics.Filters.Codification<T>
      Accord.Statistics.Filters.Codification

Namespace:  Accord.Statistics.Filters
Assembly:  Accord.Statistics (in Accord.Statistics.dll) Version: 3.8.0
Syntax
[SerializableAttribute]
public class Codification<T> : BaseFilter<Codification<T>.Options, Codification<T>>, 
	ITransform<T[], double[]>, ICovariantTransform<T[], double[]>, 
	ITransform, IUnsupervisedLearning<Codification<T>, T[], double[]>, 
	ITransform<T[], int[]>, ICovariantTransform<T[], int[]>, IUnsupervisedLearning<Codification<T>, T[], int[]>

Type Parameters

T
The object type that needs to be codified. Default is string.

The Codification<T> type exposes the following members.

Constructors
  Name    Description
Public method Codification<T>()
Creates a new Codification Filter.
Public method Codification<T>(DataTable)
Creates a new Codification Filter.
Public method Codification<T>(DataTable, String)
Creates a new Codification Filter.
Public method Codification<T>(String, T)
Creates a new Codification Filter.
Public method Codification<T>(String, T)
Creates a new Codification Filter.
Public method Codification<T>(String, T)
Creates a new Codification Filter.
Properties
  Name    Description
Public property Active
Gets or sets whether this filter is active. An inactive filter passes the input table through as output, unchanged.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property Columns
Gets the collection of filter options.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property DefaultMissingValueReplacement
Gets or sets the default value to be used as a replacement for missing values. Default is to use System.DBNull.Value.
Public property Item[Int32]
Gets options associated with a given variable (data column).
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property Item[String]
Gets options associated with a given variable (data column).
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property NumberOfInputs
Gets the number of inputs accepted by the model.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public property NumberOfOutputs
Gets the number of outputs generated by the model.
Public property Token
Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running.
(Inherited from BaseFilter<TOptions, TFilter>.)
Methods
  Name    Description
Public method Add(TOptions)
Adds a new column options definition to the collection.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public method Add(CodificationVariable)
Adds new column options to this filter's collection, specifying how a particular column should be processed by the filter.
Public method Add(String, CodificationVariable)
Adds new column options to this filter's collection, specifying how a particular column should be processed by the filter.
Public method Add(String, CodificationVariable, T)
Adds new column options to this filter's collection, specifying how a particular column should be processed by the filter.
Public method Add(String, CodificationVariable, T)
Adds new column options to this filter's collection, specifying how a particular column should be processed by the filter.
Public method Apply(DataTable)
Applies the Filter to a DataTable.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public method Apply(DataTable, String)
Applies the Filter to a DataTable.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public method Equals
Determines whether the specified object is equal to the current object.
(Inherited from Object.)
Protected method Finalize
Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection.
(Inherited from Object.)
Public method GetEnumerator
Returns an enumerator that iterates through the collection.
(Inherited from BaseFilter<TOptions, TFilter>.)
Public method GetHashCode
Serves as the default hash function.
(Inherited from Object.)
Public method GetType
Gets the Type of the current instance.
(Inherited from Object.)
Public method Learn(DataTable, Double)
Learns a model that can map the given inputs to the desired outputs.
Public method Learn(T, Double)
Learns a model that can map the given inputs to the desired outputs.
Public method Learn(T, Double)
Learns a model that can map the given inputs to the desired outputs.
Protected method MemberwiseClone
Creates a shallow copy of the current Object.
(Inherited from Object.)
Protected method OnAddingOptions
Called when a new column options definition is being added. Can be used to validate or modify these options beforehand.
(Overrides BaseFilter<TOptions, TFilter>.OnAddingOptions(TOptions).)
Protected method ProcessFilter
Processes the current filter.
(Overrides BaseFilter<TOptions, TFilter>.ProcessFilter(DataTable).)
Public method Revert(Int32)
Translates an integer (codeword) representation of the value of a given variable into its original value.
Public method Revert(String, Int32)
Translates an integer (codeword) representation of the value of a given variable into its original value.
Public method Revert(String, Int32)
Translates an integer (codeword) representation of the value of a given variable into its original value.
Public method Revert(String, Int32)
Translates the integer (codeword) representations of the values of the given variables into their original values.
Public method ToDouble
Converts this instance into a transform that can generate double[].
Public method ToString
Returns a string that represents the current object.
(Inherited from Object.)
Public method Transform(T)
Translates an array of values into their integer representation, assuming values are given in original order of columns.
Public method Transform(T)
Translates a value of the given variables into their integer (codeword) representation.
Public method Transform(DataRow, String)
Translates an array of values into their integer representation, assuming values are given in original order of columns.
Public method Transform(DataTable, String)
Translates an array of values into their integer representation, assuming values are given in original order of columns.
Public method Transform(String, T)
Translates a value of a given variable into its integer (codeword) representation.
Public method Transform(String, T)
Translates a value of the given variables into their integer (codeword) representation.
Public method Transform(String, T)
Translates a value of the given variables into their integer (codeword) representation.
Public method Transform(String, T)
Translates a value of the given variables into their integer (codeword) representation.
Public method Transform(T, Double)
Applies the transformation to a set of input vectors, producing an associated set of output vectors.
Public method Transform(T, Int32)
Translates a value of the given variables into their integer (codeword) representation.
Public method Transform(DataRow, String, String)
Translates an array of values into their integer representation, assuming values are given in original order of columns.
Public method Transform(DataTable, String, String)
Translates an array of values into their integer representation, assuming values are given in original order of columns.
Extension Methods
  Name    Description
Public extension method HasMethod
Checks whether an object implements a method with the given name.
(Defined by ExtensionMethods.)
Public extension method IsEqual
Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices.
(Defined by Matrix.)
Public extension method To(Type) (Overloaded.)
Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime.
(Defined by ExtensionMethods.)
Public extension method To<T> (Overloaded.)
Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime.
(Defined by ExtensionMethods.)
Remarks

The codification filter performs an integer codification of classes given in string form. A unique integer identifier is assigned to each of the string classes.
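
As a rough illustration of the idea, and not of how Accord.NET implements it, the sketch below builds such a mapping by hand with a dictionary. The CodebookSketch class and its Learn, Transform and Revert methods are hypothetical names introduced only for this illustration, and the sketch assumes that identifiers are handed out in order of first appearance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; it is NOT part of Accord.NET.
public static class CodebookSketch
{
    // Assigns the identifiers 0, 1, 2, ... to the distinct labels,
    // in order of first appearance (an assumption of this sketch).
    public static Dictionary<T, int> Learn<T>(IEnumerable<T> labels)
    {
        var codes = new Dictionary<T, int>();
        foreach (T label in labels)
        {
            if (!codes.ContainsKey(label))
                codes.Add(label, codes.Count);
        }
        return codes;
    }

    // Replaces each label by its integer identifier.
    public static int[] Transform<T>(Dictionary<T, int> codes, IEnumerable<T> labels)
    {
        return labels.Select(label => codes[label]).ToArray();
    }

    // Replaces each integer identifier by the label it stands for.
    public static T[] Revert<T>(Dictionary<T, int> codes, IEnumerable<int> symbols)
    {
        // Invert the mapping: identifier -> original label
        var inverse = codes.ToDictionary(pair => pair.Value, pair => pair.Key);
        return symbols.Select(symbol => inverse[symbol]).ToArray();
    }
}
```

With this sketch, learning from the labels "spam", "ham", "spam", "eggs" and transforming the same sequence gives 0, 1, 0, 2, and reverting those integers recovers the original strings.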

Examples

Every Learn() method in the framework expects the class labels to be contiguous and zero-indexed: in a classification problem with n classes, all class labels must be integers ranging from 0 to n-1. Not every dataset comes in this format, so the data sometimes has to be pre-processed first. The example below shows how to use the Codification class to perform such pre-processing.

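// Namespaces used by this example
using Accord.MachineLearning.VectorMachines.Learning;
using Accord.Statistics.Filters;
using Accord.Statistics.Kernels;

// Let's say we have the following data to be classified
// into three possible classes. Those are the samples:
// 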
double[][] inputs =
{
    //               input         output
    new double[] { 0, 1, 1, 0 }, //  0 
    new double[] { 0, 1, 0, 0 }, //  0
    new double[] { 0, 0, 1, 0 }, //  0
    new double[] { 0, 1, 1, 0 }, //  0
    new double[] { 0, 1, 0, 0 }, //  0
    new double[] { 1, 0, 0, 0 }, //  1
    new double[] { 1, 0, 0, 0 }, //  1
    new double[] { 1, 0, 0, 1 }, //  1
    new double[] { 0, 0, 0, 1 }, //  1
    new double[] { 0, 0, 0, 1 }, //  1
    new double[] { 1, 1, 1, 1 }, //  2
    new double[] { 1, 0, 1, 1 }, //  2
    new double[] { 1, 1, 0, 1 }, //  2
    new double[] { 0, 1, 1, 1 }, //  2
    new double[] { 1, 1, 1, 1 }, //  2
};

// Now, suppose that our class labels are not contiguous. We
// have 3 classes, but they have the class labels 5, 1, and 8
// respectively. In this case, we can use a Codification filter
// to obtain a contiguous zero-indexed labeling before learning
int[] output_labels =
{
    5, 5, 5, 5, 5,
    1, 1, 1, 1, 1,
    8, 8, 8, 8, 8,
};

// Create a codification object to obtain an output mapping
var codebook = new Codification<int>().Learn(output_labels);

// Transform the original labels using the codebook
int[] outputs = codebook.Transform(output_labels);

// Create the multi-class learning algorithm for the machine
var teacher = new MulticlassSupportVectorLearning<Gaussian>()
{
    // Configure the learning algorithm to use SMO to train the
    //  underlying SVMs in each of the binary class subproblems.
    Learner = (param) => new SequentialMinimalOptimization<Gaussian>()
    {
        // Estimate a suitable guess for the Gaussian kernel's parameters.
        // This estimate can serve as a starting point for a grid search.
        UseKernelEstimation = true
    }
};

// The following line is only needed to ensure reproducible results.
// Remove, comment out, or change it to enable full parallelization.
teacher.ParallelOptions.MaxDegreeOfParallelism = 1;

// Learn a machine
var machine = teacher.Learn(inputs, outputs);

// Obtain class predictions for each sample
int[] predicted = machine.Decide(inputs);

// Translate the integers back to the original labels
int[] predicted_labels = codebook.Revert(predicted);

Most classifiers in the framework also expect the input data to be of the same nature, i.e. continuous. The codification filter can also convert discrete, categorical, ordinal and baseline categorical variables into continuous vectors that can be fed to other machine learning algorithms, such as K-Means:

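// Namespaces used by this example
using Accord.MachineLearning;
using Accord.Statistics.Filters;

// Fix the random seed so that the results are reproducible
Accord.Math.Random.Generator.Seed = 0;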

// Declare some mixed discrete and continuous observations
double[][] observations =
{
    //             (categorical) (discrete) (continuous)
    new double[] {       1,          -1,        -2.2      },
    new double[] {       1,          -6,        -5.5      },
    new double[] {       2,           1,         1.1      },
    new double[] {       2,           2,         1.2      },
    new double[] {       2,           2,         2.6      },
    new double[] {       3,           2,         1.4      },
    new double[] {       3,           4,         5.2      },
    new double[] {       1,           6,         5.1      },
    new double[] {       1,           6,         5.9      },
};

// Create a new codification algorithm to convert 
// the mixed variables above into all continuous:
var codification = new Codification<double>()
{
    CodificationVariable.Categorical,
    CodificationVariable.Discrete,
    CodificationVariable.Continuous
};

// Learn the codification from observations
var model = codification.Learn(observations);

// Transform the mixed observations into only continuous:
double[][] newObservations = model.ToDouble().Transform(observations);

// (newObservations will be equivalent to)
double[][] expected =
{
    //               (one hot)    (discrete)    (continuous)
    new double[] {    1, 0, 0,        -1,          -2.2      },
    new double[] {    1, 0, 0,        -6,          -5.5      },
    new double[] {    0, 1, 0,         1,           1.1      },
    new double[] {    0, 1, 0,         2,           1.2      },
    new double[] {    0, 1, 0,         2,           2.6      },
    new double[] {    0, 0, 1,         2,           1.4      },
    new double[] {    0, 0, 1,         4,           5.2      },
    new double[] {    1, 0, 0,         6,           5.1      },
    new double[] {    1, 0, 0,         6,           5.9      },
};

// Create a new K-Means algorithm
KMeans kmeans = new KMeans(k: 3);

// Compute and retrieve the data centroids from the converted observations
var clusters = kmeans.Learn(newObservations);

// Use the centroids to partition all the data
int[] labels = clusters.Decide(newObservations);
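
To make the one-hot columns in the expected matrix above more concrete, here is a minimal hand-written sketch of that expansion for a single categorical column. OneHotSketch and Expand are hypothetical names, not Accord.NET members, and the sketch assumes indicator positions follow the order in which category values first appear; the filter itself may be implemented differently.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for illustration only; it is NOT part of Accord.NET.
public static class OneHotSketch
{
    // Replaces the given column of every row by one indicator value per
    // distinct category (ordered by first appearance, an assumption of
    // this sketch) and copies all other columns unchanged.
    public static double[][] Expand(double[][] rows, int column)
    {
        // Map each distinct category value to an indicator position
        var codes = new Dictionary<double, int>();
        foreach (double[] row in rows)
        {
            if (!codes.ContainsKey(row[column]))
                codes.Add(row[column], codes.Count);
        }

        return rows.Select(row =>
        {
            var result = new List<double>();
            for (int j = 0; j < row.Length; j++)
            {
                if (j == column)
                {
                    // A single 1 at the category's position, 0 elsewhere
                    var indicators = new double[codes.Count];
                    indicators[codes[row[j]]] = 1;
                    result.AddRange(indicators);
                }
                else
                {
                    result.Add(row[j]);
                }
            }
            return result.ToArray();
        }).ToArray();
    }
}
```

Calling OneHotSketch.Expand(observations, 0) on the nine observations above would produce the same rows as the expected matrix, since the categories 1, 2 and 3 first appear in that order.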

For more examples, please see the documentation page for the non-generic Codification filter.

See Also