Codification<T> Class
Namespace: Accord.Statistics.Filters
[SerializableAttribute] public class Codification<T> : BaseFilter<Codification<T>.Options, Codification<T>>, ITransform<T[], double[]>, ICovariantTransform<T[], double[]>, ITransform, IUnsupervisedLearning<Codification<T>, T[], double[]>, ITransform<T[], int[]>, ICovariantTransform<T[], int[]>, IUnsupervisedLearning<Codification<T>, T[], int[]>
The Codification<T> type exposes the following members.

Constructors

Name | Description
---|---
Codification<T>() | Creates a new Codification Filter.
Codification<T>(DataTable) | Creates a new Codification Filter.
Codification<T>(DataTable, String) | Creates a new Codification Filter.
Codification<T>(String, T) | Creates a new Codification Filter.
Codification<T>(String, T) | Creates a new Codification Filter.
Codification<T>(String, T) | Creates a new Codification Filter.

Properties

Name | Description
---|---
Active | Gets or sets whether this filter is active. An inactive filter passes the input table through as output, unchanged. (Inherited from BaseFilter<TOptions, TFilter>.)
Columns | Gets the collection of filter options. (Inherited from BaseFilter<TOptions, TFilter>.)
DefaultMissingValueReplacement | Gets or sets the default value to be used as a replacement for missing values. Default is to use System.DBNull.Value.
Item[Int32] | Gets options associated with a given variable (data column). (Inherited from BaseFilter<TOptions, TFilter>.)
Item[String] | Gets options associated with a given variable (data column). (Inherited from BaseFilter<TOptions, TFilter>.)
NumberOfInputs | Gets the number of inputs accepted by the model. (Inherited from BaseFilter<TOptions, TFilter>.)
NumberOfOutputs | Gets the number of outputs generated by the model.
Token | Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running. (Inherited from BaseFilter<TOptions, TFilter>.)

Methods

Name | Description
---|---
Add(TOptions) | Adds a new column options definition to the collection. (Inherited from BaseFilter<TOptions, TFilter>.)
Add(CodificationVariable) | Adds new column options to this filter's collection, specifying how a particular column should be processed by the filter.
Add(String, CodificationVariable) | Adds new column options to this filter's collection, specifying how a particular column should be processed by the filter.
Add(String, CodificationVariable, T) | Adds new column options to this filter's collection, specifying how a particular column should be processed by the filter.
Add(String, CodificationVariable, T) | Adds new column options to this filter's collection, specifying how a particular column should be processed by the filter.
Apply(DataTable) | Applies the Filter to a DataTable. (Inherited from BaseFilter<TOptions, TFilter>.)
Apply(DataTable, String) | Applies the Filter to a DataTable. (Inherited from BaseFilter<TOptions, TFilter>.)
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetEnumerator | Returns an enumerator that iterates through the collection. (Inherited from BaseFilter<TOptions, TFilter>.)
GetHashCode | Serves as the default hash function. (Inherited from Object.)
GetType | Gets the Type of the current instance. (Inherited from Object.)
Learn(DataTable, Double) | Learns a model that can map the given inputs to the desired outputs.
Learn(T, Double) | Learns a model that can map the given inputs to the desired outputs.
Learn(T, Double) | Learns a model that can map the given inputs to the desired outputs.
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
OnAddingOptions | Called when a new column options definition is being added. Can be used to validate or modify these options beforehand. (Overrides BaseFilter<TOptions, TFilter>.OnAddingOptions(TOptions).)
ProcessFilter | Processes the current filter. (Overrides BaseFilter<TOptions, TFilter>.ProcessFilter(DataTable).)
Revert(Int32) | Translates an integer (codeword) representation of the value of a given variable into its original value.
Revert(String, Int32) | Translates an integer (codeword) representation of the value of a given variable into its original value.
Revert(String, Int32) | Translates an integer (codeword) representation of the value of a given variable into its original value.
Revert(String, Int32) | Translates the integer (codeword) representations of the values of the given variables into their original values.
ToDouble | Converts this instance into a transform that can generate double[].
ToString | Returns a string that represents the current object. (Inherited from Object.)
Transform(T) | Translates an array of values into their integer representation, assuming values are given in original order of columns.
Transform(T) | Translates a value of the given variables into their integer (codeword) representation.
Transform(DataRow, String) | Translates an array of values into their integer representation, assuming values are given in original order of columns.
Transform(DataTable, String) | Translates an array of values into their integer representation, assuming values are given in original order of columns.
Transform(String, T) | Translates a value of a given variable into its integer (codeword) representation.
Transform(String, T) | Translates a value of the given variables into their integer (codeword) representation.
Transform(String, T) | Translates a value of the given variables into their integer (codeword) representation.
Transform(String, T) | Translates a value of the given variables into their integer (codeword) representation.
Transform(T, Double) | Applies the transformation to a set of input vectors, producing an associated set of output vectors.
Transform(T, Int32) | Translates a value of the given variables into their integer (codeword) representation.
Transform(DataRow, String, String) | Translates an array of values into their integer representation, assuming values are given in original order of columns.
Transform(DataTable, String, String) | Translates an array of values into their integer representation, assuming values are given in original order of columns.

Extension Methods

Name | Description
---|---
HasMethod | Checks whether an object implements a method with the given name. (Defined by ExtensionMethods.)
IsEqual | Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by Matrix.)
To(Type) | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
To<T> | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)

The codification filter performs an integer codification of classes given in string form. A unique integer identifier is assigned to each of the string classes.
Every Learn() method in the framework expects the class labels to be contiguous and zero-indexed, meaning that if there is a classification problem with n classes, all class labels must be numbers ranging from 0 to n-1. However, not every dataset comes in this format, so the data sometimes has to be pre-processed first. The example below shows how to use the Codification class to perform such pre-processing.

    using Accord.MachineLearning.VectorMachines.Learning;
    using Accord.Statistics.Filters;
    using Accord.Statistics.Kernels;

    // Let's say we have the following data to be classified
    // into three possible classes. Those are the samples:
    //
    double[][] inputs =
    {
        //               input         output
        new double[] { 0, 1, 1, 0 }, //  0
        new double[] { 0, 1, 0, 0 }, //  0
        new double[] { 0, 0, 1, 0 }, //  0
        new double[] { 0, 1, 1, 0 }, //  0
        new double[] { 0, 1, 0, 0 }, //  0
        new double[] { 1, 0, 0, 0 }, //  1
        new double[] { 1, 0, 0, 0 }, //  1
        new double[] { 1, 0, 0, 1 }, //  1
        new double[] { 0, 0, 0, 1 }, //  1
        new double[] { 0, 0, 0, 1 }, //  1
        new double[] { 1, 1, 1, 1 }, //  2
        new double[] { 1, 0, 1, 1 }, //  2
        new double[] { 1, 1, 0, 1 }, //  2
        new double[] { 0, 1, 1, 1 }, //  2
        new double[] { 1, 1, 1, 1 }, //  2
    };

    // Now, suppose that our class labels are not contiguous. We
    // have 3 classes, but they have the class labels 5, 1, and 8
    // respectively. In this case, we can use a Codification filter
    // to obtain a contiguous zero-indexed labeling before learning
    int[] output_labels =
    {
        5, 5, 5, 5, 5,
        1, 1, 1, 1, 1,
        8, 8, 8, 8, 8,
    };

    // Create a codification object to obtain an output mapping
    var codebook = new Codification<int>().Learn(output_labels);

    // Transform the original labels using the codebook
    int[] outputs = codebook.Transform(output_labels);

    // Create the multi-class learning algorithm for the machine
    var teacher = new MulticlassSupportVectorLearning<Gaussian>()
    {
        // Configure the learning algorithm to use SMO to train the
        // underlying SVMs in each of the binary class subproblems.
        Learner = (param) => new SequentialMinimalOptimization<Gaussian>()
        {
            // Estimate a suitable guess for the Gaussian kernel's parameters.
            // This estimate can serve as a starting point for a grid search.
            UseKernelEstimation = true
        }
    };

    // The following line is only needed to ensure reproducible results.
    // Please remove it to enable full parallelization
    teacher.ParallelOptions.MaxDegreeOfParallelism = 1; // (Remove, comment, or change this line to enable full parallelism)

    // Learn a machine
    var machine = teacher.Learn(inputs, outputs);

    // Obtain class predictions for each sample
    int[] predicted = machine.Decide(inputs);

    // Translate the integers back to the original labels
    int[] predicted_labels = codebook.Revert(predicted);

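
Conceptually, the codebook learned in the example above is a two-way mapping between the original labels and contiguous zero-based integers. The following is a minimal, self-contained sketch of that idea in plain C#. It does not use Accord.NET: the `LabelCodebook` class and its members are hypothetical names chosen for this illustration, not the library's implementation, and the sketch assumes that codes are assigned in order of first appearance.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration class (not part of Accord.NET): a two-way
// mapping between arbitrary labels and contiguous zero-based codes.
public sealed class LabelCodebook<T>
{
    private readonly Dictionary<T, int> toCode = new Dictionary<T, int>();
    private readonly List<T> toLabel = new List<T>();

    // Number of distinct labels seen so far.
    public int Count { get { return toLabel.Count; } }

    // Assigns the next free code to each label not seen before
    // (assumption of this sketch: codes follow order of first appearance).
    public LabelCodebook<T> Learn(IEnumerable<T> labels)
    {
        foreach (T label in labels)
        {
            if (!toCode.ContainsKey(label))
            {
                toCode[label] = toLabel.Count;
                toLabel.Add(label);
            }
        }
        return this;
    }

    // Label -> code; throws KeyNotFoundException for unknown labels.
    public int[] Transform(IEnumerable<T> labels)
    {
        return labels.Select(label => toCode[label]).ToArray();
    }

    // Code -> label; throws ArgumentOutOfRangeException for unknown codes.
    public T[] Revert(IEnumerable<int> codes)
    {
        return codes.Select(code => toLabel[code]).ToArray();
    }
}

public static class CodebookDemo
{
    public static void Main()
    {
        int[] labels = { 5, 5, 1, 8, 1 };
        var codebook = new LabelCodebook<int>().Learn(labels);

        int[] codes = codebook.Transform(labels);
        int[] restored = codebook.Revert(codes);

        Console.WriteLine(string.Join(", ", codes));    // 0, 0, 1, 2, 1
        Console.WriteLine(string.Join(", ", restored)); // 5, 5, 1, 8, 1
    }
}
```

Keeping both directions of the mapping is what makes it possible to report predictions in the original label space after training on the zero-indexed codes.
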
Most classifiers in the framework also expect the input data to be of the same nature, i.e. continuous. The codification filter can also be used to convert discrete, categorical, ordinal and baseline categorical variables into continuous vectors that can be fed to other machine learning algorithms, such as K-Means:

    using Accord.MachineLearning;
    using Accord.Statistics.Filters;

    Accord.Math.Random.Generator.Seed = 0;

    // Declare some mixed discrete and continuous observations
    double[][] observations =
    {
        //       (categorical) (discrete) (continuous)
        new double[] { 1, -1, -2.2 },
        new double[] { 1, -6, -5.5 },
        new double[] { 2,  1,  1.1 },
        new double[] { 2,  2,  1.2 },
        new double[] { 2,  2,  2.6 },
        new double[] { 3,  2,  1.4 },
        new double[] { 3,  4,  5.2 },
        new double[] { 1,  6,  5.1 },
        new double[] { 1,  6,  5.9 },
    };

    // Create a new codification algorithm to convert
    // the mixed variables above into all continuous:
    var codification = new Codification<double>()
    {
        CodificationVariable.Categorical,
        CodificationVariable.Discrete,
        CodificationVariable.Continuous
    };

    // Learn the codification from observations
    var model = codification.Learn(observations);

    // Transform the mixed observations into only continuous:
    double[][] newObservations = model.ToDouble().Transform(observations);

    // (newObservations will be equivalent to)
    double[][] expected =
    {
        //        (one hot)   (discrete) (continuous)
        new double[] { 1, 0, 0, -1, -2.2 },
        new double[] { 1, 0, 0, -6, -5.5 },
        new double[] { 0, 1, 0,  1,  1.1 },
        new double[] { 0, 1, 0,  2,  1.2 },
        new double[] { 0, 1, 0,  2,  2.6 },
        new double[] { 0, 0, 1,  2,  1.4 },
        new double[] { 0, 0, 1,  4,  5.2 },
        new double[] { 1, 0, 0,  6,  5.1 },
        new double[] { 1, 0, 0,  6,  5.9 },
    };

    // Create a new K-Means algorithm
    KMeans kmeans = new KMeans(k: 3);

    // Compute and retrieve the data centroids
    var clusters = kmeans.Learn(newObservations);

    // Use the centroids to partition all the data
    int[] labels = clusters.Decide(newObservations);

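
The one-hot columns in `expected` above can be read as follows: each distinct categorical value gets its own output column, which holds 1 when that value is present and 0 otherwise. Below is a minimal plain-C# sketch of that expansion for a single column. `OneHotSketch.Encode` is a hypothetical helper written for this illustration rather than an Accord.NET API, and it assumes that categories are ordered by first appearance.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration (not part of Accord.NET): expands one
// categorical column into one indicator column per distinct value.
public static class OneHotSketch
{
    public static double[][] Encode(double[] column)
    {
        // Map each distinct value to a column index, in order of first appearance.
        var index = new Dictionary<double, int>();
        foreach (double value in column)
        {
            if (!index.ContainsKey(value))
                index[value] = index.Count;
        }

        // Each row is all zeros except for a 1 in the column of its category.
        var result = new double[column.Length][];
        for (int i = 0; i < column.Length; i++)
        {
            result[i] = new double[index.Count];
            result[i][index[column[i]]] = 1.0;
        }
        return result;
    }

    public static void Main()
    {
        double[] categorical = { 1, 1, 2, 3 };
        foreach (double[] row in Encode(categorical))
            Console.WriteLine(string.Join(" ", row));
        // 1 0 0
        // 1 0 0
        // 0 1 0
        // 0 0 1
    }
}
```

Encoding categories this way avoids implying an ordering or distance between category codes, which matters for distance-based algorithms such as K-Means.
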
For more examples, please see the documentation page for the non-generic Codification filter.