HiddenConditionalRandomField<T> Class
Namespace: Accord.Statistics.Models.Fields
[SerializableAttribute] public class HiddenConditionalRandomField<T> : MulticlassClassifierBase<T[]>, ICloneable
The HiddenConditionalRandomField<T> type exposes the following members.

Constructors

Name | Description
---|---
HiddenConditionalRandomField<T> | Initializes a new instance of the HiddenConditionalRandomField<T> class.
HiddenConditionalRandomField<T>(IPotentialFunction<T>) | Initializes a new instance of the HiddenConditionalRandomField<T> class.

Properties

Name | Description
---|---
Function | Gets the potential function encompassing all feature functions for this model.
NumberOfClasses | Gets the number of classes expected and recognized by the classifier. (Inherited from ClassifierBase<TInput, TClasses>.)
NumberOfInputs | Gets the number of inputs accepted by the model. (Inherited from TransformBase<TInput, TOutput>.)
NumberOfOutputs | Gets the number of outputs generated by the model. (Inherited from TransformBase<TInput, TOutput>.)
Outputs | Obsolete. Gets the number of outputs assumed by the model.

Methods

Name | Description
---|---
Clone | Creates a new object that is a copy of the current instance.
Compute(T) | Obsolete. Computes the most likely output for the given observations.
Compute(T, Double) | Obsolete. Computes the most likely output for the given observations.
Compute(T, Double) | Obsolete. Computes the most likely output for the given observations.
Decide(TInput) | Computes class-label decisions for a given set of input vectors. (Inherited from ClassifierBase<TInput, TClasses>.)
Decide(T) | Computes a class-label decision for a given input. (Overrides ClassifierBase<TInput, TClasses>.Decide(TInput).)
Decide(TInput, TClasses) | Computes a class-label decision for a given input. (Inherited from ClassifierBase<TInput, TClasses>.)
Decide(TInput, Boolean) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Double) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Int32) | Computes class-label decisions for the given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decide(TInput, Double) | Computes a class-label decision for a given input. (Inherited from MulticlassClassifierBase<TInput>.)
Decode(T, Int32) | Computes the most likely state labels for the given observations, returning the overall sequence probability for this model.
Decode(T, Int32, Double) | Computes the most likely state labels for the given observations, returning the overall sequence probability for this model.
Equals | Determines whether the specified object is equal to the current object. (Inherited from Object.)
Finalize | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
GetHashCode | Serves as the default hash function. (Inherited from Object.)
GetType | Gets the Type of the current instance. (Inherited from Object.)
Load(Stream) | Obsolete. Loads a random field from a stream.
Load(String) | Obsolete. Loads a random field from a file.
LogLikelihood(T) | Computes the log-likelihood of the observations given this model.
LogLikelihood(T) | Computes the log-likelihood of the observations given this model.
LogLikelihood(T, Int32) | Computes the log-likelihood that the given observations belong to the desired output.
LogLikelihood(T, Int32) | Computes the log-likelihood that the given observations belong to the desired outputs.
LogLikelihood(T, Int32, Double) | Computes the log-likelihood that the given observations belong to the desired output.
LogLikelihood(T, Int32, Double) | Computes the log-likelihood that the given observations belong to the desired outputs.
LogPartition(T) | Computes the log-partition function ln Z(x).
LogPartition(T, Int32) | Computes the log-partition function ln Z(x,y).
MemberwiseClone | Creates a shallow copy of the current Object. (Inherited from Object.)
Partition(T) | Computes the partition function Z(x).
Partition(T, Int32) | Computes the partition function Z(x,y).
Save(Stream) | Obsolete. Saves the random field to a stream.
Save(String) | Obsolete. Saves the random field to a file.
ToMultilabel | Views this instance as a multi-label classifier, giving access to more advanced methods, such as the prediction of one-hot vectors. (Inherited from MulticlassClassifierBase<TInput>.)
ToString | Returns a string that represents the current object. (Inherited from Object.)
Transform(TInput) | Applies the transformation to an input, producing an associated output. (Inherited from ClassifierBase<TInput, TClasses>.)
Transform(TInput) | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from TransformBase<TInput, TOutput>.)
Transform(TInput, TClasses) | Applies the transformation to an input, producing an associated output. (Inherited from ClassifierBase<TInput, TClasses>.)
Transform(TInput, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Boolean) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Double) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)
Transform(TInput, Int32) | Applies the transformation to an input, producing an associated output. (Inherited from MulticlassClassifierBase<TInput>.)

Extension Methods

Name | Description
---|---
HasMethod | Checks whether an object implements a method with the given name. (Defined by ExtensionMethods.)
IsEqual | Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by Matrix.)
To(Type) | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
To<T> | Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)

Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning, where they are used for structured prediction. Whereas an ordinary classifier predicts a label for a single sample without regard to "neighboring" samples, a CRF can take context into account; e.g., the linear-chain CRF popular in natural language processing predicts sequences of labels for sequences of input samples.
While Conditional Random Fields can be seen as a generalization of Markov models, Hidden Conditional Random Fields can be seen as a generalization of Hidden Markov Model Classifiers. The (linear-chain) Conditional Random Field is the discriminative counterpart of the Markov model. An observable Markov model assumes the sequence of states y to be visible, rather than hidden, so it applies to a different set of problems than hidden Markov models do. Such models are often used for sequence component labeling, also known as part-of-sequence tagging: after a model has been trained, it is mostly used to tag parts of a sequence using the Viterbi algorithm. This is handy, for example, for classifying parts of a speech utterance, such as the phonemes inside an audio signal.
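The Partition and LogPartition members listed above are described as computing Z(x) and Z(x,y). As a sketch only, and assuming the usual textbook formulation of a hidden conditional random field (the symbols Ψ for the potential function, h for a hidden state path, and θ for the parameters are notation assumed here, not taken from this page), these quantities would relate to the class posterior as follows:

```latex
Z(x, y) = \sum_{h} \exp \Psi(y, h, x; \theta),
\qquad
Z(x) = \sum_{y'} Z(x, y'),
\qquad
p(y \mid x) = \frac{Z(x, y)}{Z(x)}
\;\;\Longrightarrow\;\;
\ln p(y \mid x) = \ln Z(x, y) - \ln Z(x).
```

Under that assumption, the log-likelihood of a class given a sequence is the difference between the two log-partition values, and deciding a class amounts to picking the y with the largest Z(x, y).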
```csharp
// Let's say we would like to build a very simple mechanism for gesture recognition.
// In this example, we will be trying to create a classifier that can distinguish
// between the words "hello", "car", and "wardrobe".

// Let's say we decided to acquire some data, and we asked some people to perform
// those words in front of a Kinect camera, and, using Microsoft's SDK, we were able
// to capture the x and y coordinates of each hand while the word was being performed.

// Let's say we decided to represent our frames as:
//
//    double[] frame = { leftHandX, leftHandY, rightHandX, rightHandY }; // 4 dimensions
//
// Since we captured words, this means we captured sequences of frames as we described
// above. Let's write some of those as rough examples to explain how gesture recognition
// can be done:

double[][] hello =
{
    new double[] { 1.0, 0.1, 0.0, 0.0 }, // let's say the word
    new double[] { 0.0, 1.0, 0.1, 0.1 }, // hello took 6 frames
    new double[] { 0.0, 1.0, 0.1, 0.1 }, // to be recorded.
    new double[] { 0.0, 0.0, 1.0, 0.0 },
    new double[] { 0.0, 0.0, 1.0, 0.0 },
    new double[] { 0.0, 0.0, 0.1, 1.1 },
};

double[][] car =
{
    new double[] { 0.0, 0.0, 0.0, 1.0 }, // the car word
    new double[] { 0.1, 0.0, 1.0, 0.1 }, // took only 4.
    new double[] { 0.0, 0.0, 0.1, 0.0 },
    new double[] { 1.0, 0.0, 0.0, 0.0 },
};

double[][] wardrobe =
{
    new double[] { 0.0, 0.0, 1.0, 0.0 }, // same for the
    new double[] { 0.1, 0.0, 1.0, 0.1 }, // wardrobe word.
    new double[] { 0.0, 0.1, 1.0, 0.0 },
    new double[] { 0.1, 0.0, 1.0, 0.1 },
};

// Please note that a real-world example would involve *lots* of samples for each word.
// Here, we are considering just one from each class, which is clearly sub-optimal and
// should _never_ be done in practice. Please keep in mind that we are doing it this way
// only to simplify this example on how to create and use HCRFs.
```
```csharp
// These are the words we have in our vocabulary:
double[][][] words = { hello, car, wardrobe };

// Now, let's associate integer labels with them. This is needed
// for the case where there are multiple samples for each word.
int[] labels = { 0, 1, 2 };

// Create a new learning algorithm to train the hidden Markov model sequence classifier
var teacher = new HiddenMarkovClassifierLearning<Independent<NormalDistribution>, double[]>()
{
    // Train each model until the log-likelihood changes less than 0.001
    Learner = (i) => new BaumWelchLearning<Independent<NormalDistribution>, double[]>()
    {
        Topology = new Forward(5), // this value can be found by trial-and-error

        // We will create our classifiers assuming an independent Gaussian distribution
        // for each component in our feature vectors (i.e., under a Naive Bayes assumption).
        Emissions = (s) => new Independent<NormalDistribution>(dimensions: 4), // 4 dimensions

        Tolerance = 0.001,
        Iterations = 100,

        // This is necessary so the code doesn't blow up when it realizes there is only one
        // sample per word class. But it could also be needed in normal situations:
        FittingOptions = new IndependentOptions()
        {
            InnerOption = new NormalOptions() { Regularization = 1e-5 }
        }
    }
};

// PS: In case you find exceptions trying to configure your model, you might want
//     to try disabling parallel processing to get more descriptive error messages:
// teacher.ParallelOptions.MaxDegreeOfParallelism = 1;

// Finally, we can run the learning algorithm!
var hmm = teacher.Learn(words, labels);
double logLikelihood = teacher.LogLikelihood;

// At this point, the classifier should be successfully
// able to distinguish between our three word classes:
//
int tc1 = hmm.Decide(hello);    // should be 0
int tc2 = hmm.Decide(car);      // should be 1
int tc3 = hmm.Decide(wardrobe); // should be 2
```
```csharp
// Now, we can use the Markov classifier to initialize a HCRF
var baseline = HiddenConditionalRandomField.FromHiddenMarkov(hmm);

// We can check that both are equivalent, although they have
// formulations that can be learned with different methods:
int[] predictedLabels = baseline.Decide(words);
```
```csharp
// Now we can learn the HCRF using one of the best learning
// algorithms available, Resilient Backpropagation learning:

// Create the Resilient Backpropagation learning algorithm
var rprop = new HiddenResilientGradientLearning<double[]>()
{
    Function = baseline.Function, // use the same HMM function

    Iterations = 50,
    Tolerance = 1e-5
};

// Run the algorithm and learn the models
var hcrf = rprop.Learn(words, labels);

// At this point, the HCRF should be successfully
// able to distinguish between our three word classes:
//
int hc1 = hcrf.Decide(hello);    // should be 0
int hc2 = hcrf.Decide(car);      // should be 1
int hc3 = hcrf.Decide(wardrobe); // should be 2
```
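The methods table lists LogPartition(T, Int32) as computing ln Z(x,y) for a given class. As a rough, library-independent illustration of how such per-class values could be turned into class probabilities, the sketch below normalizes them with the log-sum-exp trick, assuming the usual relation p(y|x) = Z(x,y) / Z(x) with Z(x) being the sum of Z(x,y) over all classes. The class name `PosteriorSketch`, the method `Posteriors`, and the numbers in `Main` are hypothetical and do not come from Accord.NET or from any trained model.

```csharp
using System;
using System.Linq;

public static class PosteriorSketch
{
    // Converts hypothetical per-class values ln Z(x, y) into posteriors p(y | x).
    // Subtracting the maximum before exponentiating avoids overflow/underflow.
    public static double[] Posteriors(double[] logZxy)
    {
        double max = logZxy.Max();

        // ln Z(x) = ln sum_y exp(ln Z(x, y)), evaluated in a numerically stable way
        double logZx = max + Math.Log(logZxy.Sum(v => Math.Exp(v - max)));

        // p(y | x) = exp(ln Z(x, y) - ln Z(x))
        return logZxy.Select(v => Math.Exp(v - logZx)).ToArray();
    }

    public static void Main()
    {
        // Made-up log-partition values for three classes (illustration only)
        double[] logZxy = { -3.2, -1.1, -4.5 };

        double[] p = Posteriors(logZxy);

        // The posteriors sum to one; the largest entry marks the decided class
        int decided = Array.IndexOf(p, p.Max());
        Console.WriteLine("sum = " + p.Sum() + ", decided class = " + decided);
    }
}
```

Working in the log domain this way is a common design choice for sequence models, since the unnormalized scores of long sequences can easily fall outside the range of a double.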
The next example shows how to use the learning algorithms on a real-world dataset, including training and testing on separate sets and evaluating the classifier's performance:
```csharp
// Ensure we get reproducible results
Accord.Math.Random.Generator.Seed = 0;

// Download the PENDIGITS dataset from UCI ML repository
var pendigits = new Pendigits(path: localDownloadPath);

// Get and pre-process the training set
double[][][] trainInputs = pendigits.Training.Item1;
int[] trainOutputs = pendigits.Training.Item2;

// Pre-process the digits so each of them is centered and scaled
trainInputs = trainInputs.Apply(Accord.Statistics.Tools.ZScores);
trainInputs = trainInputs.Apply((x) => x.Subtract(x.Min())); // make them positive

// Create some prior distributions to help initialize our parameters
var priorC = new WishartDistribution(dimension: 2, degreesOfFreedom: 5);
var priorM = new MultivariateNormalDistribution(dimension: 2);

// Create a new learning algorithm for creating continuous hidden Markov model classifiers
var teacher1 = new HiddenMarkovClassifierLearning<MultivariateNormalDistribution, double[]>()
{
    // This tells the generative algorithm how to train each of the component models. Note: The learning
    // algorithm is more efficient if all generic parameters are specified, including the fitting options
    Learner = (i) => new BaumWelchLearning<MultivariateNormalDistribution, double[], NormalOptions>()
    {
        Topology = new Forward(5), // Each model will have a forward topology with 5 states

        // Their emissions will be multivariate Normal distributions initialized using the prior distributions
        Emissions = (j) => new MultivariateNormalDistribution(mean: priorM.Generate(), covariance: priorC.Generate()),

        // We will train until the relative change in the average log-likelihood is less than 1e-6 between iterations
        Tolerance = 1e-6,
        MaxIterations = 1000, // or until we perform 1000 iterations (which is unlikely for this dataset)

        // We will prevent our covariance matrices from becoming degenerate by adding a small
        // regularization value to their diagonal until they become positive-definite again:
        FittingOptions = new NormalOptions()
        {
            Regularization = 1e-6
        }
    }
};

//// The following line is only needed to ensure reproducible results. Please remove it to enable full parallelization
//teacher1.ParallelOptions.MaxDegreeOfParallelism = 1; // (Remove, comment, or change this line to enable full parallelism)

// Use the learning algorithm to create a classifier
var hmmc = teacher1.Learn(trainInputs, trainOutputs);

// Create a new learning algorithm for creating HCRFs
var teacher2 = new HiddenResilientGradientLearning<double[]>()
{
    Function = new MarkovMultivariateFunction(hmmc),

    MaxIterations = 10
};

//// The following line is only needed to ensure reproducible results. Please remove it to enable full parallelization
//teacher2.ParallelOptions.MaxDegreeOfParallelism = 1; // (Remove, comment, or change this line to enable full parallelism)

// Use the learning algorithm to create a classifier
var hcrf = teacher2.Learn(trainInputs, trainOutputs);

// Compute predictions for the training set
int[] trainPredicted = hcrf.Decide(trainInputs);

// Check the performance of the classifier by comparing with the ground-truth:
var m1 = new GeneralConfusionMatrix(predicted: trainPredicted, expected: trainOutputs);
double trainAcc = m1.Accuracy; // should be 0.81532304173813608

// Prepare the testing set
double[][][] testInputs = pendigits.Testing.Item1;
int[] testOutputs = pendigits.Testing.Item2;

// Apply the same normalizations
testInputs = testInputs.Apply(Accord.Statistics.Tools.ZScores);
testInputs = testInputs.Apply((x) => x.Subtract(x.Min())); // make them positive

// Compute predictions for the test set
int[] testPredicted = hcrf.Decide(testInputs);

// Check the performance of the classifier by comparing with the ground-truth:
var m2 = new GeneralConfusionMatrix(predicted: testPredicted, expected: testOutputs);
double testAcc = m2.Accuracy; // should be 0.77061649319455561
```