BagOfVisualWords Class
Namespace: Accord.Imaging
[SerializableAttribute] public class BagOfVisualWords : BaseBagOfVisualWords<BagOfVisualWords, SpeededUpRobustFeaturePoint, double[], IUnsupervisedLearning<IClassifier<double[], int>, double[], int>, SpeededUpRobustFeaturesDetector>
The BagOfVisualWords type exposes the following members.

Constructors

| Name | Description |
|---|---|
| `BagOfVisualWords(Int32)` | Constructs a new BagOfVisualWords using a SURF feature detector to identify features. |
| `BagOfVisualWords(IUnsupervisedLearning<IClassifier<Double[], Int32>, Double[], Int32>)` | Constructs a new BagOfVisualWords using a SURF feature detector to identify features. |

Properties

| Name | Description |
|---|---|
| `Clustering` | Gets the clustering algorithm used to create this model. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Detector` | Gets the SURF feature point detector used to identify visual features in images. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `NumberOfInputs` | Gets the number of inputs accepted by the model. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `NumberOfOutputs` | Gets the number of outputs generated by the model. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `NumberOfWords` | Gets the number of words in this codebook. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `ParallelOptions` | Gets or sets the parallelization options for this algorithm. (Inherited from `ParallelLearningBase`.) |
| `Token` | Gets or sets a cancellation token that can be used to cancel the algorithm while it is running. (Inherited from `ParallelLearningBase`.) |

Methods

| Name | Description |
|---|---|
| `Compute(Bitmap)` | Obsolete. Computes the Bag of Words model. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Compute(TFeature)` | Obsolete. Computes the Bag of Words model. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Compute(Bitmap, Double)` | Obsolete. Computes the Bag of Words model. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Create(Int32)` | Creates a Bag-of-Words model using SURF and K-Means. |
| `Create<TClustering>(TClustering)` | Creates a Bag-of-Words model using the SURF feature detector and the given clustering algorithm. |
| `Create<TDetector, TClustering>(TDetector, Int32)` | Creates a Bag-of-Words model using the given feature detector and K-Means. |
| `Create<TDetector, TClustering>(TDetector, TClustering)` | Creates a Bag-of-Words model using the given feature detector and clustering algorithm. |
| `Create<TDetector, TClustering, TFeature>(TDetector, TClustering)` | Creates a Bag-of-Words model using the given feature detector and clustering algorithm. |
| `Create<TDetector, TClustering, TPoint, TFeature>(TDetector, TClustering)` | Creates a Bag-of-Words model using the given feature detector and clustering algorithm. |
| `Equals` | Determines whether the specified object is equal to the current object. (Inherited from `Object`.) |
| `Finalize` | Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from `Object`.) |
| `GetFeatureVector(List<TPoint>)` | Obsolete. Gets the codeword representation of a given image. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `GetFeatureVector(Bitmap)` | Obsolete. Gets the codeword representation of a given image. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `GetFeatureVector(UnmanagedImage)` | Obsolete. Gets the codeword representation of a given image. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `GetHashCode` | Serves as the default hash function. (Inherited from `Object`.) |
| `GetType` | Gets the Type of the current instance. (Inherited from `Object`.) |
| `Init` | Initializes this instance. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Learn(Bitmap, Double)` | Learns a model that can map the given inputs to the desired outputs. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Learn(TFeature, Double)` | Learns a model that can map the given inputs to the desired outputs. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Learn(UnmanagedImage, Double)` | Learns a model that can map the given inputs to the desired outputs. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Load(Stream)` | Obsolete. Loads a bag of words from a stream. |
| `Load(String)` | Obsolete. Loads a bag of words from a file. |
| `Load<TPoint>(Stream)` | Obsolete. Loads a bag of words from a stream. |
| `Load<TPoint>(String)` | Obsolete. Loads a bag of words from a file. |
| `Load<TPoint, TFeature>(Stream)` | Obsolete. Loads a bag of words from a stream. |
| `Load<TPoint, TFeature>(String)` | Obsolete. Loads a bag of words from a file. |
| `MemberwiseClone` | Creates a shallow copy of the current Object. (Inherited from `Object`.) |
| `Save(Stream)` | Obsolete. Saves the bag of words to a stream. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Save(String)` | Obsolete. Saves the bag of words to a file. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `ToString` | Returns a string that represents the current object. (Inherited from `Object`.) |
| `Transform(List<TPoint>)` | Applies the transformation to an input, producing an associated output. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(Bitmap)` | Applies the transformation to an input, producing an associated output. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(Bitmap)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(UnmanagedImage)` | Applies the transformation to an input, producing an associated output. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(UnmanagedImage)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(IList<TPoint>, Double)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(IList<TPoint>, Int32)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(Bitmap, Double)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(Bitmap, Int32)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(Bitmap, Double)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(Bitmap, Int32)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(UnmanagedImage, Double)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(UnmanagedImage, Int32)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(UnmanagedImage, Double)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |
| `Transform(UnmanagedImage, Int32)` | Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from `BaseBagOfVisualWords<TModel, TPoint, TFeature, TClustering, TDetector>`.) |

Extension Methods

| Name | Description |
|---|---|
| `HasMethod` | Checks whether an object implements a method with the given name. (Defined by `ExtensionMethods`.) |
| `IsEqual` | Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by `Matrix`.) |
| `To<T>` | Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by `ExtensionMethods`.) |

The bag-of-words (BoW) model can be used to transform data of variable length (e.g. the words in a text or the pixels in an image) into finite-dimensional vectors of fixed length. These vectors are usually referred to as representations, because they can be used in place of the original data as if they were the data itself. For example, Bag-of-Words makes it possible to transform a set of N images of varying sizes and dimensions into an N x C matrix, where C is the number of "visual words" used to represent each of the N images in the set.
The rows of this matrix can then be used in classification, clustering, or any other machine learning task that requires a fixed-length vector representation.
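
To make the fixed-length idea concrete, the following is a minimal sketch in plain C# that does not use Accord.NET at all. It assumes a codebook of C cluster centres has already been learned, assigns each descriptor of an image to its nearest centre, and counts the assignments. The class `BowSketch` and its methods are hypothetical names introduced for illustration, and the library's own implementation may differ (for example in how it normalizes or measures distances).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not part of Accord.NET.
public static class BowSketch
{
    // Squared Euclidean distance between two vectors of equal length.
    static double SquaredDistance(double[] a, double[] b)
    {
        double sum = 0;
        for (int i = 0; i < a.Length; i++)
        {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return sum;
    }

    // Index of the codeword (cluster centre) closest to the descriptor.
    public static int NearestWord(double[][] codebook, double[] descriptor)
    {
        int best = 0;
        double bestDistance = SquaredDistance(codebook[0], descriptor);
        for (int k = 1; k < codebook.Length; k++)
        {
            double distance = SquaredDistance(codebook[k], descriptor);
            if (distance < bestDistance)
            {
                best = k;
                bestDistance = distance;
            }
        }
        return best;
    }

    // Counts how many descriptors fall on each codeword. The result always
    // has codebook.Length entries, however many descriptors the image has.
    public static double[] Histogram(double[][] codebook, IList<double[]> descriptors)
    {
        var counts = new double[codebook.Length];
        foreach (double[] descriptor in descriptors)
            counts[NearestWord(codebook, descriptor)]++;
        return counts;
    }
}
```

With a two-word codebook, an image that yields three descriptors and an image that yields one descriptor both map to vectors of length 2, so stacking the results for N images gives the N x C matrix described above.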
The framework can compute BoW representations for images using any choice of feature extractor and clustering algorithm. By default, it uses the SURF feature detector and the K-Means clustering algorithm.
The first example shows how to create and use a BoW with default parameters.

    // Ensure results are reproducible
    Accord.Math.Random.Generator.Seed = 0;

    // The Bag-of-Visual-Words model converts images of arbitrary
    // size into fixed-length feature vectors. In this example, we
    // will be setting the codebook size to 10. This means all feature
    // vectors that will be generated will have the same length of 10.

    // By default, the BoW object will use the sparse SURF as the
    // feature extractor and K-means as the clustering algorithm.

    // Create a new Bag-of-Visual-Words (BoW) model
    BagOfVisualWords bow = new BagOfVisualWords(10);
    // Note: the BoW model can also be created using
    // var bow = BagOfVisualWords.Create(10);

    // Get some training images
    Bitmap[] images = GetImages();

    // Compute the model
    bow.Learn(images);

    // After this point, we will be able to translate
    // images into double[] feature vectors using
    double[][] features = bow.Transform(images);

After the representations have been extracted, it is possible to use them in arbitrary machine learning tasks, such as classification:

    // Now, the features can be used to train any classification
    // algorithm as if they were the images themselves. For example,
    // let's assume the first three images belong to a class and
    // the second three to another class. We can train an SVM using
    int[] labels = { -1, -1, -1, +1, +1, +1 };

    // Create the SMO algorithm to learn a Linear kernel SVM
    var teacher = new SequentialMinimalOptimization<Linear>()
    {
        Complexity = 10000 // make a hard margin SVM
    };

    // Obtain a learned machine
    var svm = teacher.Learn(features, labels);

    // Use the machine to classify the features
    bool[] output = svm.Decide(features);

    // Compute the error between the expected and predicted labels
    double error = new ZeroOneLoss(labels).Loss(output);

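
The error computed at the end of the example is a zero-one loss: each prediction counts as either right or wrong. The sketch below illustrates that idea in plain C#, assuming that a label of -1 corresponds to `false` and +1 to `true`. `ZeroOneSketch` is a hypothetical name, and this is not the library's `ZeroOneLoss` implementation; in particular, reporting the result as a fraction rather than a count is a choice made here.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Accord.NET.
public static class ZeroOneSketch
{
    // Fraction of predictions that disagree with the expected labels.
    // Labels greater than zero are treated as the positive class.
    public static double ErrorRate(int[] expected, bool[] predicted)
    {
        if (expected.Length == 0 || expected.Length != predicted.Length)
            throw new ArgumentException("Inputs must be non-empty and of equal length.");

        int mistakes = 0;
        for (int i = 0; i < expected.Length; i++)
        {
            bool expectedPositive = expected[i] > 0;
            if (expectedPositive != predicted[i])
                mistakes++;
        }
        return (double)mistakes / expected.Length;
    }
}
```

A value of 0 means every training sample was classified correctly. Because the example classifies the same features it was trained on, this is a training error and says little about performance on unseen images.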
By default, the BoW uses K-Means to cluster feature vectors. The next example shows how to use a different clustering algorithm, in this case Binary Split, when computing the BoW.

    // Ensure results are reproducible
    Accord.Math.Random.Generator.Seed = 0;

    // The Bag-of-Visual-Words model converts images of arbitrary
    // size into fixed-length feature vectors. In this example, we
    // will be setting the codebook size to 10. This means all feature
    // vectors that will be generated will have the same length of 10.

    // By default, the BoW object will use the sparse SURF as the
    // feature extractor and K-means as the clustering algorithm.
    // In this example, we will use the Binary-Split clustering
    // algorithm instead.

    // Create a new Bag-of-Visual-Words (BoW) model
    var bow = BagOfVisualWords.Create(new BinarySplit(10));

    // Since we are using generics, we can setup properties
    // of the binary split clustering algorithm directly:
    bow.Clustering.ComputeProportions = true;
    bow.Clustering.ComputeCovariances = false;

    // Get some training images
    Bitmap[] images = GetImages();

    // Compute the model
    bow.Learn(images);

    // After this point, we will be able to translate
    // images into double[] feature vectors using
    double[][] features = bow.Transform(images);

    // Now, the features can be used to train any classification
    // algorithm as if they were the images themselves. For example,
    // let's assume the first three images belong to a class and
    // the second three to another class. We can train an SVM using
    int[] labels = { -1, -1, -1, +1, +1, +1 };

    // Create the SMO algorithm to learn a Linear kernel SVM
    var teacher = new SequentialMinimalOptimization<Linear>()
    {
        Complexity = 10000 // make a hard margin SVM
    };

    // Obtain a learned machine
    var svm = teacher.Learn(features, labels);

    // Use the machine to classify the features
    bool[] output = svm.Decide(features);

    // Compute the error between the expected and predicted labels
    double error = new ZeroOneLoss(labels).Loss(output); // should be 0

By default, the BoW uses the SURF feature detector to extract sparse features from the images. However, it is also possible to use other detectors, including dense detectors such as HistogramsOfOrientedGradients.

    Accord.Math.Random.Generator.Seed = 0;

    // The Bag-of-Visual-Words model converts images of arbitrary
    // size into fixed-length feature vectors. In this example, we
    // will be setting the codebook size to 10. This means all feature
    // vectors that will be generated will have the same length of 10.

    // By default, the BoW object will use the sparse SURF as the
    // feature extractor and K-means as the clustering algorithm.
    // In this example, we will use the HOG feature extractor
    // and the Binary-Split clustering algorithm instead.

    // Create a new Bag-of-Visual-Words (BoW) model using HOG features
    var bow = BagOfVisualWords.Create(new HistogramsOfOrientedGradients(), new BinarySplit(10));

    // Get some training images
    Bitmap[] images = GetImages();

    // Compute the model
    bow.Learn(images);

    // After this point, we will be able to translate
    // images into double[] feature vectors using
    double[][] features = bow.Transform(images);

    // Now, the features can be used to train any classification
    // algorithm as if they were the images themselves. For example,
    // let's assume the first three images belong to a class and
    // the second three to another class. We can train an SVM using
    int[] labels = { -1, -1, -1, +1, +1, +1 };

    // Create the SMO algorithm to learn a Linear kernel SVM
    var teacher = new SequentialMinimalOptimization<Linear>()
    {
        Complexity = 100 // make a hard margin SVM
    };

    // Obtain a learned machine
    var svm = teacher.Learn(features, labels);

    // Use the machine to classify the features
    bool[] output = svm.Decide(features);

    // Compute the error between the expected and predicted labels
    double error = new ZeroOneLoss(labels).Loss(output); // should be 0

More advanced use cases are also supported. For example, some feature detectors describe image patches with other data types, such as byte vectors. In this case, the BoW can still be used, provided it is paired with a clustering algorithm that does not depend on Euclidean distances.
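
Byte-vector descriptors are typically compared by counting differing bits rather than by Euclidean distance. The following plain C# sketch shows one such bitwise Hamming distance. `HammingSketch` is a hypothetical name introduced here, and whether it matches the behaviour of the library's own distance classes exactly is an assumption.

```csharp
using System;

// Hypothetical helper for illustration only; not part of Accord.NET.
public static class HammingSketch
{
    // Number of bit positions at which the two descriptors differ.
    public static int BitDistance(byte[] x, byte[] y)
    {
        if (x.Length != y.Length)
            throw new ArgumentException("Descriptors must have the same length.");

        int distance = 0;
        for (int i = 0; i < x.Length; i++)
        {
            // XOR leaves a 1 in every bit position where x[i] and y[i] differ.
            int diff = x[i] ^ y[i];
            while (diff != 0)
            {
                distance += diff & 1;
                diff >>= 1;
            }
        }
        return distance;
    }
}
```

A distance of this kind only needs equality of bits, so it stays meaningful for descriptors whose byte values have no numeric ordering, which is why a mode-based clustering algorithm is a natural fit for them.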

    Accord.Math.Random.Generator.Seed = 0;

    // The Bag-of-Visual-Words model converts images of arbitrary
    // size into fixed-length feature vectors. In this example, we
    // will be setting the codebook size to 10. This means all feature
    // vectors that will be generated will have the same length of 10.

    // By default, the BoW object will use the sparse SURF as the
    // feature extractor and K-means as the clustering algorithm.
    // In this example, we will use the FREAK feature extractor
    // and the K-Modes clustering algorithm instead.

    // Create a new Bag-of-Visual-Words (BoW) model using FREAK binary features
    var bow = BagOfVisualWords.Create<FastRetinaKeypointDetector, KModes<byte>, byte[]>(
        new FastRetinaKeypointDetector(), new KModes<byte>(10, new Hamming()));

    // Get some training images
    Bitmap[] images = GetImages();

    // Compute the model
    bow.Learn(images);

    // After this point, we will be able to translate
    // images into double[] feature vectors using
    double[][] features = bow.Transform(images);

    // Now, the features can be used to train any classification
    // algorithm as if they were the images themselves. For example,
    // let's assume the first three images belong to a class and
    // the second three to another class. We can train an SVM using
    int[] labels = { -1, -1, -1, +1, +1, +1 };

    // Create the SMO algorithm to learn a Linear kernel SVM
    var teacher = new SequentialMinimalOptimization<Linear>()
    {
        Complexity = 1000 // make a hard margin SVM
    };

    // Obtain a learned machine
    var svm = teacher.Learn(features, labels);

    // Use the machine to classify the features
    bool[] output = svm.Decide(features);

    // Compute the error between the expected and predicted labels
    double error = new ZeroOneLoss(labels).Loss(output); // should be 0