GridSearch<TInput, TOutput> Class

Grid search procedure for automatic parameter tuning.
Inheritance Hierarchy
System.Object
  Accord.MachineLearning.Performance.GridSearch<TInput, TOutput>

Namespace:  Accord.MachineLearning.Performance
Assembly:  Accord.MachineLearning (in Accord.MachineLearning.dll) Version: 3.8.0
Syntax
public static class GridSearch<TInput, TOutput>

Type Parameters

TInput
The type of the input data. Default is double[].
TOutput
The type of the output data. Default is int.
Methods
  Name
Create<TModel, TLearner>(GridSearchRange[], CreateLearnerFromParameter<TLearner, GridSearchParameterCollection>, ComputeLoss<TOutput, TModel>, LearnNewModel<TLearner, TInput, TOutput, TModel>)  [public static]
Create<TRange, TModel, TLearner>(TRange, CreateLearnerFromParameter<TLearner, TRange>, ComputeLoss<TOutput, TModel>, LearnNewModel<TLearner, TInput, TOutput, TModel>)  [public static]
CrossValidate<TModel, TLearner>(GridSearchRange[], Func<GridSearchParameterCollection, DataSubset<TInput, TOutput>, TLearner>, ComputeLoss<TOutput, TModel>, LearnNewModel<TLearner, TInput, TOutput, TModel>, Int32)  [public static]
CrossValidate<TRange, TModel, TLearner>(TRange, Func<TRange, DataSubset<TInput, TOutput>, TLearner>, ComputeLoss<TOutput, TModel>, LearnNewModel<TLearner, TInput, TOutput, TModel>, Int32)  [public static]
Remarks
Grid Search looks for the combination of parameter values, taken from a range of candidates for each parameter, that produces the best-fitting model. The search is exhaustive: if there are two parameters, each with 10 possible values, Grid Search evaluates the model at every combination of values, resulting in 10 x 10 = 100 model fits.
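
The exhaustive evaluation described above can be sketched in plain C# without any Accord.NET types. This is only an illustration of the idea, not the framework's implementation: the class ExhaustiveGridSketch, its methods, and the quadratic loss are hypothetical names introduced here, and the "model fit" is replaced by a simple function of the parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of Accord.NET) illustrating an exhaustive search.
public static class ExhaustiveGridSketch
{
    // Enumerates the Cartesian product of the candidate values of each parameter.
    public static IEnumerable<double[]> Combinations(double[][] ranges)
    {
        IEnumerable<double[]> seed = new[] { new double[0] };
        return ranges.Aggregate(seed, (partial, values) =>
            partial.SelectMany(p => values.Select(v => p.Concat(new[] { v }).ToArray())));
    }

    // Evaluates the loss at every combination and keeps the one with the smallest loss.
    public static (double[] Best, double Loss, int Fits) Search(
        double[][] ranges, Func<double[], double> loss)
    {
        double[] best = null;
        double bestLoss = double.PositiveInfinity;
        int fits = 0;

        foreach (double[] candidate in Combinations(ranges))
        {
            double value = loss(candidate); // stands in for "fit a model and measure it"
            fits++;
            if (value < bestLoss)
            {
                bestLoss = value;
                best = candidate;
            }
        }

        return (best, bestLoss, fits);
    }

    public static void Main()
    {
        // Two parameters with 10 candidate values each: 0, 1, ..., 9
        double[] values = Enumerable.Range(0, 10).Select(i => (double)i).ToArray();
        double[][] ranges = { values, values };

        // Assumed loss for illustration only, with its minimum at (3, 7)
        var result = Search(ranges, p => Math.Pow(p[0] - 3, 2) + Math.Pow(p[1] - 7, 2));

        Console.WriteLine(result.Fits);                    // 100
        Console.WriteLine(string.Join(", ", result.Best)); // 3, 7
    }
}
```

The number of fits is the product of the sizes of all ranges, so adding a third parameter with 10 values would raise the count to 1000.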
Examples

The framework offers different ways to use grid search: one version is strongly typed using generics, while the other identifies parameters by string names and may need some manual casting. The example below shows how to perform grid-search in a non-strongly typed way:

// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// Example binary data
double[][] inputs =
{
    new double[] { -1, -1 },
    new double[] { -1,  1 },
    new double[] {  1, -1 },
    new double[] {  1,  1 }
};

int[] xor = // xor labels
{
    -1, 1, 1, -1
};

// Instantiate a new Grid Search algorithm for Kernel Support Vector Machines
var gridsearch = new GridSearch<SupportVectorMachine<Polynomial>, double[], int>()
{
    // Here we can specify the range of the parameters to be included in the search
    ParameterRanges = new GridSearchRangeCollection()
    {
        new GridSearchRange("complexity", new double[] { 0.00000001, 5.20, 0.30, 0.50 } ),
        new GridSearchRange("degree",     new double[] { 1, 10, 2, 3, 4, 5 } ),
        new GridSearchRange("constant",   new double[] { 0, 1, 2 } )
    },

    // Indicate how learning algorithms for the models should be created
    Learner = (p) => new SequentialMinimalOptimization<Polynomial>
    {
        Complexity = p["complexity"],
        Kernel = new Polynomial((int)p["degree"], p["constant"])
    },

    // Define how the performance of the models should be measured
    Loss = (actual, expected, m) => new ZeroOneLoss(expected).Loss(actual)
};

// If needed, control the degree of CPU parallelization
gridsearch.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best model parameters
var result = gridsearch.Learn(inputs, xor);

// Get the best SVM found during the parameter search
SupportVectorMachine<Polynomial> svm = result.BestModel;

// Get an estimate for its error:
double bestError = result.BestModelError;

// Get the best values found for the model parameters:
double bestC = result.BestParameters["complexity"].Value;
double bestDegree = result.BestParameters["degree"].Value;
double bestConstant = result.BestParameters["constant"].Value;

The main disadvantage of the method above is the need to keep string identifiers for each of the parameters being searched. Furthermore, it is also necessary to keep track of their types in order to cast them accordingly when using them in the specification of the Learner property.

The next example shows how to perform grid-search in a strongly typed way:

// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// This sample code shows how to use Grid-Search to tune the
// parameters of Support Vector Machines.

// Consider the example binary data. We will try to learn a XOR
// problem and see how well SVMs perform on this data.

double[][] inputs =
{
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
};

int[] xor = // result of xor for the sample input data
{
    -1,       1,
     1,      -1,
    -1,       1,
     1,      -1,
    -1,       1,
     1,      -1,
    -1,       1,
     1,      -1,
};

// Create a new Grid-Search algorithm. Even though the generic, strongly-typed
// approach used across the framework is most of the time easier to handle,
// meta-algorithms such as grid-search can be a bit hard to set up. For this
// reason, the framework offers a specialized method for it:
var gridsearch = GridSearch<double[], int>.Create(

    // Here we can specify the range of the parameters to be included in the search
    ranges: new
    {
        Kernel = GridSearch.Values<IKernel>(new Linear(), new ChiSquare(), new Gaussian(), new Sigmoid()),
        Complexity = GridSearch.Values(0.00000001, 5.20, 0.30, 0.50),
        Tolerance = GridSearch.Range(1e-10, 1.0, stepSize: 0.05)
    },

    // Indicate how learning algorithms for the models should be created
    learner: (p) => new SequentialMinimalOptimization<IKernel>
    {
        Complexity = p.Complexity,
        Kernel = p.Kernel.Value,
        Tolerance = p.Tolerance
    },

    // Define how the model should be learned, if needed
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Define how the performance of the models should be measured
    loss: (actual, expected, m) => new ZeroOneLoss(expected).Loss(actual)
);

// If needed, control the degree of CPU parallelization
gridsearch.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best model parameters
var result = gridsearch.Learn(inputs, xor);

// Get the best SVM:
SupportVectorMachine<IKernel> svm = result.BestModel;

// Estimate its error:
double bestError = result.BestModelError;

// Get the best values for the parameters:
double bestC = result.BestParameters.Complexity;
double bestTolerance = result.BestParameters.Tolerance;
IKernel bestKernel = result.BestParameters.Kernel.Value;

The code above uses anonymous types and generics to create a specialized GridSearch<TModel, TInput, TOutput> class that keeps the anonymous type given as ParameterRanges. Its main disadvantage is the considerable increase in type complexity, which makes the use of the var keyword almost mandatory.
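
The reason var becomes almost mandatory can be seen in a small sketch that uses only standard C# features. The names AnonymousRangesSketch and CountParameters are hypothetical and are not part of Accord.NET; the sketch assumes only that the ranges are passed as an anonymous type to a generic method, as in the example above.

```csharp
using System;

// Hypothetical illustration (not part of Accord.NET) of anonymous types and generics.
public static class AnonymousRangesSketch
{
    // A generic method can accept an anonymous type because the compiler
    // infers TRange at the call site; the caller never has to name the type.
    public static int CountParameters<TRange>(TRange ranges)
    {
        return typeof(TRange).GetProperties().Length;
    }

    public static void Main()
    {
        // The anonymous type has no name that can be written in source code,
        // so the variable holding it has to be declared with var.
        var ranges = new
        {
            Complexity = new[] { 0.3, 0.5 },
            Degree = new[] { 1, 2, 3 }
        };

        // Members are checked at compile time: no string keys or casts are needed.
        double firstComplexity = ranges.Complexity[0];
        int lastDegree = ranges.Degree[ranges.Degree.Length - 1];

        Console.WriteLine(firstComplexity);
        Console.WriteLine(lastDegree);
        Console.WriteLine(CountParameters(ranges)); // one property per parameter
    }
}
```

Any type built on top of such a range, including the result of a search, inherits the unnameable type argument, which is why the examples declare gridsearch and result with var.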

It is also possible to create grid-search objects using convenience methods from the static GridSearch class:

// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// Example binary data
double[][] inputs =
{
    new double[] { -1, -1 },
    new double[] { -1,  1 },
    new double[] {  1, -1 },
    new double[] {  1,  1 }
};

int[] xor = // xor labels
{
    -1, 1, 1, -1
};

// Instantiate a new Grid Search algorithm for Kernel Support Vector Machines
var gridsearch = GridSearch<double[], int>.Create(

    ranges: new GridSearchRange[]
    {
        new GridSearchRange("complexity", new double[] { 0.00000001, 5.20, 0.30, 0.50 } ),
        new GridSearchRange("degree",     new double[] { 1, 10, 2, 3, 4, 5 } ),
        new GridSearchRange("constant",   new double[] { 0, 1, 2 } )
    },

    learner: (p) => new SequentialMinimalOptimization<Polynomial>
    {
        Complexity = p["complexity"],
        Kernel = new Polynomial((int)p["degree"].Value, p["constant"])
    },

    // Define how the model should be learned, if needed
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Define how the performance of the models should be measured
    loss: (actual, expected, m) => new ZeroOneLoss(expected).Loss(actual)
);

// If needed, control the degree of CPU parallelization
gridsearch.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best model parameters
var result = gridsearch.Learn(inputs, xor);

// Get the best SVM generated during the search
SupportVectorMachine<Polynomial> svm = result.BestModel;

// Get an estimate for its error:
double bestError = result.BestModelError;

// Get the best values for its parameters:
double bestC = result.BestParameters["complexity"].Value;
double bestDegree = result.BestParameters["degree"].Value;
double bestConstant = result.BestParameters["constant"].Value;

Finally, it is also possible to combine grid-search with CrossValidation<TModel, TInput, TOutput>, as shown in the examples below:

// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// This sample code shows how to use Grid-Search in combination with
// Cross-Validation to assess the performance of Support Vector Machines.

// Consider the example binary data. We will try to learn a XOR
// problem and see how well SVMs perform on this data.

double[][] inputs =
{
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
};

int[] xor = // result of xor for the sample input data
{
    -1,       1,
     1,      -1,
    -1,       1,
     1,      -1,
    -1,       1,
     1,      -1,
    -1,       1,
     1,      -1,
};

// Create a new Grid-Search with Cross-Validation algorithm. Even though the
// generic, strongly-typed approach used across the framework is most of the
// time easier to handle, combining both methods in a single call can be
// difficult. For this reason, the framework offers a specialized method for
// combining the two algorithms:
var gscv = GridSearch<double[], int>.CrossValidate(

    // Here we can specify the range of the parameters to be included in the search
    ranges: new
    {
        Complexity = GridSearch.Values(0.00000001, 5.20, 0.30, 0.50),
        Degree = GridSearch.Values(1, 10, 2, 3, 4, 5),
        Constant = GridSearch.Values(0, 1, 2),
    },

    // Indicate how learning algorithms for the models should be created
    learner: (p, ss) => new SequentialMinimalOptimization<Polynomial>
    {
        // Here, we can use the parameters we have specified above:
        Complexity = p.Complexity,
        Kernel = new Polynomial(p.Degree, p.Constant)
    },

    // Define how the model should be learned, if needed
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Define how the performance of the models should be measured
    loss: (actual, expected, r) => new ZeroOneLoss(expected).Loss(actual),

    folds: 3 // use k = 3 in k-fold cross validation
);

// If needed, control the parallelization degree
gscv.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best vector machine
var result = gscv.Learn(inputs, xor);

// Get the best cross-validation result:
var crossValidation = result.BestModel;

// Estimate its error:
double bestError = result.BestModelError;
double trainError = result.BestModel.Training.Mean;
double trainErrorVar = result.BestModel.Training.Variance;
double valError = result.BestModel.Validation.Mean;
double valErrorVar = result.BestModel.Validation.Variance;

// Get the best values for the parameters:
double bestC = result.BestParameters.Complexity;
double bestDegree = result.BestParameters.Degree;
double bestConstant = result.BestParameters.Constant;

The second example applies the same approach to decision trees learned with C4.5:

// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// This sample code shows how to use Grid-Search in combination with
// Cross-Validation to assess the performance of Decision Trees with C4.5.

var parkinsons = new Parkinsons();
double[][] input = parkinsons.Features;
int[] output = parkinsons.ClassLabels;

// Create a new Grid-Search with Cross-Validation algorithm. Even though the
// generic, strongly-typed approach used across the framework is most of the
// time easier to handle, combining both methods in a single call can be
// difficult. For this reason, the framework offers a specialized method for
// combining the two algorithms:
var gscv = GridSearch.CrossValidate(

    // Here we can specify the range of the parameters to be included in the search
    ranges: new
    {
        Join = GridSearch.Range(fromInclusive: 1, toExclusive: 20),
        MaxHeight = GridSearch.Range(fromInclusive: 1, toExclusive: 20),
    },

    // Indicate how learning algorithms for the models should be created
    learner: (p, ss) => new C45Learning
    {
        // Here, we can use the parameters we have specified above:
        Join = p.Join,
        MaxHeight = p.MaxHeight,
    },

    // Define how the model should be learned, if needed
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Define how the performance of the models should be measured
    loss: (actual, expected, r) => new ZeroOneLoss(expected).Loss(actual),

    folds: 3, // use k = 3 in k-fold cross validation

    x: input, y: output // so the compiler can infer generic types
);

// If needed, control the parallelization degree
gscv.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best decision tree
var result = gscv.Learn(input, output);

// Get the best cross-validation result:
var crossValidation = result.BestModel;

// Get an estimate of its error:
double bestAverageError = result.BestModelError;

double trainError = result.BestModel.Training.Mean;
double trainErrorVar = result.BestModel.Training.Variance;
double valError = result.BestModel.Validation.Mean;
double valErrorVar = result.BestModel.Validation.Variance;

// Get the best values for the parameters:
int bestJoin = result.BestParameters.Join;
int bestHeight = result.BestParameters.MaxHeight;

// Use the best parameter values to create the final 
// model using all the training and validation data:
var bestTeacher = new C45Learning
{
    Join = bestJoin,
    MaxHeight = bestHeight,
};

// Use the best parameters to create the final tree model:
DecisionTree finalTree = bestTeacher.Learn(input, output);
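
The folds argument used in the cross-validation examples above controls how many train/validation splits are evaluated for each parameter combination. The sketch below shows the general k-fold idea in plain C#. It is not the framework's implementation: KFoldSketch and its methods are hypothetical names, and the round-robin assignment of samples to folds is an assumption made for simplicity (a real implementation would typically shuffle the samples first).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration (not part of Accord.NET) of k-fold cross-validation.
public static class KFoldSketch
{
    // Assigns each of the n sample indices to one of k folds in round-robin order.
    public static int[][] Split(int n, int k)
    {
        var folds = new List<int>[k];
        for (int f = 0; f < k; f++)
            folds[f] = new List<int>();
        for (int i = 0; i < n; i++)
            folds[i % k].Add(i);
        return folds.Select(f => f.ToArray()).ToArray();
    }

    // Uses each fold once for validation and the remaining folds for training,
    // then averages the k losses returned by lossOnSplit(training, validation).
    public static double CrossValidate(int n, int k, Func<int[], int[], double> lossOnSplit)
    {
        int[][] folds = Split(n, k);
        double total = 0;
        for (int f = 0; f < k; f++)
        {
            int[] validation = folds[f];
            int[] training = Enumerable.Range(0, n).Except(validation).ToArray();
            total += lossOnSplit(training, validation);
        }
        return total / k;
    }

    public static void Main()
    {
        // 16 samples and 3 folds, matching the sizes used in the XOR example
        foreach (int[] fold in Split(16, 3))
            Console.WriteLine(string.Join(" ", fold));

        // Assumed stand-in loss: the fraction of samples held out for validation
        double mean = CrossValidate(16, 3,
            (train, val) => (double)val.Length / (train.Length + val.Length));
        Console.WriteLine(mean);
    }
}
```

With 3 folds, every parameter combination in the grid is fitted 3 times, so the total number of model fits is the size of the grid multiplied by the number of folds.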
See Also