GridSearch<TInput, TOutput> Class
Namespace: Accord.MachineLearning.Performance
The framework offers different ways to use grid search: one version is strongly typed using generics, and the other might need some manual casting. The example below shows how to perform grid search in a non-strongly typed way:
// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// Example binary data
double[][] inputs =
{
    new double[] { -1, -1 },
    new double[] { -1,  1 },
    new double[] {  1, -1 },
    new double[] {  1,  1 }
};

int[] xor = // xor labels
{
    -1, 1, 1, -1
};

// Instantiate a new Grid Search algorithm for Kernel Support Vector Machines
var gridsearch = new GridSearch<SupportVectorMachine<Polynomial>, double[], int>()
{
    // Here we can specify the range of the parameters to be included in the search
    ParameterRanges = new GridSearchRangeCollection()
    {
        new GridSearchRange("complexity", new double[] { 0.00000001, 5.20, 0.30, 0.50 } ),
        new GridSearchRange("degree", new double[] { 1, 10, 2, 3, 4, 5 } ),
        new GridSearchRange("constant", new double[] { 0, 1, 2 } )
    },

    // Indicate how learning algorithms for the models should be created
    Learner = (p) => new SequentialMinimalOptimization<Polynomial>
    {
        Complexity = p["complexity"],
        Kernel = new Polynomial((int)p["degree"], p["constant"])
    },

    // Define how the performance of the models should be measured
    Loss = (actual, expected, m) => new ZeroOneLoss(expected).Loss(actual)
};

// If needed, control the degree of CPU parallelization
gridsearch.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best model parameters
var result = gridsearch.Learn(inputs, xor);

// Get the best SVM found during the parameter search
SupportVectorMachine<Polynomial> svm = result.BestModel;

// Get an estimate for its error:
double bestError = result.BestModelError;

// Get the best values found for the model parameters:
double bestC = result.BestParameters["complexity"].Value;
double bestDegree = result.BestParameters["degree"].Value;
double bestConstant = result.BestParameters["constant"].Value;
The main disadvantage of the method above is the need to keep string identifiers for each of the parameters being searched. Furthermore, it is also necessary to keep track of their types in order to cast them accordingly when using them in the specification of the Learner property.
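The drawback described above can be illustrated without the framework at all. The following is a minimal sketch in which a plain Dictionary<string, double> stands in for a string-indexed parameter set; the class and method names are hypothetical and are not part of Accord.NET. It shows that the integer nature of "degree" has to be remembered by the caller, and that a misspelled identifier is only detected when the lookup runs:

```csharp
using System;
using System.Collections.Generic;

public static class StringKeyedParameters
{
    // Hypothetical stand-in for a string-indexed parameter set
    // (an assumption for illustration, not an Accord.NET type).
    public static Dictionary<string, double> Sample()
    {
        return new Dictionary<string, double>
        {
            { "complexity", 0.5 },
            { "degree", 3 },
            { "constant", 1 }
        };
    }

    // The caller has to remember that "degree" is conceptually
    // an integer and cast the stored double accordingly.
    public static int Degree(Dictionary<string, double> p)
    {
        return (int)p["degree"];
    }

    // A misspelled key compiles without complaint and only
    // fails when the lookup is actually executed.
    public static bool MisspelledKeyFailsAtRuntime(Dictionary<string, double> p)
    {
        try
        {
            double c = p["complexty"]; // typo: compiles, throws at run time
            Console.WriteLine(c);
            return false;
        }
        catch (KeyNotFoundException)
        {
            return true;
        }
    }
}
```

Both problems disappear once the parameters are exposed as typed properties, since the compiler then checks names and types.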
The next example shows how to perform grid-search in a strongly typed way:
// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// This sample shows how to use Grid-Search to assess the
// performance of Support Vector Machines.

// Consider the example binary data. We will be trying to learn a XOR
// problem and see how well SVMs perform on this data.

double[][] inputs =
{
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
};

int[] xor = // result of xor for the sample input data
{
    -1,  1,  1, -1,
    -1,  1,  1, -1,
    -1,  1,  1, -1,
    -1,  1,  1, -1,
};

// Create a new Grid-Search algorithm. Even though the generic,
// strongly-typed approach used across the framework is most of the
// time easier to handle, meta-algorithms such as grid-search can be
// a bit hard to set up. For this reason, the framework offers a
// specialized method for it:
var gridsearch = GridSearch<double[], int>.Create(

    // Here we can specify the range of the parameters to be included in the search
    ranges: new
    {
        Kernel = GridSearch.Values<IKernel>(new Linear(), new ChiSquare(), new Gaussian(), new Sigmoid()),
        Complexity = GridSearch.Values(0.00000001, 5.20, 0.30, 0.50),
        Tolerance = GridSearch.Range(1e-10, 1.0, stepSize: 0.05)
    },

    // Indicate how learning algorithms for the models should be created
    learner: (p) => new SequentialMinimalOptimization<IKernel>
    {
        Complexity = p.Complexity,
        Kernel = p.Kernel.Value,
        Tolerance = p.Tolerance
    },

    // Define how the model should be learned, if needed
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Define how the performance of the models should be measured
    loss: (actual, expected, m) => new ZeroOneLoss(expected).Loss(actual)
);

// If needed, control the degree of CPU parallelization
gridsearch.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best model parameters
var result = gridsearch.Learn(inputs, xor);

// Get the best SVM:
SupportVectorMachine<IKernel> svm = result.BestModel;

// Estimate its error:
double bestError = result.BestModelError;

// Get the best values for the parameters:
double bestC = result.BestParameters.Complexity;
double bestTolerance = result.BestParameters.Tolerance;
IKernel bestKernel = result.BestParameters.Kernel.Value;
The code above uses anonymous types and generics to create a specialized GridSearch<TModel, TInput, TOutput> class that keeps the anonymous type given as ParameterRanges. Its main disadvantage is the substantial increase in type complexity, which makes the use of the var keyword almost mandatory.
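The mechanism behind this can be sketched in a few lines of plain C#. The helper below is hypothetical and is not the framework's Create method; it only illustrates, under that assumption, how a generic type parameter can be inferred from an anonymous object so that a lambda receives typed properties instead of string-indexed values:

```csharp
using System;

public static class TypedSearchSketch
{
    // Hypothetical helper, not part of Accord.NET: TRange is inferred from
    // the anonymous object passed as 'ranges', so 'use' sees typed properties.
    public static TResult WithRanges<TRange, TResult>(TRange ranges, Func<TRange, TResult> use)
    {
        return use(ranges);
    }

    public static double Demo()
    {
        // The anonymous type has no name that can be written in source code,
        // which is why values that carry it are usually stored with 'var'.
        var ranges = new { Complexity = 0.5, Degree = 3 };

        // p.Complexity is a double and p.Degree is an int:
        // no string identifiers and no casts are involved.
        return WithRanges(ranges, p => p.Complexity * p.Degree);
    }
}
```

Because the anonymous type becomes part of every generic type derived from it, the full type of the resulting object cannot be spelled out by hand, which is the type-complexity cost mentioned above.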
It is also possible to create grid-search objects using convenience methods from the static GridSearch class:
// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// Example binary data
double[][] inputs =
{
    new double[] { -1, -1 },
    new double[] { -1,  1 },
    new double[] {  1, -1 },
    new double[] {  1,  1 }
};

int[] xor = // xor labels
{
    -1, 1, 1, -1
};

// Instantiate a new Grid Search algorithm for Kernel Support Vector Machines
var gridsearch = GridSearch<double[], int>.Create(

    ranges: new GridSearchRange[]
    {
        new GridSearchRange("complexity", new double[] { 0.00000001, 5.20, 0.30, 0.50 } ),
        new GridSearchRange("degree", new double[] { 1, 10, 2, 3, 4, 5 } ),
        new GridSearchRange("constant", new double[] { 0, 1, 2 } )
    },

    learner: (p) => new SequentialMinimalOptimization<Polynomial>
    {
        Complexity = p["complexity"],
        Kernel = new Polynomial((int)p["degree"].Value, p["constant"])
    },

    // Define how the model should be learned, if needed
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Define how the performance of the models should be measured
    loss: (actual, expected, m) => new ZeroOneLoss(expected).Loss(actual)
);

// If needed, control the degree of CPU parallelization
gridsearch.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best model parameters
var result = gridsearch.Learn(inputs, xor);

// Get the best SVM generated during the search
SupportVectorMachine<Polynomial> svm = result.BestModel;

// Get an estimate for its error:
double bestError = result.BestModelError;

// Get the best values for its parameters:
double bestC = result.BestParameters["complexity"].Value;
double bestDegree = result.BestParameters["degree"].Value;
double bestConstant = result.BestParameters["constant"].Value;
Finally, it is also possible to combine grid-search with CrossValidation<TModel, TInput, TOutput>, as shown in the examples below:
// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// This is a sample code showing how to use Grid-Search in combination with
// Cross-Validation to assess the performance of Support Vector Machines.

// Consider the example binary data. We will be trying to learn a XOR
// problem and see how well SVMs perform on this data.

double[][] inputs =
{
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
    new double[] { -1, -1 }, new double[] {  1, -1 },
    new double[] { -1,  1 }, new double[] {  1,  1 },
};

int[] xor = // result of xor for the sample input data
{
    -1,  1,  1, -1,
    -1,  1,  1, -1,
    -1,  1,  1, -1,
    -1,  1,  1, -1,
};

// Create a new Grid-Search with Cross-Validation algorithm. Even though the
// generic, strongly-typed approach used across the framework is most of the
// time easier to handle, combining both methods in a single call can be
// difficult. For this reason, the framework offers a specialized method for
// combining those two algorithms:
var gscv = GridSearch<double[], int>.CrossValidate(

    // Here we can specify the range of the parameters to be included in the search
    ranges: new
    {
        Complexity = GridSearch.Values(0.00000001, 5.20, 0.30, 0.50),
        Degree = GridSearch.Values(1, 10, 2, 3, 4, 5),
        Constant = GridSearch.Values(0, 1, 2),
    },

    // Indicate how learning algorithms for the models should be created
    learner: (p, ss) => new SequentialMinimalOptimization<Polynomial>
    {
        // Here, we can use the parameters we have specified above:
        Complexity = p.Complexity,
        Kernel = new Polynomial(p.Degree, p.Constant)
    },

    // Define how the model should be learned, if needed
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Define how the performance of the models should be measured
    loss: (actual, expected, r) => new ZeroOneLoss(expected).Loss(actual),

    folds: 3 // use k = 3 in k-fold cross validation
);

// If needed, control the parallelization degree
gscv.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best vector machine
var result = gscv.Learn(inputs, xor);

// Get the best cross-validation result:
var crossValidation = result.BestModel;

// Estimate its error:
double bestError = result.BestModelError;
double trainError = result.BestModel.Training.Mean;
double trainErrorVar = result.BestModel.Training.Variance;
double valError = result.BestModel.Validation.Mean;
double valErrorVar = result.BestModel.Validation.Variance;

// Get the best values for the parameters:
double bestC = result.BestParameters.Complexity;
double bestDegree = result.BestParameters.Degree;
double bestConstant = result.BestParameters.Constant;
// Ensure results are reproducible
Accord.Math.Random.Generator.Seed = 0;

// This is a sample code showing how to use Grid-Search in combination with
// Cross-Validation to assess the performance of Decision Trees with C4.5.

var parkinsons = new Parkinsons();
double[][] input = parkinsons.Features;
int[] output = parkinsons.ClassLabels;

// Create a new Grid-Search with Cross-Validation algorithm. Even though the
// generic, strongly-typed approach used across the framework is most of the
// time easier to handle, combining both methods in a single call can be
// difficult. For this reason, the framework offers a specialized method for
// combining those two algorithms:
var gscv = GridSearch.CrossValidate(

    // Here we can specify the range of the parameters to be included in the search
    ranges: new
    {
        Join = GridSearch.Range(fromInclusive: 1, toExclusive: 20),
        MaxHeight = GridSearch.Range(fromInclusive: 1, toExclusive: 20),
    },

    // Indicate how learning algorithms for the models should be created
    learner: (p, ss) => new C45Learning
    {
        // Here, we can use the parameters we have specified above:
        Join = p.Join,
        MaxHeight = p.MaxHeight,
    },

    // Define how the model should be learned, if needed
    fit: (teacher, x, y, w) => teacher.Learn(x, y, w),

    // Define how the performance of the models should be measured
    loss: (actual, expected, r) => new ZeroOneLoss(expected).Loss(actual),

    folds: 3, // use k = 3 in k-fold cross validation

    x: input, y: output // so the compiler can infer generic types
);

// If needed, control the parallelization degree
gscv.ParallelOptions.MaxDegreeOfParallelism = 1;

// Search for the best decision tree
var result = gscv.Learn(input, output);

// Get the best cross-validation result:
var crossValidation = result.BestModel;

// Get an estimate of its error:
double bestAverageError = result.BestModelError;
double trainError = result.BestModel.Training.Mean;
double trainErrorVar = result.BestModel.Training.Variance;
double valError = result.BestModel.Validation.Mean;
double valErrorVar = result.BestModel.Validation.Variance;

// Get the best values for the parameters:
int bestJoin = result.BestParameters.Join;
int bestHeight = result.BestParameters.MaxHeight;

// Use the best parameter values to create the final
// model using all the training and validation data:
var bestTeacher = new C45Learning
{
    Join = bestJoin,
    MaxHeight = bestHeight,
};

// Use the best parameters to create the final tree model:
DecisionTree finalTree = bestTeacher.Learn(input, output);
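Conceptually, a grid search evaluates every combination of candidate values and keeps the one with the lowest loss. The following is a rough, framework-independent sketch of that idea for two integer parameters; the class, method, and parameter names are illustrative assumptions, and the code does not reproduce how Accord.NET implements the search internally:

```csharp
using System;

public static class ExhaustiveSearchSketch
{
    // Evaluates 'loss' on every (join, height) combination and returns the
    // combination with the lowest loss as (join, height, loss).
    // Returns null when either candidate array is empty.
    // Illustrative only; this is not the Accord.NET implementation.
    public static Tuple<int, int, double> Best(
        int[] joins, int[] heights, Func<int, int, double> loss)
    {
        Tuple<int, int, double> best = null;

        foreach (int j in joins)
        {
            foreach (int h in heights)
            {
                double e = loss(j, h);

                // Keep the first combination that reaches the lowest loss
                if (best == null || e < best.Item3)
                    best = Tuple.Create(j, h, e);
            }
        }

        return best;
    }
}
```

In a cross-validated search, the loss supplied for each combination would typically be an error averaged over the validation folds, so the cost grows with the number of combinations multiplied by the number of folds.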