MultinomialLogisticRegressionAnalysis Class

Multinomial Logistic Regression Analysis

Inheritance Hierarchy

System.Object
  Accord.MachineLearning.TransformBase<Double[], Int32>
    Accord.Statistics.Analysis.MultinomialLogisticRegressionAnalysis

Namespace: Accord.Statistics.Analysis
Assembly: Accord.Statistics (in Accord.Statistics.dll) Version: 3.8.0

Syntax

[SerializableAttribute]
public class MultinomialLogisticRegressionAnalysis : TransformBase<double[], int>, 
	IMultivariateRegressionAnalysis, IMultivariateAnalysis, IAnalysis, ISupervisedLearning<MultinomialLogisticRegression, double[], double[]>, 
	ISupervisedLearning<MultinomialLogisticRegression, double[], int>

<SerializableAttribute>
Public Class MultinomialLogisticRegressionAnalysis
	Inherits TransformBase(Of Double(), Integer)
	Implements IMultivariateRegressionAnalysis, IMultivariateAnalysis, IAnalysis, ISupervisedLearning(Of MultinomialLogisticRegression, Double(), Double()), 
	ISupervisedLearning(Of MultinomialLogisticRegression, Double(), Integer)

The MultinomialLogisticRegressionAnalysis type exposes the following members.

Constructors

	Name	Description
	MultinomialLogisticRegressionAnalysis	Constructs a Multinomial Logistic Regression Analysis.
	MultinomialLogisticRegressionAnalysis(Double[][], Double[][])	Obsolete. Constructs a Multinomial Logistic Regression Analysis.
	MultinomialLogisticRegressionAnalysis(Double[][], Int32[])	Obsolete. Constructs a Multinomial Logistic Regression Analysis.
	MultinomialLogisticRegressionAnalysis(String[], String[])	Constructs a Multinomial Logistic Regression Analysis.
	MultinomialLogisticRegressionAnalysis(Double[][], Double[][], String[], String[])	Obsolete. Constructs a Multinomial Logistic Regression Analysis.
	MultinomialLogisticRegressionAnalysis(Double[][], Int32[], String[], String[])	Obsolete. Constructs a Multinomial Logistic Regression Analysis.

Properties

	Name	Description
	Array	Obsolete. Source data used in the analysis.
	ChiSquare	Gets the Chi-Square (Likelihood Ratio) Test for the model.
	Coefficients	Gets the collection of coefficients of the model.
	CoefficientValues	Gets the value of each coefficient.
	Confidences	Gets the Confidence Intervals (C.I.) for each coefficient found in the regression.
	Deviance	Gets the Deviance of the model.
	InputNames	Gets or sets the names of the input variables for the model.
	Inputs	Obsolete. Please use InputNames instead.
	Iterations	Gets or sets the maximum number of iterations to be performed by the regression algorithm. Default is 50.
	LogLikelihood	Gets the Log-Likelihood for the model.
	NumberOfInputs	Gets the number of inputs accepted by the model. (Inherited from TransformBase<TInput, TOutput>.)
	NumberOfOutputs	Gets the number of outputs generated by the model. (Inherited from TransformBase<TInput, TOutput>.)
	Output	Obsolete. Gets the dependent variable value for each of the source input points.
	OutputCount	Gets the number of outputs in the regression problem.
	OutputNames	Gets or sets the names of the output variables for the model.
	Outputs	Obsolete. Gets the dependent variable value for each of the source input points.
	Regression	Gets the Regression model created and evaluated by this analysis.
	Results	Obsolete. Gets the resulting values obtained by the regression model.
	Source	Obsolete. Source data used in the analysis.
	StandardErrors	Gets the Standard Error for each coefficient found during the logistic regression.
	Token	Gets or sets a cancellation token that can be used to stop the learning algorithm while it is running.
	Tolerance	Gets or sets the difference between two iterations of the regression algorithm when the algorithm should stop. The difference is calculated based on the largest absolute parameter change of the regression. Default is 1e-5.
	WaldTests	Gets the Wald Tests for each coefficient.

Methods

	Name	Description
	Compute	Obsolete. Computes the Multinomial Logistic Regression Analysis.
	Equals	Determines whether the specified object is equal to the current object. (Inherited from Object.)
	Finalize	Allows an object to try to free resources and perform other cleanup operations before it is reclaimed by garbage collection. (Inherited from Object.)
	GetHashCode	Serves as the default hash function. (Inherited from Object.)
	GetType	Gets the Type of the current instance. (Inherited from Object.)
	Learn(Double[][], Double[][], Double[])	Learns a model that can map the given inputs to the given outputs.
	Learn(Double[][], Int32[], Double[])	Learns a model that can map the given inputs to the given outputs.
	Learn(Int32[][], Int32[], Double[])	Learns a model that can map the given inputs to the given outputs.
	MemberwiseClone	Creates a shallow copy of the current Object. (Inherited from Object.)
	ToString	Returns a string that represents the current object. (Inherited from Object.)
	Transform(TInput[])	Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from TransformBase<TInput, TOutput>.)
	Transform(Double[])	Applies the transformation to an input, producing an associated output. (Overrides TransformBase<TInput, TOutput>.Transform(TInput).)
	Transform(TInput[], TOutput[])	Applies the transformation to a set of input vectors, producing an associated set of output vectors. (Inherited from TransformBase<TInput, TOutput>.)

Extension Methods

	Name	Description
	HasMethod	Checks whether an object implements a method with the given name. (Defined by ExtensionMethods.)
	IsEqual	Compares two objects for equality, performing an elementwise comparison if the elements are vectors or matrices. (Defined by Matrix.)
	To(Type)	Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)
	To<T>	Overloaded. Converts an object into another type, irrespective of whether the conversion can be done at compile time or not. This can be used to convert generic types to numeric types during runtime. (Defined by ExtensionMethods.)

Remarks

In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes.[1] That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.).

Multinomial logistic regression is known by a variety of other names, including multiclass LR, multinomial regression,[2] softmax regression, multinomial logit, maximum entropy (MaxEnt) classifier, and conditional maximum entropy model.
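
As an illustration of how such a model maps inputs to outcome probabilities, the block below writes out one common parameterization, the baseline-category form. The notation is introduced here and does not come from the library documentation: K is the number of classes, x the input vector, and \beta_{k,0} and \beta_k the intercept and weight vector of class k. It also assumes that class 0 is the reference class; which class a given implementation uses as the baseline should be checked against that implementation. This form is consistent with the first example in the Examples section, where three classes yield two rows of coefficients, each holding an intercept and one weight per input.

```latex
% Baseline-category multinomial logit for K classes.
% Assumption: class 0 is the reference class, so only K-1 coefficient rows are estimated.
\begin{align*}
P(y = k \mid x) &= \frac{\exp\left(\beta_{k,0} + \beta_k^\top x\right)}
                        {1 + \sum_{j=1}^{K-1} \exp\left(\beta_{j,0} + \beta_j^\top x\right)},
                  \qquad k = 1, \dots, K-1, \\
P(y = 0 \mid x) &= \frac{1}
                        {1 + \sum_{j=1}^{K-1} \exp\left(\beta_{j,0} + \beta_j^\top x\right)}.
\end{align*}
```

The K probabilities sum to one, and the predicted class is the one with the largest probability.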

References:

Wikipedia contributors. "Multinomial logistic regression." Wikipedia, The Free Encyclopedia, 1st April, 2015. Available at: https://en.wikipedia.org/wiki/Multinomial_logistic_regression

Examples

The first example shows how to reproduce a textbook example using categorical and categorical-with-baseline variables. Those variables can be transformed (factored) into their respective representations using the Codification class. Note, however, that although this example uses features of the Codification class, that class is not required to learn a MultinomialLogisticRegression model.

// This example downloads an example dataset from the web and learns a multinomial logistic 
// regression on it. However, please keep in mind that the Multinomial Logistic Regression 
// can also work without many of the elements that will be shown below, like the codebook, 
// DataTables, and a CsvReader. 

// Let's download an example dataset from the web to learn a multinomial logistic regression:
CsvReader reader = CsvReader.FromUrl("https://raw.githubusercontent.com/rlowrance/re/master/hsbdemo.csv", hasHeaders: true);

// Let's read the CSV into a DataTable. As mentioned above, this step
// can help, but is not necessarily required for learning the model:
DataTable table = reader.ToTable();

// We will learn an MLR model between the following input and output fields of this table:
string[] inputNames = new[] { "write", "ses" };
string[] outputNames = new[] { "prog" };

// Now let's create a codification codebook to convert the string fields in the data 
// into integer symbols. This is required because the MLR model can only learn from 
// numeric data, so strings have to be transformed first. We can force a particular
// interpretation for those columns if needed, as shown in the initializer below:
var codification = new Codification()
{
    { "write", CodificationVariable.Continuous },
    { "ses", CodificationVariable.CategoricalWithBaseline, new[] { "low", "middle", "high" } },
    { "prog", CodificationVariable.Categorical, new[] { "academic", "general" } },
};

// Learn the codification
codification.Learn(table);

// Now, transform symbols into a vector representation, growing the number of inputs:
double[][] x = codification.Transform(table, inputNames, out inputNames).ToDouble();
double[][] y = codification.Transform(table, outputNames, out outputNames).ToDouble();

// Create a new Multinomial Logistic Regression Analysis:
var analysis = new MultinomialLogisticRegressionAnalysis()
{
    InputNames = inputNames,
    OutputNames = outputNames,
};

// Learn the regression from the input and output pairs:
MultinomialLogisticRegression regression = analysis.Learn(x, y);

// Let's retrieve some information about what we just learned:
int coefficients = analysis.Coefficients.Count; // should be 9
int numberOfInputs = analysis.NumberOfInputs;   // should be 3
int numberOfOutputs = analysis.NumberOfOutputs; // should be 3

inputNames = analysis.InputNames; // should be "write", "ses: middle", "ses: high"
outputNames = analysis.OutputNames; // should be "prog: academic", "prog: general", "prog: vocation"

// The regression is best visualized when it is data-bound to a 
// Windows.Forms DataGridView or WPF DataGrid. You can get the
// values for all different coefficients and discrete values:

// DataGridBox.Show(regression.Coefficients); // uncomment this line

// You can get the matrix of coefficients:
double[][] coef = analysis.CoefficientValues;

// Should be equal to:
double[][] expectedCoef = new double[][]
{
    new double[] { 2.85217775752471, -0.0579282723520426, -0.533293368378012, -1.16283850605289 },
    new double[] { 5.21813357698422, -0.113601186660817, 0.291387041358367, -0.9826369387481 }
};

// And their associated standard errors:
double[][] stdErr = analysis.StandardErrors;

// Should be equal to:
double[][] expectedErr = new double[][]
{
    new double[] { -2.02458003380033, -0.339533576505471, -1.164084923948, -0.520961533343425, 0.0556314901718 },
    new double[] { -3.73971589217449, -1.47672790071382, -1.76795568348094, -0.495032307980058, 0.113563519656386 }
};

// We can also get statistics and hypothesis tests:
WaldTest[][] wald = analysis.WaldTests;        // should all have p < 0.05
ChiSquareTest chiSquare = analysis.ChiSquare;  // should be p=1.06300120956871E-08
double logLikelihood = analysis.LogLikelihood; // should be -179.98173272217591

// You can use the regression to predict the values:
int[] pred = regression.Transform(x);

// And get the accuracy of the prediction if needed:
var cm = GeneralConfusionMatrix.Estimate(regression, x, y.ArgMax(dimension: 1));

double acc = cm.Accuracy; // should be 0.61
double kappa = cm.Kappa;  // should be 0.2993487536492252

The second example shows how to learn a MultinomialLogisticRegressionAnalysis from the famous Fisher's Iris dataset. This example should demonstrate that Codification filters are not required to successfully learn multinomial logistic regression analyses.

// This example shows how to learn a multinomial logistic regression
// analysis in the famous Fisher's Iris dataset. It should serve to
// demonstrate that this class does not really need to be used with
// DataTables, Codification codebooks and other supplementary features.

Iris iris = new Iris();

// Load Fisher's Iris dataset:
double[][] x = iris.Instances;
int[] y = iris.ClassLabels;

// Create a new Multinomial Logistic Regression Analysis:
var analysis = new MultinomialLogisticRegressionAnalysis();

// Note: we could have passed the class names from iris.ClassNames and 
// variable names from iris.VariableNames during MLR instantiation as:
// 
// var analysis = new MultinomialLogisticRegressionAnalysis()
// {
//     InputNames = iris.VariableNames,
//     OutputNames = iris.ClassNames
// };

// However, this example is also intended to demonstrate that 
// those are not required when learning a regression analysis.

// Learn the regression from the input and output pairs:
MultinomialLogisticRegression regression = analysis.Learn(x, y);

// Let's retrieve some information about what we just learned:
int coefficients = analysis.Coefficients.Count; // should be 11
int numberOfInputs = analysis.NumberOfInputs;   // should be 4
int numberOfOutputs = analysis.NumberOfOutputs; // should be 3

string[] inputNames = analysis.InputNames; // should be "Input 1", "Input 2", "Input 3", "Input 4"
string[] outputNames = analysis.OutputNames; // should be "Class 0", "Class 1", "Class 2"

// The regression is best visualized when it is data-bound to a 
// Windows.Forms DataGridView or WPF DataGrid. You can get the
// values for all different coefficients and discrete values:

// DataGridBox.Show(regression.Coefficients); // uncomment this line

// You can get the matrix of coefficients:
double[][] coef = analysis.CoefficientValues;

// Should be equal to:
double[][] expectedCoef = new double[][]
{
    new double[] { 2.85217775752471, -0.0579282723520426, -0.533293368378012, -1.16283850605289 },
    new double[] { 5.21813357698422, -0.113601186660817, 0.291387041358367, -0.9826369387481 }
};

// And their associated standard errors:
double[][] stdErr = analysis.StandardErrors;

// Should be equal to:
double[][] expectedErr = new double[][]
{
    new double[] { -2.02458003380033, -0.339533576505471, -1.164084923948, -0.520961533343425, 0.0556314901718 },
    new double[] { -3.73971589217449, -1.47672790071382, -1.76795568348094, -0.495032307980058, 0.113563519656386 }
};

// We can also get statistics and hypothesis tests:
WaldTest[][] wald = analysis.WaldTests;        // should all have p < 0.05
ChiSquareTest chiSquare = analysis.ChiSquare;  // should be p=0
double logLikelihood = analysis.LogLikelihood; // should be -29.558338705646587

// You can use the regression to predict the values:
int[] pred = regression.Transform(x);

// And get the accuracy of the prediction if needed:
var cm = GeneralConfusionMatrix.Estimate(regression, x, y);

double acc = cm.Accuracy; // should be 0.94666666666666666
double kappa = cm.Kappa;  // should be 0.91999999999999982

Reference

Accord.Statistics.Analysis Namespace