Understanding LogisticRegression prediction details in Scikit-Learn

Below, I briefly show how coef_, intercept_, decision_function, predict_proba and predict are connected for a binary LogisticRegression model.

Assume we have trained a model like this:

>>> ...
>>> lrmodel = linear_model.LogisticRegression(C=0.1, class_weight='balanced')
>>> lrmodel.fit(X_train, y_train)

The model’s coef_ attribute holds the learned feature weights (w) and intercept_ holds the bias (b). The output of decision_function is then x · w + b, computed for every row x of the input:

>>> (X_test @ lrmodel.coef_[0].T + lrmodel.intercept_)[:5]
    array([-0.09915005,  0.17611527, -0.14162106, -0.03107271, -0.01813942])

>>> lrmodel.decision_function(X_test)[:5]
    array([-0.09915005,  0.17611527, -0.14162106, -0.03107271, -0.01813942])
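The same check can be reproduced end to end. The following is a minimal sketch, not the original setup: the data comes from make_classification (my assumption, standing in for X_train / X_test), and the labels are remapped to -1 / 1 to match the outputs above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data (an assumption; it stands in for the real train/test sets).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
y = np.where(y == 0, -1, 1)  # relabel to -1 / 1 as in this post

model = LogisticRegression(C=0.1, class_weight='balanced').fit(X, y)

# For a binary problem coef_ has a single row of weights and intercept_ one value.
print(model.coef_.shape)       # (1, 5)
print(model.intercept_.shape)  # (1,)

# x . w + b for every row of X, compared with decision_function.
manual = X @ model.coef_[0] + model.intercept_[0]
print(np.allclose(manual, model.decision_function(X)))
```

Note that coef_ is two-dimensional even in the binary case, which is why the weights are taken from coef_[0].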

Now, if we take the sigmoid of the decision function:

>>> def sigmoid(X): return 1 / (1 + np.exp(-X))

>>> sigmoid(X_test @ lrmodel.coef_[0].T + lrmodel.intercept_)[:5]
    array([ 0.47523277,  0.54391537,  0.46465379,  0.49223245,  0.49546527])

>>> sigmoid(lrmodel.decision_function(X_test))[:5]
    array([ 0.47523277,  0.54391537,  0.46465379,  0.49223245,  0.49546527])

we get the output of predict_proba, where each row holds the probabilities for classes -1 and 1. These numbers are exactly the second column (the positive class) here:

>>> lrmodel.predict_proba(X_test)[:5]
    array([[ 0.52476723,  0.47523277],
           [ 0.45608463,  0.54391537],
           [ 0.53534621,  0.46465379],
           [ 0.50776755,  0.49223245],
           [ 0.50453473,  0.49546527]])
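The first column is simply the complement of the second, and since 1 - sigmoid(z) = sigmoid(-z), both columns follow from the decision value alone. Here is a small sketch that checks this against the numbers printed above, using only the standard library. The tolerance is my choice, to allow for the rounding in the printed values.

```python
import math

def sigmoid(z):
    """Logistic function: maps a real-valued score to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Decision values and predict_proba rows copied from the outputs above.
scores = [-0.09915005, 0.17611527, -0.14162106, -0.03107271, -0.01813942]
reported = [
    (0.52476723, 0.47523277),
    (0.45608463, 0.54391537),
    (0.53534621, 0.46465379),
    (0.50776755, 0.49223245),
    (0.50453473, 0.49546527),
]

for z, (p_neg, p_pos) in zip(scores, reported):
    # Second column: probability of the positive class.
    assert math.isclose(sigmoid(z), p_pos, abs_tol=1e-7)
    # First column: the complement, which equals sigmoid(-z).
    assert math.isclose(sigmoid(-z), p_neg, abs_tol=1e-7)
    assert math.isclose(1.0 - sigmoid(z), p_neg, abs_tol=1e-7)
```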

Finally, the predict function:

>>> lrmodel.predict(X_test)[:5]
    array([-1,  1, -1, -1, -1], dtype=int64)

is equivalent to:

>>> [-1 if np.argmax(p) == 0 else 1 for p in lrmodel.predict_proba(X_test)][:5]
    [-1, 1, -1, -1, -1]

which, with only two classes, reduces to:

>>> [1 if p[1] > 0.5 else -1 for p in lrmodel.predict_proba(X_test)][:5]
    [-1, 1, -1, -1, -1]
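Because sigmoid(z) > 0.5 exactly when z > 0, the same labels can be read straight off the sign of the decision function, with classes_ supplying the label names. A minimal sketch on synthetic data follows; the dataset from make_classification and the variable names are my assumptions, not part of the original setup.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic binary data with labels -1 / 1 (an assumption, as in this post's outputs).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
y = np.where(y == 0, -1, 1)

model = LogisticRegression(C=0.1, class_weight='balanced').fit(X, y)
print(model.classes_)  # sorted labels; index 1 is the positive class

# A positive score selects classes_[1], otherwise classes_[0].
scores = model.decision_function(X)
from_scores = model.classes_[(scores > 0).astype(int)]

# The same labels via the most probable column of predict_proba.
from_proba = model.classes_[np.argmax(model.predict_proba(X), axis=1)]

print(np.array_equal(from_scores, model.predict(X)))
print(np.array_equal(from_proba, model.predict(X)))
```

Indexing classes_ rather than hard-coding -1 and 1 keeps the snippet valid for other label encodings such as 0 / 1. The two routes could only disagree on a sample whose probability is exactly 0.5, which is a tie.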

March 31, 2018
