The output of a logistic regression model is a function that predicts the probability of an event as a function of the input parameter. This post will only look at a simple logistic regression model with one predictor, but similar analysis applies to multiple regression with several predictors.

With intercept *a* and coefficient *b* on the predictor *x*, the predicted probability is

$$p(x) = \frac{1}{1 + \exp(-(a + bx))}$$

Here’s a plot of such a curve when *a* = 3 and *b* = 4.

## Flattest part

The curvature of the logistic curve is small at both extremes. As *x* comes in from negative infinity, the curvature increases, then decreases to zero, then increases again, then decreases as *x* goes to positive infinity. We quantified this statement in another post where we calculate the curvature. The curvature is zero at the point where the second derivative of *p*

$$p''(x) = \frac{b^2 e^{a+bx}\left(1 - e^{a+bx}\right)}{\left(1 + e^{a+bx}\right)^3}$$

is zero, which occurs when *x* = –*a*/*b*. At that point *p* = 1/2, so the curve is flattest where the probability crosses 1/2. In the graph above, this happens at *x* = –0.75.

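A little calculation shows that *p*′(*x*) = *b* *p*(*x*)(1 – *p*(*x*)), so the slope at the flattest part of the logistic curve, where *p* = 1/2, is simply *b*/4. "Flattest" here means least curved; this is also the point where the curve is steepest.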
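The midpoint and the slope there can be checked numerically. The following is a minimal sketch, assuming the model has the form *p*(*x*) = 1/(1 + exp(–(*a* + *bx*))) and using the plot's values *a* = 3, *b* = 4; the function name `logistic` and the step size `h` are choices made for this illustration.

```python
import math

def logistic(x, a, b):
    """Assumed model: p(x) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

a, b = 3.0, 4.0      # values used for the plot in the post
x_mid = -a / b       # point where the second derivative vanishes

# Central finite difference as an estimate of the slope p'(x_mid).
h = 1e-6
slope = (logistic(x_mid + h, a, b) - logistic(x_mid - h, a, b)) / (2 * h)

print(x_mid)                    # -0.75
print(logistic(x_mid, a, b))    # 0.5
print(slope, b / 4)             # the two values should agree closely
```

With these values the finite-difference slope comes out close to *b*/4 = 1, which matches *b* *p*(1 – *p*) evaluated at *p* = 1/2.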

## Sensitivity to parameters

Now how much does the probability prediction *p*(*x*) change as the parameter *a* changes? We now need to consider *p* as a function of three variables, i.e. we need to consider *a* and *b* as additional variables. The marginal change in *p* in response to a change in *a* is the partial derivative of *p* with respect to *a*:

$$\frac{\partial p}{\partial a} = \frac{e^{a+bx}}{\left(1 + e^{a+bx}\right)^2}$$

To know where this is maximized with respect to *x*, we take the partial derivative of ∂*p*/∂*a* with respect to *x*

$$\frac{\partial^2 p}{\partial x\,\partial a} = \frac{b\, e^{a+bx}\left(1 - e^{a+bx}\right)}{\left(1 + e^{a+bx}\right)^3}$$

which is zero when *x* = –*a*/*b*, the same place where the logistic curve is flattest. And the partial of *p* with respect to *a* at that point is simply 1/4, independent of *b*. So a small change Δ*a* results in a change in *p* of approximately Δ*a*/4 at the flattest part of the logistic curve and results in less change elsewhere.
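As a quick check of the Δ*a*/4 rule, here is a small sketch, again assuming *p*(*x*) = 1/(1 + exp(–(*a* + *bx*))). The helper `dp_da`, the perturbation `delta_a`, and the sample points are hypothetical values chosen for illustration.

```python
import math

def logistic(x, a, b):
    """Assumed model: p(x) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

def dp_da(x, a, b):
    """Partial derivative of p with respect to a, written as p * (1 - p)."""
    p = logistic(x, a, b)
    return p * (1.0 - p)

a, b = 3.0, 4.0
x_mid = -a / b
delta_a = 0.01   # hypothetical small change in the intercept

# Actual change in the prediction at the midpoint versus the estimate delta_a / 4.
actual = logistic(x_mid, a + delta_a, b) - logistic(x_mid, a, b)
print(dp_da(x_mid, a, b))     # 0.25
print(actual, delta_a / 4)    # nearly equal

# The sensitivity to a is smaller at every other x.
samples = [-2.0, -1.0, -0.5, 0.0, 0.5]
for x in samples:
    print(x, dp_da(x, a, b))
```

Because ∂*p*/∂*a* = *p*(1 – *p*), the sensitivity never exceeds 1/4, whatever the value of *b*.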

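What about the dependence on *b*? That’s more complicated. The rate of change of *p* with respect to *b* is

$$\frac{\partial p}{\partial b} = \frac{x\, e^{a+bx}}{\left(1 + e^{a+bx}\right)^2}$$

and this is maximized where

$$\frac{\partial^2 p}{\partial x\,\partial b} = \frac{e^{a+bx}\left(1 + e^{a+bx} + bx\left(1 - e^{a+bx}\right)\right)}{\left(1 + e^{a+bx}\right)^3} = 0$$

which in turn requires solving a nonlinear equation. This is easy to do numerically in a specific case, but not easy to work with analytically in general.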
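To illustrate the numerical route for one specific case, the sketch below uses plain bisection with *a* = 3 and *b* = 4, assuming the model *p*(*x*) = 1/(1 + exp(–(*a* + *bx*))). It relies on the identity ∂*p*/∂*b* = *x* *p*(1 – *p*), whose *x*-derivative vanishes exactly where 1 + *bx*(1 – 2*p*) = 0. The function names and the brackets are choices made for this example, not part of the original post.

```python
import math

def dp_db(x, a, b):
    """Partial derivative of p with respect to b: x * p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-(a + b * x)))
    return x * p * (1.0 - p)

def critical_eq(x, a, b):
    """Zero exactly where d/dx of dp_db vanishes: 1 + b*x*(1 - 2p)."""
    p = 1.0 / (1.0 + math.exp(-(a + b * x)))
    return 1.0 + b * x * (1.0 - 2.0 * p)

def bisect(f, lo, hi, tol=1e-12):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid   # root lies in the upper half
        else:
            hi = mid                # root lies in the lower half
    return 0.5 * (lo + hi)

a, b = 3.0, 4.0
# One critical point lies left of the midpoint -a/b, one lies at positive x.
x_neg = bisect(lambda x: critical_eq(x, a, b), -2.0, -a / b)
x_pos = bisect(lambda x: critical_eq(x, a, b), 0.0, 2.0)
print(x_neg, dp_db(x_neg, a, b))
print(x_pos, dp_db(x_pos, a, b))
```

For these parameters the derivative is negative at the left critical point and positive at the right one. The left one has the larger magnitude, and it lies somewhat to the left of –*a*/*b* rather than at it.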

However, we can easily say how *p* changes with *b* at the point *x* = –*a*/*b*. This is not where the partial of *p* with respect to *b* is maximized, but it’s a place of interest because it has come up twice above. At that point the derivative of *p* with respect to *b* is *x*/4 = –*a*/(4*b*). So if *a* and *b* have the same sign, then a small increase in *b* will result in a small decrease in *p* and vice versa.
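The value –*a*/(4*b*) can be checked by finite difference as well. This is a sketch under the same assumed model, *p*(*x*) = 1/(1 + exp(–(*a* + *bx*))), holding *x* fixed at the original midpoint while *b* is perturbed; `h` and `delta_b` are hypothetical step sizes.

```python
import math

def logistic(x, a, b):
    """Assumed model: p(x) = 1 / (1 + exp(-(a + b*x)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

a, b = 3.0, 4.0
x_mid = -a / b              # kept fixed while b varies
predicted = -a / (4 * b)    # value of dp/db at x_mid stated in the text

# Central finite difference in b at the fixed point x_mid.
h = 1e-6
numeric = (logistic(x_mid, a, b + h) - logistic(x_mid, a, b - h)) / (2 * h)
print(numeric, predicted)

# a and b share a sign here, so increasing b should lower p slightly.
delta_b = 0.01
change = logistic(x_mid, a, b + delta_b) - logistic(x_mid, a, b)
print(change)               # negative
```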

Dear Dr Cook,

Thanks for your great blog, which I love reading.

I have a question regarding logistic regression: Am I correct to say that logistic regression forces the values on the y axis to be between 0 and 1, and that it estimates parameters beta-zero to beta-n (for multivariate logistic regression) just like in linear regression? In the one-predictor case one gets a sigmoid curve like the one shown in your post above, whereas in the two-predictor case one would get a “sigmoid plane”, etc. The point on the curve then gives us the probability of the event.

Could one then use logistic regression as a classification algorithm by applying this rule:

If (p > 0.5) then Class 1

?

Greetings from Germany

Thomas