/ˈlɒdʒ.ɪ.stɪk rɪˈɡrɛʃ.ən/
noun … “predicting probabilities with a curve, not a line.”
Logistic Regression is a statistical and machine learning technique used for modeling the probability of a binary or categorical outcome based on one or more predictor variables. Unlike Linear Regression, which predicts continuous values, Logistic Regression maps predictions to probabilities constrained between 0 and 1 using the logistic (sigmoid) function. This makes it ideal for classification tasks, such as predicting whether a customer will churn, whether a tumor is malignant, or whether an email is spam.
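To make that squashing concrete, here is a minimal sketch of the sigmoid itself in plain NumPy; the input scores are illustrative, standing in for the linear part of a fitted model:

```python
import numpy as np

def sigmoid(z):
    # Logistic (sigmoid) function: squashes any real number into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative linear scores: large negatives map near 0, large positives near 1
z = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
print(sigmoid(z))  # ~[0.018 0.269 0.5 0.731 0.982]
```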
Mathematically, the model estimates the log-odds of the outcome as a linear combination of predictors:
log(p / (1 - p)) = β₀ + β₁X₁ + β₂X₂ + ... + βₙXₙ

Here, p is the probability of the positive class, β₀ the intercept, β₁ … βₙ the coefficients, and X₁ … Xₙ the predictor variables. The coefficients are typically estimated using Maximum Likelihood Estimation (MLE), which finds the parameter values that maximize the probability of observing the given data.
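To see the formula in action, here is a small worked example; the coefficients below are hypothetical, chosen only to illustrate the log-odds-to-probability conversion:

```python
import math

# Hypothetical fitted coefficients (illustrative only)
b0, b1 = -1.5, 0.8   # intercept and slope
x1 = 2.0             # a single predictor value

log_odds = b0 + b1 * x1                # -1.5 + 0.8 * 2 = 0.1
p = 1.0 / (1.0 + math.exp(-log_odds))  # invert the log-odds via the sigmoid
print(round(p, 3))                     # 0.525 -> slightly better than even odds
```

A useful side effect of the log-odds form: exp(β₁) = e^0.8 ≈ 2.23 is an odds ratio, so each one-unit increase in X₁ multiplies the odds of the positive class by about 2.23.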
Logistic Regression connects naturally to several statistical and machine learning concepts. It relies on Expectation Values to interpret predicted probabilities and on Variance to assess uncertainty, and it can be extended with regularization penalties in the style of Ridge Regression (L2) or Lasso Regression (L1) to prevent overfitting. Model quality is typically evaluated with the confusion matrix, ROC curves, and cross-entropy loss.
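As a sketch of how regularization and these evaluation metrics come together in practice (assuming scikit-learn is available; the synthetic dataset and the C value are illustrative, not recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, log_loss, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, purely for illustration
X, y = make_classification(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The L2 penalty plays the role of Ridge-style regularization;
# C is the inverse of regularization strength (smaller C -> stronger shrinkage)
model = LogisticRegression(penalty="l2", C=1.0).fit(X_train, y_train)

proba = model.predict_proba(X_test)[:, 1]  # predicted P(positive class)
print(confusion_matrix(y_test, model.predict(X_test)))
print("ROC AUC:", roc_auc_score(y_test, proba))
print("cross-entropy:", log_loss(y_test, proba))
```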
Example conceptual workflow for Logistic Regression (a code sketch of these steps follows at the end of this entry):
collect dataset with predictor variables and binary outcome
explore and preprocess data, including encoding categorical features
fit logistic regression model using Maximum Likelihood Estimation
evaluate predicted probabilities and classification accuracy
apply regularization if necessary to prevent overfitting
use model to predict probabilities and classify new observations

Intuitively, Logistic Regression is like a probabilistic switch: it translates a weighted sum of inputs into a likelihood, gently curving predictions between 0 and 1 rather than extending endlessly like a straight line. It turns linear relationships into interpretable probability forecasts, bridging numerical predictors and real-world categorical decisions.
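Putting the workflow together, here is a minimal end-to-end sketch; the churn-style dataset, column names, and values are all invented for illustration:

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# 1. collect dataset with predictor variables and a binary outcome (toy data)
df = pd.DataFrame({
    "tenure_months": [1, 24, 6, 36, 3, 48, 12, 2],
    "plan":          ["basic", "pro", "basic", "pro",
                      "basic", "pro", "basic", "basic"],
    "churned":       [1, 0, 1, 0, 1, 0, 0, 1],
})

# 2. preprocess: one-hot encode the categorical 'plan' feature
X = pd.get_dummies(df[["tenure_months", "plan"]])
y = df["churned"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# 3. fit: scikit-learn's solver maximizes the (penalized) log-likelihood
model = LogisticRegression().fit(X_train, y_train)

# 4. evaluate predicted probabilities and classification accuracy
print("accuracy:", accuracy_score(y_test, model.predict(X_test)))

# 5. regularization: LogisticRegression applies an L2 penalty by default (C=1.0)

# 6. predict the probability for a new, unseen observation
new = pd.DataFrame({"tenure_months": [5], "plan": ["pro"]})
new_X = pd.get_dummies(new).reindex(columns=X.columns, fill_value=0)
print("P(churn):", model.predict_proba(new_X)[0, 1])
```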