Two of the most commonly used regression models are **linear regression** and **logistic regression**.

Both types of regression models are used to quantify the relationship between one or more predictor variables and a response variable, but they differ in several key ways.

Here's a summary of the differences:

**Difference #1: Type of Response Variable**

A linear regression model is used when the response variable takes on a continuous value such as:

- Price
- Height
- Age
- Distance

Conversely, a logistic regression model is used when the response variable takes on a categorical value such as:

- Yes or No
- Male or Female
- Win or Not Win

**Difference #2: Equation Used**

Linear regression uses the following equation to summarize the relationship between the predictor variable(s) and the response variable:

Y = β_{0} + β_{1}X_{1} + β_{2}X_{2} + … + β_{p}X_{p}

where:

- Y: The response variable
- X_{j}: The j^{th} predictor variable
- β_{j}: The average effect on Y of a one unit increase in X_{j}, holding all other predictors fixed
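
As a rough illustration of how this equation turns predictor values into a prediction, here is a minimal Python sketch. The function name `predict_linear` and the coefficient values are hypothetical and chosen only for the example; they do not come from a fitted model.

```python
def predict_linear(intercept, coefs, x):
    """Return beta_0 + beta_1*x_1 + ... + beta_p*x_p for one observation."""
    if len(coefs) != len(x):
        raise ValueError("need exactly one coefficient per predictor")
    return intercept + sum(b * xj for b, xj in zip(coefs, x))


# Hypothetical coefficients: beta_0 = 10, beta_1 = 2, beta_2 = 0.5
y_hat = predict_linear(10.0, [2.0, 0.5], [3.0, 4.0])
print(y_hat)  # 10 + 2*3 + 0.5*4 = 18.0

# Increasing X_1 by one unit while holding X_2 fixed changes the
# prediction by beta_1, which matches the interpretation of beta_j.
y_hat_plus_one = predict_linear(10.0, [2.0, 0.5], [4.0, 4.0])
print(y_hat_plus_one - y_hat)  # 2.0
```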

Conversely, logistic regression uses the following equation:

p(X) = e^{β_{0} + β_{1}X_{1} + β_{2}X_{2} + … + β_{p}X_{p}} / (1 + e^{β_{0} + β_{1}X_{1} + β_{2}X_{2} + … + β_{p}X_{p}})

This equation is used to predict the probability that an individual observation falls into a certain category.
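
The equation can be sketched in Python as follows. The function name `predict_probability` and the coefficient values in the usage lines are hypothetical. The code uses the algebraically equivalent form 1 / (1 + e^{-z}) when z is non-negative to avoid overflow for large values.

```python
import math


def predict_probability(intercept, coefs, x):
    """Return p(X) = e^z / (1 + e^z), where z = beta_0 + sum_j beta_j * x_j."""
    if len(coefs) != len(x):
        raise ValueError("need exactly one coefficient per predictor")
    z = intercept + sum(b * xj for b, xj in zip(coefs, x))
    if z >= 0:
        # Same value as e^z / (1 + e^z), but avoids computing a huge e^z.
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# When the linear part equals 0, the predicted probability is exactly 0.5.
print(predict_probability(0.0, [0.0], [5.0]))  # 0.5

# Hypothetical coefficients for two predictors; the result is always
# strictly between 0 and 1, unlike a linear regression prediction.
p = predict_probability(-4.0, [1.0, 0.05], [3.5, 28.0])
print(0.0 < p < 1.0)  # True
```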

**Difference #3: Method Used to Fit Equation**

Linear regression uses a method known as **ordinary least squares** to find the best fitting regression equation.

Conversely, logistic regression uses a method known as **maximum likelihood estimation** to find the best fitting regression equation.
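
To make the contrast concrete, here is a small sketch for the one-predictor case, written under a few assumptions. The function names and the toy data are hypothetical. Ordinary least squares is shown with its closed-form solution, while the likelihood is maximized with plain gradient ascent, which is just one simple way to do it; statistical software typically uses other optimizers.

```python
import math


def fit_ols_simple(xs, ys):
    """Closed-form least squares estimates (intercept, slope) for one predictor."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def log_likelihood(b0, b1, xs, ys):
    """Log-likelihood of 0/1 responses under a one-predictor logistic model."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = min(max(sigmoid(b0 + b1 * x), 1e-12), 1.0 - 1e-12)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total


def fit_logistic_simple(xs, ys, lr=0.1, steps=5000):
    """Approximate the maximum likelihood estimates by gradient ascent.

    lr and steps are illustrative settings chosen for the toy data below.
    """
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        residuals = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
        # Gradient of the average log-likelihood with respect to b0 and b1.
        g0 = sum(residuals) / n
        g1 = sum(r * x for r, x in zip(residuals, xs)) / n
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1


# Toy continuous response that lies exactly on the line y = 1 + 2x.
print(fit_ols_simple([1, 2, 3, 4], [3, 5, 7, 9]))  # (1.0, 2.0)

# Toy binary response; the classes overlap, so finite estimates exist.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 1, 0, 1, 1]
b0, b1 = fit_logistic_simple(xs, ys)
print(log_likelihood(b0, b1, xs, ys) > log_likelihood(0.0, 0.0, xs, ys))  # True
```

Least squares has a direct formula here, whereas the logistic coefficients have no closed-form solution and must be found iteratively.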

**Difference #4: Output to Predict**

Linear regression predicts a continuous value as the output. For example:

- Price ($150, $199, $400, etc.)
- Height (14 inches, 2 feet, 94.32 centimeters, etc.)
- Age (2 months, 6 years, 41.5 years, etc.)
- Distance (1.23 miles, 4.5 kilometers, etc.)

Conversely, logistic regression predicts probabilities as the output. For example:

- 40.3% chance of getting accepted to a university.
- 93.2% chance of winning a game.
- 34.2% chance of a law getting passed.

**When to Use Logistic vs. Linear Regression**

The following practice problems can help you gain a better understanding of when to use logistic regression or linear regression.

**Problem #1: Annual Income**

Suppose an economist wants to use the predictor variables (1) weekly hours worked and (2) years of education to predict the annual income of individuals.

In this scenario, he would use **linear regression** because the response variable (annual income) is continuous.

**Problem #2: University Acceptance**

Suppose a college admissions officer wants to use the predictor variables (1) GPA and (2) ACT score to predict the probability that a student will get accepted into a certain university.

In this scenario, she would use **logistic regression** because the response variable is categorical and can only take on two values: accepted or not accepted.

**Problem #3: Home Price**

Suppose a real estate agent wants to use the predictor variables (1) square footage, (2) number of bedrooms, and (3) number of bathrooms to predict the selling price of houses.

In this scenario, she would use **linear regression** because the response variable (price) is continuous.

**Problem #4: Spam Detection**

Suppose a computer programmer wants to use the predictor variables (1) number of words and (2) country of origin to predict the probability that a given email is spam.

In this scenario, he would use **logistic regression** because the response variable is categorical and can only take on two values: spam or not spam.

**Additional Resources**

The following tutorials offer more details on linear regression:

- Introduction to Simple Linear Regression
- Introduction to Multiple Linear Regression
- 4 Examples of Using Linear Regression in Real Life

The following tutorials offer more details on logistic regression: