10.3 Odds and Probability
Logistic regression estimates the probability that a record belongs to each class. In the binary case, it estimates the probability that the outcome variable is true and, as the complement, the probability that it is not true.
Logistic regression is designed to produce the odds of a record belonging to a class. These odds can then be converted to probabilities. Before explaining more about how logistic regression works, let’s review how odds are similar to, but different from, probabilities.
Formulas for Odds and Probability
Assume that in a sample of one thousand people, 400 have new phones. The remaining 600 do not have new phones. Calculate the odds of a person having a new phone. Then calculate the probability of a person having a new phone.
The components used to calculate odds and probabilities are the same: the number for, the number against, and their total. They are just combined in a different way.
Since odds and probabilities are ratios, the numbers in the ratio can be simplified by dividing by a common factor. In the phone example, dividing 400, 600, and 1,000 by 200 simplifies the counts to 2 for, 3 against, and 5 total.
Odds are a ratio of the number of fors (or successes) to the number of againsts (or failures). The equation for odds is for:against (e.g., 1:6). In the phone example, the odds of having a new phone are 2:3. Odds are also commonly expressed in words, as in 2 to 3 odds of having a new phone.
Data mining tools, however, express odds as a single number. This is done by dividing both the for number on the left and the against number on the right by the number on the right. For example, 2:3 converts to 2/3:3/3 = 0.6667:1. The 1 on the right side of the colon is implied, so this simplifies to 0.6667, which is the single-number form of the odds. As another example, odds of 6:1 have a single-number form of 6. Odds of 1:1 are even odds, which means the sample is evenly split between fors and againsts.
The equation for probability is #for / (#for + #against) which is also expressed as #successes / (#successes + #failures). For the phone example above, the probability of a person having a new phone is 2/5 = 0.40.
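The phone example can be checked with a short Python sketch. The function name `odds_and_probability` is a hypothetical helper introduced here for illustration, and the standard-library `Fraction` type is used only so that the simplified ratios (2/3 and 2/5) appear exactly.

```python
from fractions import Fraction


def odds_and_probability(n_for, n_against):
    """Return (odds, probability) as exact fractions from raw counts.

    Hypothetical helper for illustration only.
    odds        = #for / #against
    probability = #for / (#for + #against)
    """
    odds = Fraction(n_for, n_against)
    probability = Fraction(n_for, n_for + n_against)
    return odds, probability


# Phone example: 400 people have new phones, 600 do not.
odds, probability = odds_and_probability(400, 600)

print(odds)                    # simplified for:against ratio, 2/3
print(round(float(odds), 4))   # single-number form of the odds, 0.6667
print(probability)             # simplified probability, 2/5
print(float(probability))      # 0.4
```

The same two counts feed both results. Only the denominator changes: odds divide by the number against, while probability divides by the total.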
Converting between Odds and Probability
The examples below show how to convert from odds to probability and from probability to odds.
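As a minimal sketch, the two conversions follow from the definitions above when the odds are in single-number form: probability = odds / (1 + odds), and odds = probability / (1 - probability). The function names below are hypothetical and chosen only for this illustration.

```python
def odds_to_probability(odds):
    """Convert single-number odds (for / against) to a probability."""
    if odds < 0:
        raise ValueError("odds must be non-negative")
    return odds / (1 + odds)


def probability_to_odds(probability):
    """Convert a probability to single-number odds (for / against)."""
    if not 0 <= probability < 1:
        # A probability of exactly 1 has no finite odds (nothing is against).
        raise ValueError("probability must be in the range [0, 1)")
    return probability / (1 - probability)


# Phone example: odds of 2:3 (about 0.6667) correspond to a probability of 0.40.
print(round(odds_to_probability(2 / 3), 4))   # 0.4
print(round(probability_to_odds(0.40), 4))    # 0.6667

# Even odds (1:1) correspond to a probability of 0.5.
print(odds_to_probability(1))                 # 0.5

# Odds of 6:1 correspond to a probability of 6/7.
print(round(odds_to_probability(6), 4))       # 0.8571
```

Results are rounded for display because floating-point division can leave small rounding errors in the last digits.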
The following video walks through the process of achieving probabilities through logistic regression.