Look, I don’t know who thought it would be a good idea to pick two nearly identical words for these two terms, but sensitivity and specificity are fundamental concepts for understanding classification, so we’ve got to get them down. This post will go over the definitions of these terms and walk through an example of each one.
Scenario
We are designing an algorithm that identifies spam emails. If our program identifies an email as spam, it is sent to the user’s spam box; if not, it is sent to the user’s inbox.
Under these rules, our program can produce four possible outcomes for each email…
1. A spam email is correctly sent to the spam box
2. A spam email is incorrectly let into the inbox
3. A non-spam email is correctly sent into the inbox
4. A non-spam email is incorrectly sent into the spam box
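To make the four outcomes concrete, here is a minimal Python sketch that labels each one with the standard confusion-matrix name. The function name `outcome` and its two boolean parameters are made up for this illustration, and it assumes that "positive" means "spam".

```python
def outcome(is_spam: bool, flagged_as_spam: bool) -> str:
    """Name the classification outcome for one email (positive = spam)."""
    if is_spam and flagged_as_spam:
        return "true positive"   # spam correctly sent to the spam box
    if is_spam and not flagged_as_spam:
        return "false negative"  # spam incorrectly let into the inbox
    if not is_spam and not flagged_as_spam:
        return "true negative"   # non-spam correctly sent to the inbox
    return "false positive"      # non-spam incorrectly sent to the spam box


# Print the name of every combination of actual class and prediction.
for is_spam in (True, False):
    for flagged in (True, False):
        print(f"is_spam={is_spam}, flagged_as_spam={flagged}: {outcome(is_spam, flagged)}")
```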
Defining Terms
Sensitivity
Sensitivity refers to the true positive rate, or the rate of correct positive identification. In other words, out of all the actual positives, what fraction did we correctly identify? In our case, since we are looking for spam emails, this is the fraction of spam emails that we correctly identified as spam. (Outcome 1)
Specificity
Specificity refers to the true negative rate, or the rate of correct negative identification. In other words, out of all the actual negatives, what fraction did we correctly identify? In our case, this is the fraction of non-spam emails that we correctly identified as non-spam. (Outcome 3)
Putting Numbers to the Example
OK, let’s pretend we ran the program on 200 emails. Of these, 100 are spam and 100 are non-spam. These are the results…
- 97 spam emails are correctly classified as spam
- 3 spam emails are incorrectly classified as non-spam
- 60 non-spam emails are correctly classified as non-spam
- 40 non-spam emails are incorrectly classified as spam
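Let’s visualize this in a table…

| | Classified as spam | Classified as non-spam |
| --- | --- | --- |
| Actual spam | 97 | 3 |
| Actual non-spam | 40 | 60 |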
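In practice, these four counts are tallied from paired actual and predicted labels. The sketch below rebuilds the 200 pairs from the results above and counts them; the variable names are illustrative, and `True` is assumed to mean "spam".

```python
from collections import Counter

# Rebuild the 200 (actual, predicted) pairs from the counts in the example.
# True means "spam"; the variable names are illustrative only.
pairs = (
    [(True, True)] * 97      # spam classified as spam
    + [(True, False)] * 3    # spam classified as non-spam
    + [(False, False)] * 60  # non-spam classified as non-spam
    + [(False, True)] * 40   # non-spam classified as spam
)

# Count how often each (actual, predicted) combination occurs.
counts = Counter(pairs)
tp = counts[(True, True)]    # true positives
fn = counts[(True, False)]   # false negatives
tn = counts[(False, False)]  # true negatives
fp = counts[(False, True)]   # false positives

print(f"TP={tp} FN={fn} TN={tn} FP={fp}")
```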
Sensitivity of Our Program
Sensitivity is calculated by dividing the number of true positives by the sum of true positives and false negatives. In our case, that is the number of correctly identified spam emails divided by the total number of actual spam emails: 97 / (97 + 3) = 97%.
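As a rough sketch, the same calculation in Python might look like this. The function name `sensitivity` and its parameter names are made up for this post, and the guard against an empty positive class is an added assumption.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        # With no actual positives there is nothing to divide by.
        raise ValueError("sensitivity is undefined when there are no actual positives")
    return true_positives / actual_positives


# 97 spam emails caught, 3 spam emails missed.
spam_sensitivity = sensitivity(true_positives=97, false_negatives=3)
print(f"Sensitivity: {spam_sensitivity:.0%}")
```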
Specificity of Our Program
Specificity is calculated by dividing the number of true negatives by the sum of true negatives and false positives. In our case, that is the number of correctly identified non-spam emails divided by the total number of actual non-spam emails: 60 / (60 + 40) = 60%.
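Here is the matching sketch for specificity. Note that the denominator uses false positives (non-spam sent to the spam box), not false negatives. As before, the function and parameter names are hypothetical.

```python
def specificity(true_negatives: int, false_positives: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    actual_negatives = true_negatives + false_positives
    if actual_negatives == 0:
        # With no actual negatives there is nothing to divide by.
        raise ValueError("specificity is undefined when there are no actual negatives")
    return true_negatives / actual_negatives


# 60 non-spam emails delivered to the inbox, 40 wrongly sent to the spam box.
spam_specificity = specificity(true_negatives=60, false_positives=40)
print(f"Specificity: {spam_specificity:.0%}")
```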
Conclusion
So why do we go through all this trouble to calculate sensitivity and specificity? The answer is that these measurements are critical in understanding the successes and shortcomings of classifiers. In the case of our program, it correctly identifies spam 97% of the time. This looks great, but if we look at the specificity, the program correctly identifies non-spam emails only 60% of the time. Imagine if four out of every ten non-spam emails you received went to your spam box. That would drive me insane. It is not until we check both the sensitivity and the specificity that we see the true scope of our classifier.
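One way to see why a single number is not enough: a made-up classifier that sends every email to the spam box would score perfect sensitivity while being useless. The sketch below compares that hypothetical "flag everything" classifier with our program on the same 200 emails; the helper `rates` is invented for this illustration.

```python
def rates(tp: int, fn: int, tn: int, fp: int) -> tuple[float, float]:
    """Return (sensitivity, specificity) from the four outcome counts."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical classifier that sends all 200 emails to the spam box.
flag_everything = rates(tp=100, fn=0, tn=0, fp=100)
# The classifier from the example in this post.
our_program = rates(tp=97, fn=3, tn=60, fp=40)

for name, (sens, spec) in [("flag everything", flag_everything), ("our program", our_program)]:
    print(f"{name}: sensitivity={sens:.0%}, specificity={spec:.0%}")
```

Judged on sensitivity alone, the "flag everything" classifier would beat ours, which is exactly why both numbers have to be read together.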
Taking this idea further, imagine if instead of an email classifier we were making a diagnostic test for cancer. How important are sensitivity and specificity? If your sensitivity is too low, you might tell someone who has a disease that they are healthy. If your specificity is too low, you might tell someone they have a disease that they don’t have. These are real questions that must be answered, but we’ll save the discussion of medical ethics for another time. 😉
Thanks for the read and please let me know your thoughts on this topic.