Beyond Precision and Recall: A Deep Dive into the Tversky Index

Exploring another classification metric

Photo by Ricardo Arce on Unsplash

In the world of data science, metrics are the compass that guides our models to success. While many are familiar with the classic measures of precision and recall, there is a wide selection of other options worth exploring.

In this article, we'll dive into the Tversky index. This metric, a generalization of the Dice and Jaccard coefficients, can be extremely useful when trying to balance precision and recall against each other. When implemented as a loss function for neural networks, it can be a powerful way to deal with class imbalance.
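As a preview of where we're headed, here is a minimal Python sketch of the index, assuming the standard formulation TP / (TP + α·FP + β·FN); the function name and default weights are illustrative, not a fixed API.

```python
def tversky_index(tp, fp, fn, alpha=0.5, beta=0.5):
    """Tversky index: TP / (TP + alpha * FP + beta * FN).

    alpha weights false positives; beta weights false negatives.
    alpha = beta = 0.5 recovers the Dice coefficient (equivalent to F1);
    alpha = beta = 1.0 recovers the Jaccard index.
    """
    denominator = tp + alpha * fp + beta * fn
    return tp / denominator if denominator > 0 else 0.0
```

Raising alpha penalizes false positives more heavily (favoring precision), while raising beta penalizes false negatives (favoring recall).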

A quick refresher on precision and recall

Imagine you are a detective tasked with capturing criminals in your town. In total, there are 10 criminals roaming the streets.

In your first month, you bring in 8 suspects you believe to be criminals. Only 4 of them turn out to be guilty, while the other 4 are innocent.

If you were a machine learning model, you'd be evaluated on your precision and recall.

Precision asks: “of all those you caught, how many were criminals?”

Recall asks: “of all the criminals in the town, how many did you catch?”

Precision is a metric that captures how accurate your positive predictions are, regardless of how many true positives you miss (false negatives). Recall measures how many of the true positives you capture, regardless of how many false positives you get.

How do your detective skills rate against these metrics?

  • precision = 4 / (4 + 4) = 0.5
  • recall = 4 / (4 + 6) = 0.4
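As a quick sanity check, here is the same arithmetic in Python; the helper below is illustrative, computing both metrics directly from the confusion-matrix counts.

```python
def precision_recall(tp, fp, fn):
    # Precision: fraction of positive predictions that were correct.
    precision = tp / (tp + fp)
    # Recall: fraction of actual positives that were caught.
    recall = tp / (tp + fn)
    return precision, recall

# The detective's first month: 4 guilty arrests, 4 innocent arrests,
# and 6 criminals still on the streets.
print(precision_recall(tp=4, fp=4, fn=6))  # (0.5, 0.4)
```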

Balancing precision and recall: the F1 metric

In an ideal world, your classifier has both high precision and high recall. As a measure of how well your classifier is doing on both, the F1 score takes the harmonic mean of the two:
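$$F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} = \frac{2\,TP}{2\,TP + FP + FN}$$

For our detective, F1 = 2 · 0.5 · 0.4 / (0.5 + 0.4) ≈ 0.44.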

This metric is also sometimes known as the Dice similarity coefficient (DSC).

Measuring similarity another way…
