Beyond ROC-AUC and KS: The Gini Coefficient, Explained Simply


We discussed classification metrics like ROC-AUC and the Kolmogorov-Smirnov (KS) Statistic in previous blogs.

In this blog, we are going to explore another important classification metric: the Gini Coefficient.


Why do we have multiple classification metrics?

Every classification metric describes model performance from a different angle. ROC-AUC gives us the overall ranking ability of a model, while the KS Statistic shows us where the maximum gap between the two groups occurs.

The Gini Coefficient tells us how much better our model is than random guessing at ranking the positives higher than the negatives.


First, let’s see how the Gini Coefficient is calculated.

For this, we again use the German Credit Dataset.

Let's use the same sample data that we used to understand the calculation of the Kolmogorov-Smirnov (KS) Statistic.

Image by Author

This sample data was obtained by applying logistic regression on the German Credit dataset.

Since the model outputs probabilities, we selected a sample of 10 points from those probabilities to demonstrate the calculation of the Gini Coefficient.

Calculation

Step 1: Sort the information by predicted probabilities.

The sample data is already sorted in descending order by predicted probability.
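If you want to reproduce this step in Python, here is a minimal sketch that builds the same 10-point sample (the column names Actual and Pred_Prob_Class2 match the ones used later in this blog, and class 2 is treated as the positive class) and sorts it:

import pandas as pd

# The 10-point sample used throughout this blog
# Actual: 2 = positive class, 1 = negative class
sample = pd.DataFrame({
    "Actual": [2, 2, 2, 1, 2, 1, 1, 1, 1, 1],
    "Pred_Prob_Class2": [0.92, 0.63, 0.51, 0.39, 0.29, 0.20, 0.13, 0.10, 0.05, 0.01]
})

# Step 1: sort by predicted probability, highest first
sample = sample.sort_values("Pred_Prob_Class2", ascending=False).reset_index(drop=True)
print(sample)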

Step 2: Compute Cumulative Population and Cumulative Positives.

Cumulative Population: The cumulative number of records considered up to that row.

Cumulative Population (%): The proportion of the total population covered so far.

Cumulative Positives: How many actual positives (class 2) we have seen so far.

Cumulative Positives (%): The proportion of positives captured so far.

Image by Author
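Continuing the sketch above, the cumulative columns can be computed with a few pandas operations (the column names below are only illustrative):

n_total = len(sample)                      # 10 records
n_pos = (sample["Actual"] == 2).sum()      # 4 positives

# Cumulative counts and percentages as we move down the sorted list
sample["Cum_Population"] = range(1, n_total + 1)
sample["Cum_Population_%"] = sample["Cum_Population"] / n_total
sample["Cum_Positives"] = (sample["Actual"] == 2).cumsum()
sample["Cum_Positives_%"] = sample["Cum_Positives"] / n_pos

print(sample[["Cum_Population_%", "Cum_Positives_%"]])

Note that the plot in the next step also prepends the origin point (0, 0) to these two columns.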

Step 3: Plot X and Y values

X = Cumulative Population (%)

Y = Cumulative Positives (%)

Here, let’s use Python to plot these X and Y values.

Code:

import matplotlib.pyplot as plt

X = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Y = [0.0, 0.25, 0.50, 0.75, 0.75, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]

# Plot curve
plt.figure(figsize=(6,6))
plt.plot(X, Y, marker='o', color="cornflowerblue", label="Model Lorenz Curve")
plt.plot([0,1], [0,1], linestyle="--", color="gray", label="Random Model (Diagonal)")
plt.title("Lorenz Curve from Sample Data", fontsize=14)
plt.xlabel("Cumulative Population % (X)", fontsize=12)
plt.ylabel("Cumulative Positives % (Y)", fontsize=12)
plt.legend()
plt.grid(True)
plt.show()

Plot:

Image by Author

The curve we get when we plot Cumulative Population (%) against Cumulative Positives (%) is called the Lorenz curve.

Step 4: Calculate the area under the Lorenz curve.

When we discussed ROC-AUC, we found the area under the curve using the trapezoid formula.

Each region between two points was treated as a trapezoid, its area was calculated, and then all areas were added together to get the final value.

The same method is applied here to calculate the area under the Lorenz curve.

Area under the Lorenz curve

Area of Trapezoid:

$$
\text{Area} = \frac{1}{2} \times (y_1 + y_2) \times (x_2 - x_1)
$$

From (0.0, 0.0) to (0.1, 0.25):
\[
A_1 = \frac{1}{2}(0+0.25)(0.1-0.0) = 0.0125
\]

From (0.1, 0.25) to (0.2, 0.50):
\[
A_2 = \frac{1}{2}(0.25+0.50)(0.2-0.1) = 0.0375
\]

From (0.2, 0.50) to (0.3, 0.75):
\[
A_3 = \frac{1}{2}(0.50+0.75)(0.3-0.2) = 0.0625
\]

From (0.3, 0.75) to (0.4, 0.75):
\[
A_4 = \frac{1}{2}(0.75+0.75)(0.4-0.3) = 0.075
\]

From (0.4, 0.75) to (0.5, 1.00):
\[
A_5 = \frac{1}{2}(0.75+1.00)(0.5-0.4) = 0.0875
\]

From (0.5, 1.00) to (0.6, 1.00):
\[
A_6 = \frac{1}{2}(1.00+1.00)(0.6-0.5) = 0.100
\]

From (0.6, 1.00) to (0.7, 1.00):
\[
A_7 = \frac{1}{2}(1.00+1.00)(0.7-0.6) = 0.100
\]

From (0.7, 1.00) to (0.8, 1.00):
\[
A_8 = \frac{1}{2}(1.00+1.00)(0.8-0.7) = 0.100
\]

From (0.8, 1.00) to (0.9, 1.00):
\[
A_9 = \frac{1}{2}(1.00+1.00)(0.9-0.8) = 0.100
\]

From (0.9, 1.00) to (1.0, 1.00):
\[
A_{10} = \frac{1}{2}(1.00+1.00)(1.0-0.9) = 0.100
\]

Total Area Under Lorenz Curve:
\[
A = 0.0125+0.0375+0.0625+0.075+0.0875+0.100+0.100+0.100+0.100+0.100 = 0.775
\]

We calculated the area under the Lorenz curve, which is 0.775.
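Instead of adding the ten trapezoids by hand, the same area can be computed numerically with NumPy's trapezoidal rule. A small sketch, reusing the X and Y lists from the plotting code above:

import numpy as np

X = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
Y = [0.0, 0.25, 0.50, 0.75, 0.75, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]

# Trapezoidal rule: sums 1/2 * (y1 + y2) * (x2 - x1) over consecutive points
area_model = np.trapz(Y, X)
print(area_model)  # 0.775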

Here, we plotted Cumulative Population (%) against Cumulative Positives (%), and we can observe that the area under this curve shows how quickly the positives (class 2) are captured as we move down the sorted list.

In our sample dataset, we have 4 positives (class 2) and 6 negatives (class 1).

For a perfect model, by the time we reach 40% of the population, it has captured 100% of the positives.

The curve looks like this for a perfect model.

Image by Author

Area under the Lorenz curve for the perfect model:

\[
\begin{aligned}
\text{Perfect Area} &= \text{Triangle (0,0 to 0.4,1)} + \text{Rectangle (0.4,1 to 1,1)} \\[6pt]
&= \frac{1}{2} \times 0.4 \times 1 \;+\; 0.6 \times 1 \\[6pt]
&= 0.2 + 0.6 \\[6pt]
&= 0.8
\end{aligned}
\]

We also have another way to calculate the area under the curve for the perfect model.

\[
\text{Let } \pi \text{ be the proportion of positives in the dataset.}
\]

\[
\text{Perfect Area} = \frac{1}{2}\pi \cdot 1 + (1-\pi)\cdot 1
\]
\[
= \frac{\pi}{2} + (1-\pi)
\]
\[
= 1 - \frac{\pi}{2}
\]

For our dataset:

Here, we have 4 positives out of 10 records, so π = 4/10 = 0.4.

\[
\text{Perfect Area} = 1 - \frac{0.4}{2} = 1 - 0.2 = 0.8
\]
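As a quick sanity check, both routes to the perfect area give the same number (a minimal sketch, with pi as the proportion of positives):

pi = 4 / 10  # proportion of positives in the sample

# Geometric route: triangle up to (0.4, 1) plus rectangle from (0.4, 1) to (1, 1)
perfect_area_geometric = 0.5 * 0.4 * 1 + 0.6 * 1

# Formula route: 1 - pi/2
perfect_area_formula = 1 - pi / 2

print(perfect_area_geometric, perfect_area_formula)  # 0.8 0.8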

We calculated the area under the Lorenz curve for our sample dataset and also for the perfect model with the same number of positives and negatives.

Now, if we go through the dataset without sorting, the positives are spread evenly. This means the rate at which we collect positives is the same as the rate at which we move through the population.

This is the random model, and it always gives an area under the curve of 0.5.

Image by Author

Step 5: Calculate the Gini Coefficient

\[
A_{\text{model}} = 0.775
\]

\[
A_{\text{random}} = 0.5
\]
\[
A_{\text{perfect}} = 0.8
\]
\[
\text{Gini} = \frac{A_{\text{model}} - A_{\text{random}}}{A_{\text{perfect}} - A_{\text{random}}}
\]
\[
= \frac{0.775 - 0.5}{0.8 - 0.5}
\]
\[
= \frac{0.275}{0.3}
\]
\[
\approx 0.92
\]

We got Gini = 0.92, which means almost all the positives are concentrated at the top of the sorted list. This shows that the model does an excellent job of separating positives from negatives, coming close to perfect.
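In code, the final step is just the ratio of the model's lift over random to the maximum possible lift over random. A short sketch using the areas computed above:

area_model = 0.775
area_random = 0.5
area_perfect = 0.8

# Gini = (model - random) / (perfect - random)
gini = (area_model - area_random) / (area_perfect - area_random)
print(round(gini, 2))  # 0.92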


Now that we have seen how the Gini Coefficient is calculated, let's look at what we actually did during the calculation.

We considered a sample of 10 points consisting of output probabilities from logistic regression.

We sorted the probabilities in descending order.

Next, we calculated Cumulative Population (%) and Cumulative Positives (%) and then plotted them.

We got a curve called the Lorenz curve, and we calculated the area under it, which is 0.775.

Now, what does 0.775 actually tell us?

Our sample consists of 4 positives (class 2) and 6 negatives (class 1).

The output probabilities are for class 2, which means the higher the probability, the more likely the customer belongs to class 2.

In our sample data, all the positives are captured within the first 50% of the population, which means the positives are ranked near the top.

If the model were perfect, the positives would be captured within the first 4 rows, i.e., within the first 40% of the population, and the area under the curve for the perfect model is 0.8.

But we got an area of 0.775, which is quite close to perfect.

Here, we are trying to measure the efficiency of the model. If more positives are concentrated at the top, it means the model is good at separating positives and negatives.

Next, we calculated the Gini Coefficient, which is 0.92.

\[
\text{Gini} = \frac{A_{\text{model}} - A_{\text{random}}}{A_{\text{perfect}} - A_{\text{random}}}
\]

The numerator tells us how much better our model is than random guessing.

The denominator tells us the maximum possible improvement over random.

The ratio puts these two together, so the Gini Coefficient always falls between 0 (random) and 1 (perfect).

Gini is used to measure how close the model is to being perfect in separating positive and negative classes.

But a question may arise: why did we calculate Gini instead of stopping at 0.775?

0.775 is the area under the Lorenz curve for our model. It doesn't tell us how close the model is to being perfect without comparing it to 0.8, the area for the perfect model.

So, we calculate Gini to standardize it so that it falls between 0 and 1, which makes it easy to compare models.


Banks also use the Gini Coefficient to evaluate credit risk models alongside ROC-AUC and the KS Statistic. Together, these measures give a complete picture of model performance.


Now, let’s calculate ROC-AUC for our sample data.

import pandas as pd
from sklearn.metrics import roc_auc_score

# Sample data
data = {
    "Actual": [2, 2, 2, 1, 2, 1, 1, 1, 1, 1],
    "Pred_Prob_Class2": [0.92, 0.63, 0.51, 0.39, 0.29, 0.20, 0.13, 0.10, 0.05, 0.01]
}

df = pd.DataFrame(data)

# Convert Actual: class 2 -> 1 (positive), class 1 -> 0 (negative)
y_true = (df["Actual"] == 2).astype(int)
y_score = df["Pred_Prob_Class2"]

# Calculate ROC-AUC
roc_auc = roc_auc_score(y_true, y_score)
roc_auc

We got AUC = 0.9583

Now, Gini = (2 * AUC) - 1 = (2 * 0.9583) - 1 ≈ 0.92

This is the relation between Gini and ROC-AUC.
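In code, this is a one-line conversion on top of the sklearn AUC computed above (a sketch continuing from the previous snippet, where roc_auc holds 0.9583):

# Gini from ROC-AUC: rescale AUC from the random baseline of 0.5 up to 1
gini = 2 * roc_auc - 1
print(round(gini, 2))  # 0.92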


Now let's calculate the Gini Coefficient on the full German Credit dataset.

Code:

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Load dataset
file_path = "C:/german.data"
data = pd.read_csv(file_path, sep=" ", header=None)

# Rename columns
columns = [f"col_{i}" for i in range(1, 21)] + ["target"]
data.columns = columns

# Features and target
X = pd.get_dummies(data.drop(columns=["target"]), drop_first=True)
y = data["target"]

# Convert target to binary: 1 = class 2 (positive), 0 = class 1 (negative)
y = (y == 2).astype(int)

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y
)

# Train logistic regression
model = LogisticRegression(max_iter=10000)
model.fit(X_train, y_train)

# Predicted probabilities
y_pred_proba = model.predict_proba(X_test)[:, 1]

# Calculate ROC-AUC
auc = roc_auc_score(y_test, y_pred_proba)

# Calculate Gini
gini = 2 * auc - 1

auc, gini

We got Gini = 0.60

Interpretation:

Gini > 0.5: acceptable.

Gini = 0.6–0.7: good model.

Gini = 0.8+: excellent, rarely achieved.


Dataset

The dataset used in this blog is the German Credit dataset, which is publicly available on the UCI Machine Learning Repository. It is provided under the Creative Commons Attribution 4.0 International (CC BY 4.0) License, which means it can be freely used and shared with proper attribution.


I hope you found this blog useful.

If you enjoyed reading, consider sharing it with your network, and feel free to share your thoughts.

If you haven't read my earlier blogs on ROC-AUC and the Kolmogorov-Smirnov Statistic, you can check them out here.

Thanks for reading!
