Detecting data drift to monitor ML models in production (using the Evidently library in Python)

What is data drift and why should we detect it?
Tips on how to detect data drift
Drift detection in Python using Evidently
How Evidently works
Using Evidently to detect data drift (in Python)
Customizations in Evidently


Data drift resulting in mispredictions in production
Evidently’s default logic to detect data drift (Image by author)
How drift is detected using each test in Evidently (Image by author)
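To make the idea behind these tests concrete, here is a minimal sketch of one of them, the two-sample Kolmogorov-Smirnov test, written in plain Python. It is not Evidently's code. The function names, the synthetic data and the 0.05 significance level are assumptions made for illustration, and the p-value uses the standard large-sample approximation.

```python
import math
import random


def ks_statistic(reference, current):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    n, m = len(ref), len(cur)
    i = j = 0
    gap = 0.0
    while i < n and j < m:
        x = min(ref[i], cur[j])
        # Step past every copy of x in both samples so ties are handled together.
        while i < n and ref[i] == x:
            i += 1
        while j < m and cur[j] == x:
            j += 1
        gap = max(gap, abs(i / n - j / m))
    return gap


def ks_pvalue(statistic, n, m, terms=100):
    """Asymptotic p-value from the Kolmogorov distribution (large-sample approximation)."""
    lam = math.sqrt(n * m / (n + m)) * statistic
    if lam < 0.2:
        # The series converges slowly here, and the p-value is 1 to many decimal places.
        return 1.0
    series = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * series))


# Synthetic data (hypothetical): the current sample has a deliberately shifted mean.
random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(200)]
current = [random.gauss(1.5, 1.0) for _ in range(200)]

statistic = ks_statistic(reference, current)
p_value = ks_pvalue(statistic, len(reference), len(current))
drift_detected = p_value < 0.05  # assumed significance level
print(f"KS statistic={statistic:.3f}, p-value={p_value:.3g}, drift={drift_detected}")
```

A small p-value means the two samples are unlikely to come from the same distribution, which is what gets reported as drift for that feature.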
!pip install evidently
from sklearn import datasets
import pandas as pd

from evidently.dashboard import Dashboard
from evidently.dashboard.tabs import DataDriftTab
from evidently.options import DataDriftOptions

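# Load the iris dataset into a DataFrame and attach the target column
iris = datasets.load_iris()
iris_frame = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_frame['target'] = iris.target

# Compare the first 100 rows (reference) against the remaining rows (current)
data_drift_dashboard = Dashboard(tabs=[DataDriftTab()])
data_drift_dashboard.calculate(iris_frame[:100], iris_frame[100:])
data_drift_dashboard.show(mode='inline')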
Data drift dashboard
Expanding the data drift dashboard to visualize the drift of an individual feature (shows that the sepal width feature has not drifted)
Expanding the data drift dashboard to visualize the drift of an individual feature (shows how the petal width feature in the current dataset has drifted from the reference dataset)
from evidently.options import DataDriftOptions

# Use the Jensen-Shannon distance for all features, with a drift threshold of 0.6
options = DataDriftOptions(all_features_stattest="jensenshannon", threshold=0.6)
data_drift_dashboard = Dashboard(tabs=[DataDriftTab()], options=[options])
data_drift_dashboard.calculate(iris_frame[:100], iris_frame[100:])
data_drift_dashboard.show(mode='inline')

Customized data drift detection dashboard
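For intuition about what the "jensenshannon" option measures, the following is a minimal sketch of the Jensen-Shannon distance between two categorical samples, in plain Python. It is an illustration under stated assumptions, not Evidently's implementation. The helper names and the toy data are hypothetical, and base-2 logarithms are assumed so the distance falls between 0 and 1.

```python
import math
from collections import Counter


def category_shares(values, categories):
    """Share of each category in the sample, in a fixed category order."""
    counts = Counter(values)
    return [counts[c] / len(values) for c in categories]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p == 0 contribute nothing."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the JS divergence (base 2)."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    divergence = (kl_divergence(p, mixture) + kl_divergence(q, mixture)) / 2
    return math.sqrt(max(0.0, divergence))


# Toy data (hypothetical): the category mix changes between the two samples.
reference = ["a"] * 50 + ["b"] * 30 + ["c"] * 20
current = ["a"] * 20 + ["b"] * 30 + ["c"] * 50

categories = sorted(set(reference) | set(current))
distance = js_distance(
    category_shares(reference, categories),
    category_shares(current, categories),
)
drift_detected = distance > 0.6  # mirrors the threshold=0.6 used in the snippet above
print(f"JS distance={distance:.3f}, drift={drift_detected}")
```

Unlike a p-value, a distance flags drift when it is above the threshold, so raising the threshold makes detection less sensitive.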
# Use PSI (Population Stability Index) for numerical features only
options = DataDriftOptions(num_features_stattest="psi", threshold=0.25)
data_drift_dashboard = Dashboard(tabs=[DataDriftTab()], options=[options])
data_drift_dashboard.calculate(iris_frame[:100], iris_frame[100:])
data_drift_dashboard.show(mode='inline')
# Use PSI (Population Stability Index) for categorical features only
options = DataDriftOptions(cat_features_stattest="psi", threshold=0.25)
data_drift_dashboard = Dashboard(tabs=[DataDriftTab()], options=[options])
data_drift_dashboard.calculate(iris_frame[:100], iris_frame[100:])
data_drift_dashboard.show(mode='inline')
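The "psi" option refers to the Population Stability Index. Below is a minimal sketch of how PSI can be computed for a numerical feature in plain Python. The equal-width binning, the bin count, the small floor for empty bins and the synthetic data are all assumptions made for illustration, and Evidently may bin the data differently.

```python
import math
import random


def binned_shares(values, lo, hi, bins, floor=1e-4):
    """Share of values per equal-width bin, floored to avoid log(0) on empty bins."""
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 when all values are equal
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)
        counts[index] += 1
    return [max(c / len(values), floor) for c in counts]


def psi(reference, current, bins=10):
    """Population Stability Index: sum of (cur - ref) * ln(cur / ref) over bins."""
    lo = min(min(reference), min(current))
    hi = max(max(reference), max(current))
    ref_shares = binned_shares(reference, lo, hi, bins)
    cur_shares = binned_shares(current, lo, hi, bins)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_shares, cur_shares))


# Synthetic data (hypothetical): the current sample has a deliberately shifted mean.
random.seed(0)
reference = [random.gauss(0.0, 1.0) for _ in range(500)]
current = [random.gauss(2.0, 1.0) for _ in range(500)]

score = psi(reference, current)
drift_detected = score > 0.25  # mirrors the threshold=0.25 used in the snippets above
print(f"PSI={score:.3f}, drift={drift_detected}")
```

Every term in the sum is non-negative, so PSI is zero only when the binned shares match exactly and grows as the two distributions move apart.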
