It’s well known that what we eat matters, but what if when and how much we eat matters just as much?
In the midst of ongoing scientific debate around the benefits of intermittent fasting, this question becomes even more intriguing. As someone enthusiastic about machine learning and healthy living, I was inspired by a 2017 research paper[1] exploring this intersection. The authors introduced a novel distance metric called Modified Dynamic Time Warping (MDTW), a method designed to account not just for the nutritional content of meals but also for their timing throughout the day.
Motivated by their work[1], I built a full implementation of MDTW from scratch using Python. I applied it to cluster simulated individuals into temporal dietary patterns, uncovering distinct behaviors like skippers, snackers, and night eaters.
While MDTW may sound like a niche metric, it fills a critical gap in time-series comparison. Traditional distance measures, such as Euclidean distance and even classical Dynamic Time Warping (DTW), struggle when applied to dietary data. People don’t eat at fixed times or with consistent frequency. They skip meals, snack irregularly, or eat late at night.
MDTW is designed for exactly this kind of temporal misalignment and behavioral variability. By allowing flexible alignment while penalizing mismatches in both nutrient content and meal timing, MDTW reveals subtle but meaningful differences in how people eat.
What this article covers:
- Mathematical foundation of MDTW — explained intuitively.
- From formula to code — implementing MDTW in Python with dynamic programming.
- Generating synthetic dietary data to simulate real-world eating behavior.
- Constructing a distance matrix between individual eating records.
- Clustering individuals with K-Medoids and evaluating with silhouette and elbow methods.
- Visualizing clusters as scatter plots and joint distributions.
- Interpreting temporal patterns from clusters: who eats when, and how much?
Quick Note on Classical Dynamic Time Warping (DTW)
Dynamic Time Warping (DTW) is a classic algorithm for measuring similarity between two sequences that may vary in length or timing. It’s widely used in speech recognition, gesture analysis, and time series alignment. Let’s look at a very simple example in which Sequence A is aligned to Sequence B (a shifted version of A) with the classical algorithm, using the fastdtw library. As input, we pass the two time series and a distance function (Euclidean here); the call returns the distance between the series and the optimized alignment path.
import numpy as np
import matplotlib.pyplot as plt
from fastdtw import fastdtw
from scipy.spatial.distance import euclidean
# Sample sequences (scalar values)
x = np.linspace(0, 3 * np.pi, 30)
y1 = np.sin(x)
y2 = np.sin(x+0.5) # Shifted version
# Convert scalars to vectors (1D)
y1_vectors = [[v] for v in y1]
y2_vectors = [[v] for v in y2]
# Euclidean distance on the 1-element vectors
distance, path = fastdtw(y1_vectors, y2_vectors, dist=euclidean)
# Equivalent call on the raw scalars, using the absolute difference:
# distance, path = fastdtw(y1, y2, dist=lambda a, b: np.abs(a - b))
# Plot the alignment
plt.figure(figsize=(10, 4))
plt.plot(y1, label='Sequence A')
plt.plot(y2, label='Sequence B (shifted)')
# Draw alignment lines
for (i, j) in path:
    plt.plot([i, j], [y1[i], y2[j]], color='gray', linewidth=0.5)
plt.title(f'Dynamic Time Warping Alignment (Distance = {distance:.2f})')
plt.xlabel('Time Index')
plt.legend()
plt.tight_layout()
plt.savefig('dtw_alignment.png')
plt.show()
The path returned by fastdtw (or any DTW algorithm) is a sequence of index pairs (i, j) that represent the optimal alignment between two time series. Each pair indicates that element A[i] is matched with B[j]. By summing the distances between all these matched pairs, the algorithm computes the optimized cumulative cost: the minimum total distance required to warp one sequence onto the other (fastdtw returns an approximation of this minimum).
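To make the idea of a cumulative cost concrete, here is a minimal from-scratch sketch of classical DTW using dynamic programming. It is an illustration only, not what fastdtw does internally; the function name `dtw` and the two toy sequences are my own choices for this example.

```python
import numpy as np

def dtw(a, b):
    """Classical DTW between two 1-D sequences, using absolute difference."""
    n, m = len(a), len(b)
    # cost[i, j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])

    # Backtrack from the end to recover the alignment path
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1, j - 1], i - 1, j - 1),  # diagonal move
            (cost[i - 1, j], i - 1, j),          # advance in a only
            (cost[i, j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return float(cost[n, m]), path

a = [0, 1, 2, 1, 0]
b = [0, 0, 1, 2, 1, 0]  # same shape as a, with the first value repeated
distance, path = dtw(a, b)
print(distance, path)
```

Summing `abs(a[i] - b[j])` over the returned path reproduces the returned distance, which is the sense in which the path "explains" the cumulative cost.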

Modified Dynamic Time Warping
The key challenge in applying dynamic time warping (DTW) to dietary data (as opposed to simple examples like sine waves or fixed-length sequences) lies in the complexity and variability of real-world eating behaviors. The challenges, and the solution the paper[1] proposes in response to each, are as follows:
- Irregular Time Steps: MDTW accounts for this by explicitly incorporating the time difference in the distance function.
- Multidimensional Nutrients: MDTW supports multidimensional vectors to represent nutrients such as calories, fat, etc., and uses a weight matrix to handle differing units and the relative importance of nutrients.
- Unequal Number of Meals: MDTW allows matching with empty eating events, penalizing skipped or unmatched meals appropriately.
- Time Sensitivity: MDTW includes a time difference penalty, so eating events that are far apart in time are treated as dissimilar even when their nutrients are similar.
Eating Occasion Data Representation
According to the modified dynamic time warping proposed in the paper[1], each person’s diet can be viewed as a sequence of eating events, where each event has a time of day and a vector of nutrient values.
To illustrate how eating records appear in real data, I created three synthetic dietary profiles that consider only calorie consumption: Skipper, Night Eater, and Snacker. Let’s assume we ingest the raw data from an API in this format:
skipper = {
    'person_id': 'skipper_1',
    'records': [
        {'time': 12, 'nutrients': [300]},  # Skipped breakfast, large lunch
        {'time': 19, 'nutrients': [600]},  # Large dinner
    ]
}
night_eater = {
    'person_id': 'night_eater_1',
    'records': [
        {'time': 9, 'nutrients': [150]},   # Light breakfast
        {'time': 14, 'nutrients': [250]},  # Small lunch
        {'time': 22, 'nutrients': [700]},  # Large late dinner
    ]
}
snacker = {
    'person_id': 'snacker_1',
    'records': [
        {'time': 8, 'nutrients': [100]},   # Light morning snack
        {'time': 11, 'nutrients': [150]},  # Late morning snack
        {'time': 14, 'nutrients': [200]},  # Afternoon snack
        {'time': 17, 'nutrients': [100]},  # Early evening snack
        {'time': 21, 'nutrients': [200]},  # Night snack
    ]
}
raw_data = [skipper, night_eater, snacker]
As suggested in the paper, the nutrient values should be normalized by the total daily calorie consumption.
import numpy as np
import matplotlib.pyplot as plt

def create_time_series_plot(data, save_path=None):
    plt.figure(figsize=(10, 5))
    for person, record in data.items():
        # Average across nutrients in case the nutrient vector has several dimensions
        points = [[time, float(np.mean(np.array(value)))] for time, value in record.items()]
        times = [item[0] for item in points]
        nutrient_values = [item[1] for item in points]
        # Plot the time series
        plt.plot(times, nutrient_values, label=person, marker='o')
    plt.title('Time Series Plot for Nutrient Data')
    plt.xlabel('Time')
    plt.ylabel('Normalized Nutrient Value')
    plt.legend()
    plt.grid(True)
    if save_path:
        plt.savefig(save_path)

def prepare_person(person):
    # Check that all nutrient vectors have the same length
    nutrients_lengths = [len(record['nutrients']) for record in person["records"]]
    if len(set(nutrients_lengths)) != 1:
        raise ValueError(f"Inconsistent nutrient vector lengths for person {person['person_id']}.")
    sorted_records = sorted(person["records"], key=lambda x: x['time'])
    nutrients = np.stack([np.array(record['nutrients']) for record in sorted_records])
    total_nutrients = np.sum(nutrients, axis=0)
    # Check to avoid division by zero
    if np.any(total_nutrients == 0):
        raise ValueError(f"Zero total nutrients for person {person['person_id']}.")
    normalized_nutrients = nutrients / total_nutrients
    # Return a dictionary {time: [normalized nutrients]}
    person_dict = {
        record['time']: normalized_nutrients[i].tolist()
        for i, record in enumerate(sorted_records)
    }
    return person_dict

prepared_data = {person['person_id']: prepare_person(person) for person in raw_data}
create_time_series_plot(prepared_data)
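As a small worked example of the normalization step, the sketch below divides each nutrient column by its own daily total, which is what the `axis=0` sum in `prepare_person` amounts to. The calorie column is the skipper profile; the second column (fat in grams) is made up purely to show the multi-nutrient case.

```python
import numpy as np

# Rows are eating occasions, columns are nutrients: [calories, fat in grams].
# The calorie values come from the skipper profile; the fat values are invented.
nutrients = np.array([
    [300.0, 10.0],
    [600.0, 30.0],
])

daily_totals = nutrients.sum(axis=0)   # one total per nutrient column
fractions = nutrients / daily_totals   # broadcasting divides column-wise

print(fractions)              # each column now sums to 1
print(fractions.sum(axis=0))
```

Because every value becomes a share of the daily total, all entries fall in [0, 1], which is the range the distance function later checks for.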

Calculating the Distance Between Pairs
The distance between a pair of eating occasions is defined by a formula with two terms. The first term is a squared, weighted Euclidean distance between the nutrient vectors, whereas the second adds a penalty for the time difference.
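Written out in the notation of the implementation that follows, the local distance can be expressed as below. The symbols are my reading of the code rather than a quotation from the paper: t_i and v_i are the time and normalized nutrient vector of event i, W is the nutrient weight matrix, and delta, beta, alpha are the function's parameters.

```latex
d_{eo}(i, j) = (v_i - v_j)^{\top} W \,(v_i - v_j)
             + 2\beta \,\bigl(v_i^{\top} W v_j\bigr)
               \left(\frac{\lvert t_i - t_j \rvert}{\delta}\right)^{\alpha}
```

With identical nutrient vectors the first term vanishes and only the time penalty remains; with identical times the second term vanishes and only the nutrient difference counts.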

This formula is implemented in the local_distance function with the suggested parameter values:
import numpy as np

def local_distance(eo_i, eo_j, delta=23, beta=1, alpha=2):
    """
    Calculate the local distance between two events.
    Args:
        eo_i (tuple): Event i (time, nutrients).
        eo_j (tuple): Event j (time, nutrients).
        delta (float): Time scaling factor.
        beta (float): Weighting factor for time difference.
        alpha (float): Exponent for time difference scaling.
    Returns:
        float: Local distance.
    """
    ti, vi = eo_i
    tj, vj = eo_j
    vi = np.array(vi)
    vj = np.array(vj)
    if vi.shape != vj.shape:
        raise ValueError("Mismatch in feature dimensions.")
    if np.any(vi < 0) or np.any(vj < 0):
        raise ValueError("Nutrient values must be non-negative.")
    if np.any(vi > 1) or np.any(vj > 1):
        raise ValueError("Nutrient values should be within the range [0, 1].")
    W = np.eye(len(vi))  # Assume W = identity for now
    value_diff = (vi - vj).T @ W @ (vi - vj)
    time_diff = (np.abs(ti - tj) / delta) ** alpha
    scale = 2 * beta * (vi.T @ W @ vj)
    distance = value_diff + scale * time_diff
    return distance
We construct a local distance matrix deo(i, j) for every pair of individuals being compared. The number of rows and columns in this matrix corresponds to the number of eating occasions of each individual.
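As a rough illustration of this matrix, the sketch below restates the local distance compactly (assuming W is the identity) and fills the matrix for the skipper and night eater profiles. The variable names are illustrative, and this version leaves out the extra row and column that the full implementation adds for matches with an empty event.

```python
import numpy as np

def local_distance(eo_i, eo_j, delta=23, beta=1, alpha=2):
    """Compact restatement of the local distance with W = identity."""
    (ti, vi), (tj, vj) = eo_i, eo_j
    vi, vj = np.asarray(vi, dtype=float), np.asarray(vj, dtype=float)
    value_diff = np.dot(vi - vj, vi - vj)
    time_penalty = 2 * beta * np.dot(vi, vj) * (abs(ti - tj) / delta) ** alpha
    return float(value_diff + time_penalty)

# (hour, [fraction of daily calories]) for two of the profiles
skipper = [(12, [300 / 900]), (19, [600 / 900])]
night_eater = [(9, [150 / 1100]), (14, [250 / 1100]), (22, [700 / 1100])]

# One row per skipper event, one column per night eater event
deo = np.array([[local_distance(a, b) for b in night_eater] for a in skipper])
print(deo.shape)  # (2, 3)
print(deo)
```

Two events with the same nutrient share but different hours still get a positive distance, because the time penalty term does not vanish.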
Once the local distance matrix deo(i, j) is constructed, capturing the pairwise distances between all eating occasions of two individuals, the next step is to compute the global cost matrix dER(i, j). This matrix accumulates the minimal alignment cost by considering three possible transitions at each step: matching two eating occasions, skipping an occasion in the first record (aligning it to an empty event), or skipping an occasion in the second record.
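In symbols, and following the article's implementation rather than quoting the paper, the recurrence can be written as below. Index 0 stands for the empty event, so d_eo(i, 0) is the cost of leaving event i of the first record unmatched (the code uses the squared norm of its nutrient vector, i.e. W = identity).

```latex
\begin{aligned}
d_{ER}(0, 0) &= 0 \\
d_{ER}(i, 0) &= d_{ER}(i-1, 0) + d_{eo}(i, 0) \\
d_{ER}(0, j) &= d_{ER}(0, j-1) + d_{eo}(0, j) \\
d_{ER}(i, j) &= \min
\begin{cases}
d_{ER}(i-1, j-1) + d_{eo}(i, j) & \text{match } i \text{ with } j \\
d_{ER}(i-1, j) + d_{eo}(i, 0)   & \text{match } i \text{ with empty} \\
d_{ER}(i, j-1) + d_{eo}(0, j)   & \text{match } j \text{ with empty}
\end{cases}
\end{aligned}
```

The MDTW distance between the two records is the bottom-right entry, d_ER(m1, m2), where m1 and m2 are the numbers of eating occasions in each record.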

To compute the overall distance between two sequences of eating occasions, we construct:
- A local distance matrix deo, filled using local_distance.
- A global cost matrix dER, built with dynamic programming by minimizing over three options:
  - Match
  - Skip in the first sequence (align to empty)
  - Skip in the second sequence
The code below directly implements this recurrence:
import numpy as np

def mdtw_distance(ER1, ER2, delta=23, beta=1, alpha=2):
    """
    Calculate the modified DTW distance between two sequences of events.
    Args:
        ER1 (list): First sequence of events (time, nutrients).
        ER2 (list): Second sequence of events (time, nutrients).
        delta (float): Time scaling factor.
        beta (float): Weighting factor for time difference.
        alpha (float): Exponent for time difference scaling.
    Returns:
        float: Modified DTW distance.
    """
    m1 = len(ER1)
    m2 = len(ER2)
    # Local distance matrix; row 0 and column 0 hold matches with the empty event
    deo = np.zeros((m1 + 1, m2 + 1))
    for i in range(m1 + 1):
        for j in range(m2 + 1):
            if i == 0 and j == 0:
                deo[i, j] = 0
            elif i == 0:
                tj, vj = ER2[j-1]
                deo[i, j] = np.dot(vj, vj)
            elif j == 0:
                ti, vi = ER1[i-1]
                deo[i, j] = np.dot(vi, vi)
            else:
                deo[i, j] = local_distance(ER1[i-1], ER2[j-1], delta, beta, alpha)
    # Global cost matrix
    dER = np.zeros((m1 + 1, m2 + 1))
    dER[0, 0] = 0
    for i in range(1, m1 + 1):
        dER[i, 0] = dER[i-1, 0] + deo[i, 0]
    for j in range(1, m2 + 1):
        dER[0, j] = dER[0, j-1] + deo[0, j]
    for i in range(1, m1 + 1):
        for j in range(1, m2 + 1):
            dER[i, j] = min(
                dER[i-1, j-1] + deo[i, j],  # Match i and j
                dER[i-1, j] + deo[i, 0],    # Match i to empty
                dER[i, j-1] + deo[0, j]     # Match j to empty
            )
    return dER[m1, m2]  # Return the final cost

ERA = list(prepared_data['skipper_1'].items())
ERB = list(prepared_data['night_eater_1'].items())
distance = mdtw_distance(ERA, ERB)
print(f"Distance between skipper_1 and night_eater_1: {distance}")
From Pairwise Comparisons to a Distance Matrix
Once we have defined how to calculate the distance between two individuals’ eating patterns using MDTW, the next natural step is to compute distances across the entire dataset. To do that, we construct a distance matrix where each entry (i, j) represents the MDTW distance between person i and person j.
This is implemented in the function below:
import numpy as np
import matplotlib.pyplot as plt

def calculate_distance_matrix(prepared_data):
    """
    Calculate the distance matrix for the prepared data.
    Args:
        prepared_data (dict): Dictionary containing prepared data for each person.
    Returns:
        np.ndarray: Distance matrix.
    """
    n = len(prepared_data)
    distance_matrix = np.zeros((n, n))
    # Compute pairwise distances
    for i, (id1, records1) in enumerate(prepared_data.items()):
        for j, (id2, records2) in enumerate(prepared_data.items()):
            if i < j:  # Only upper triangle
                print(f"Calculating distance between {id1} and {id2}")
                ER1 = list(records1.items())
                ER2 = list(records2.items())
                distance_matrix[i, j] = mdtw_distance(ER1, ER2)
                distance_matrix[j, i] = distance_matrix[i, j]  # Symmetric matrix
    return distance_matrix

def plot_heatmap(matrix, people_ids, save_path=None):
    """
    Plot a heatmap of the distance matrix.
    Args:
        matrix (np.ndarray): The distance matrix.
        people_ids (list): Labels for the rows and columns.
        save_path (str): Path to save the plot. If None, the plot won't be saved.
    """
    plt.figure(figsize=(8, 6))
    plt.imshow(matrix, cmap='hot', interpolation='nearest')
    plt.colorbar()
    plt.xticks(ticks=range(len(matrix)), labels=people_ids)
    plt.yticks(ticks=range(len(matrix)), labels=people_ids)
    plt.xticks(rotation=45)
    plt.yticks(rotation=45)
    # Set the title before saving so it appears in the saved file
    plt.title('Distance Matrix Heatmap')
    if save_path:
        plt.savefig(save_path)

distance_matrix = calculate_distance_matrix(prepared_data)
plot_heatmap(distance_matrix, list(prepared_data.keys()), save_path='distance_matrix.png')
After computing the pairwise Modified Dynamic Time Warping (MDTW) distances, we can visualize the similarities and differences between individuals’ dietary patterns using a heatmap. Each cell (i, j) in the matrix represents the MDTW distance between person i and person j; lower values indicate more similar temporal eating profiles.
This heatmap offers a compact and interpretable view of dietary dissimilarities, making it easier to identify clusters of similar eating behaviors.
The heatmap shows that skipper_1 shares more similarity with night_eater_1 than with snacker_1. The reason is that both the skipper and the night eater have fewer, larger meals concentrated later in the day, while the snacker distributes smaller meals more evenly across the entire timeline.
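This comparison can be checked with a compact, self-contained restatement of MDTW, taking W as the identity and delta=23, beta=1, alpha=2 as in the article's functions. It is a sketch for inspecting the three profiles, not a replacement for the implementation above; the helper names `empty_cost`, `mdtw`, and `normalize` are mine.

```python
import numpy as np

def local_distance(eo_i, eo_j, delta=23, beta=1, alpha=2):
    (ti, vi), (tj, vj) = eo_i, eo_j
    vi, vj = np.asarray(vi, dtype=float), np.asarray(vj, dtype=float)
    diff = vi - vj
    return float(diff @ diff + 2 * beta * (vi @ vj) * (abs(ti - tj) / delta) ** alpha)

def empty_cost(eo):
    """Cost of aligning an event with an empty event (W = identity)."""
    v = np.asarray(eo[1], dtype=float)
    return float(v @ v)

def mdtw(er1, er2):
    m1, m2 = len(er1), len(er2)
    d = np.zeros((m1 + 1, m2 + 1))
    for i in range(1, m1 + 1):
        d[i, 0] = d[i - 1, 0] + empty_cost(er1[i - 1])
    for j in range(1, m2 + 1):
        d[0, j] = d[0, j - 1] + empty_cost(er2[j - 1])
    for i in range(1, m1 + 1):
        for j in range(1, m2 + 1):
            d[i, j] = min(
                d[i - 1, j - 1] + local_distance(er1[i - 1], er2[j - 1]),  # match
                d[i - 1, j] + empty_cost(er1[i - 1]),                      # skip in er1
                d[i, j - 1] + empty_cost(er2[j - 1]),                      # skip in er2
            )
    return float(d[m1, m2])

def normalize(times, calories):
    total = float(sum(calories))
    return [(t, [c / total]) for t, c in zip(times, calories)]

profiles = {
    "skipper_1": normalize([12, 19], [300, 600]),
    "night_eater_1": normalize([9, 14, 22], [150, 250, 700]),
    "snacker_1": normalize([8, 11, 14, 17, 21], [100, 150, 200, 100, 200]),
}
ids = list(profiles)
D = np.array([[mdtw(profiles[a], profiles[b]) for b in ids] for a in ids])

masked = D.copy()
np.fill_diagonal(masked, np.inf)  # ignore self-distances
for person, j in zip(ids, masked.argmin(axis=1)):
    print(f"{person} is closest to {ids[j]}")
```

The matrix is symmetric with a zero diagonal, and the skipper's distance to the night eater comes out smaller than its distance to the snacker, since at least three of the snacker's five events must be matched with empty events.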

Clustering Temporal Dietary Patterns
After calculating the pairwise distances using Modified Dynamic Time Warping (MDTW), we’re left with a distance matrix that reflects how dissimilar each individual’s eating pattern is from the others. But this matrix alone doesn’t tell us much at a glance; to reveal structure in the data, we need to go one step further.
Before applying any clustering algorithm, we first need a dataset that reflects realistic dietary behaviors. Since access to large-scale dietary intake datasets can be limited or subject to usage restrictions, I generated synthetic eating event records that simulate diverse daily patterns. Each record represents an individual’s calorie intake at specific hours throughout a 24-hour period.
import numpy as np

def generate_synthetic_data(num_people=5, min_meals=1, max_meals=5, min_calories=200, max_calories=800):
    """
    Generate synthetic data for a given number of individuals.
    Args:
        num_people (int): Number of individuals to generate data for.
        min_meals (int): Minimum number of meals per person.
        max_meals (int): Maximum number of meals per person.
        min_calories (int): Minimum calories per meal.
        max_calories (int): Maximum calories per meal (exclusive).
    Returns:
        list: List of dictionaries containing synthetic data for each person.
    """
    data = []
    np.random.seed(42)  # For reproducibility
    for person_id in range(1, num_people + 1):
        # Random number of meals between min_meals and max_meals (inclusive)
        num_meals = np.random.randint(min_meals, max_meals + 1)
        # Distinct random hours of the day, sorted
        meal_times = np.sort(np.random.choice(range(24), num_meals, replace=False))
        # Random calories in [min_calories, max_calories)
        raw_calories = np.random.randint(min_calories, max_calories, size=num_meals)
        person_record = {
            'person_id': f'person_{person_id}',
            'records': [
                {'time': float(time), 'nutrients': [float(cal)]}
                for time, cal in zip(meal_times, raw_calories)
            ]
        }
        data.append(person_record)
    return data

raw_data = generate_synthetic_data(num_people=1000, min_meals=1, max_meals=5, min_calories=200, max_calories=800)
prepared_data = {person['person_id']: prepare_person(person) for person in raw_data}
distance_matrix = calculate_distance_matrix(prepared_data)
Selecting the Optimal Number of Clusters
To determine the appropriate number of clusters for grouping dietary patterns, I evaluated two popular methods: the Elbow Method and the Silhouette Score.
- The Elbow Method analyzes the clustering cost (inertia) as the number of clusters increases. As shown in the plot, the cost decreases sharply up to 4 clusters, after which the rate of improvement slows significantly. This “elbow” suggests diminishing returns beyond 4 clusters.
- The Silhouette Score, which measures how well each object lies within its cluster, showed a relatively high score at 4 clusters (≈0.50), even though it wasn’t the absolute peak.
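To show what the silhouette score measures when all we have is a precomputed distance matrix, here is a hand-rolled sketch in NumPy. It follows the standard definition, s = (b - a) / max(a, b) per point, averaged over points; the function name and the toy one-dimensional data are my own, and it assumes at least two clusters and no point with a = b = 0.

```python
import numpy as np

def silhouette_from_distances(D, labels):
    """Mean silhouette score from a precomputed distance matrix."""
    D = np.asarray(D, dtype=float)
    labels = np.asarray(labels)
    scores = []
    for i in range(len(labels)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of i's own cluster
        a = D[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(D[i, labels == c].mean() for c in np.unique(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))

# Toy data: four points on a line forming two tight pairs
x = np.array([0.0, 1.0, 10.0, 11.0])
D = np.abs(x[:, None] - x[None, :])

good = silhouette_from_distances(D, [0, 0, 1, 1])  # pairs kept together
bad = silhouette_from_distances(D, [0, 1, 0, 1])   # pairs split apart
print(good, bad)
```

A labeling that keeps close points together scores near 1, while one that splits them scores below 0, which is why a higher average silhouette is read as better-separated clusters.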

The following code computes the clustering cost and silhouette scores for different values of k (the number of clusters), using the K-Medoids algorithm and a precomputed distance matrix derived from the MDTW metric:
from sklearn.metrics import silhouette_score
from sklearn_extra.cluster import KMedoids
import matplotlib.pyplot as plt

costs = []
silhouette_scores = []
for k in range(2, 10):
    model = KMedoids(n_clusters=k, metric='precomputed', random_state=42)
    labels = model.fit_predict(distance_matrix)
    costs.append(model.inertia_)
    score = silhouette_score(distance_matrix, model.labels_, metric='precomputed')
    silhouette_scores.append(score)

# Plot
ks = list(range(2, 10))
fig, ax1 = plt.subplots(figsize=(8, 5))
color1 = 'tab:blue'
ax1.set_xlabel('Number of Clusters (k)')
ax1.set_ylabel('Cost (Inertia)', color=color1)
ax1.plot(ks, costs, marker='o', color=color1, label='Cost')
ax1.tick_params(axis='y', labelcolor=color1)
# Create a second y-axis that shares the same x-axis
ax2 = ax1.twinx()
color2 = 'tab:red'
ax2.set_ylabel('Silhouette Score', color=color2)
ax2.plot(ks, silhouette_scores, marker='s', color=color2, label='Silhouette Score')
ax2.tick_params(axis='y', labelcolor=color2)
# Optional: combine legends
lines1, labels1 = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax1.legend(lines1 + lines2, labels1 + labels2, loc='upper right')
ax1.vlines(x=4, ymin=min(costs), ymax=max(costs), color='gray', linestyle='--', linewidth=0.5)
plt.title('Cost and Silhouette Score vs Number of Clusters')
plt.tight_layout()
plt.savefig('clustering_metrics_comparison.png')
plt.show()
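For readers curious about what K-Medoids does with a precomputed matrix, here is a minimal NumPy sketch of the alternating assign/update scheme. It is an illustration under my own assumptions (random initial medoids, a fixed iteration cap) and is not meant to reproduce what `sklearn_extra` does internally; the function name `k_medoids` is hypothetical.

```python
import numpy as np

def k_medoids(D, k, n_iter=100, seed=0):
    """Alternating k-medoids on a precomputed distance matrix (illustrative)."""
    D = np.asarray(D, dtype=float)
    rng = np.random.default_rng(seed)
    medoids = rng.choice(len(D), size=k, replace=False)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest medoid
        labels = D[:, medoids].argmin(axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster empties
            # Update step: pick the member with the smallest total distance to the rest
            within = D[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[within.argmin()]
        if np.array_equal(new_medoids, medoids):
            break  # converged
        medoids = new_medoids
    labels = D[:, medoids].argmin(axis=1)
    # Cost: sum of distances from each point to its assigned medoid
    cost = D[np.arange(len(D)), medoids[labels]].sum()
    return labels, medoids, float(cost)

# Toy data: two groups of three points on a line
x = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
D = np.abs(x[:, None] - x[None, :])
labels, medoids, cost = k_medoids(D, k=2)
print(labels, medoids, cost)
```

Because medoids are actual data points, the algorithm only ever needs pairwise distances, which is what makes it a natural fit for a custom metric like MDTW.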
Interpreting the Clustered Dietary Patterns
Once the optimal number of clusters (k=4) was determined, each individual in the dataset was assigned to one of these clusters using the K-Medoids model. Now we need to understand what characterizes each cluster.
To do so, I followed the approach suggested in the original MDTW paper [1]: analyzing the largest eating occasion for each individual, defined by both the time of day at which it occurred and the fraction of total daily intake it represented. This provides insight into when people eat the most calories and how much they eat during that peak occasion.
# K-Medoids clustering with the optimal number of clusters
from sklearn_extra.cluster import KMedoids
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

k = 4
model = KMedoids(n_clusters=k, metric='precomputed', random_state=42)
labels = model.fit_predict(distance_matrix)

# Find the time and fraction of each person's largest eating occasion
def get_largest_event(record):
    total = sum(v[0] for v in record.values())
    largest_time, largest_value = max(record.items(), key=lambda x: x[1][0])
    fractional_value = largest_value[0] / total if total > 0 else 0
    return largest_time, fractional_value

# Collect the largest-meal data per cluster
data_per_cluster = {i: [] for i in range(k)}
for i, person_id in enumerate(prepared_data.keys()):
    cluster_id = labels[i]
    t, v = get_largest_event(prepared_data[person_id])
    data_per_cluster[cluster_id].append((t, v))

# Convert to pandas DataFrame
rows = []
for cluster_id, values in data_per_cluster.items():
    for hour, fraction in values:
        rows.append({"Hour": hour, "Fraction": fraction, "Cluster": f"Cluster {cluster_id}"})
df = pd.DataFrame(rows)

plt.figure(figsize=(10, 6))
sns.scatterplot(data=df, x="Hour", y="Fraction", hue="Cluster", palette="tab10")
plt.title("Eating Events Across Clusters")
plt.xlabel("Hour of Day")
plt.ylabel("Fraction of Daily Intake (largest meal)")
plt.grid(True)
plt.tight_layout()
plt.show()

While the scatter plot offers a broad overview, a more detailed understanding of each cluster’s eating behavior can be gained by examining their joint distributions.
By plotting the joint histogram of the hour and the fraction of daily intake for the largest meal, we can identify characteristic patterns, using the code below:
# Plot each cluster using seaborn.jointplot
for cluster_label in df['Cluster'].unique():
    cluster_data = df[df['Cluster'] == cluster_label]
    g = sns.jointplot(
        data=cluster_data,
        x="Hour",
        y="Fraction",
        kind="scatter",
        height=6,
        color=sns.color_palette("deep")[int(cluster_label.split()[-1])]
    )
    g.fig.suptitle(cluster_label, fontsize=14)
    g.set_axis_labels("Hour of Day", "Fraction of Daily Intake (largest meal)", fontsize=12)
    g.fig.tight_layout()
    g.fig.subplots_adjust(top=0.95)  # adjust title spacing
    plt.show()

To understand how individuals were distributed across clusters, I visualized the number of individuals assigned to each cluster. The bar plot below shows the frequency of people grouped by their temporal dietary pattern. This helps assess whether certain eating behaviors, such as skipping meals, late-night eating, or frequent snacking, are more prevalent in the population.
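One way to produce such a count and bar plot is sketched here. The `labels` array is a small hypothetical stand-in for the output of `model.fit_predict`, and `plot_cluster_sizes` is a helper name I made up for this example.

```python
import numpy as np

# Hypothetical cluster labels; in the article these come from model.fit_predict
labels = np.array([0, 1, 1, 2, 3, 3, 3, 0, 1, 3])

cluster_ids, counts = np.unique(labels, return_counts=True)
sizes = dict(zip(cluster_ids.tolist(), counts.tolist()))
print(sizes)  # {0: 2, 1: 3, 2: 1, 3: 4}

def plot_cluster_sizes(cluster_ids, counts, save_path=None):
    """Bar plot of the number of individuals per cluster."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting
    plt.figure(figsize=(6, 4))
    plt.bar([f"Cluster {c}" for c in cluster_ids], counts)
    plt.xlabel("Cluster")
    plt.ylabel("Number of individuals")
    plt.title("Individuals per cluster")
    plt.tight_layout()
    if save_path:
        plt.savefig(save_path)
```

Calling `plot_cluster_sizes(cluster_ids, counts)` followed by `plt.show()` draws the chart; with the real labels, strongly unequal bars would hint that one behavior dominates the simulated population.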

Based on the joint distribution plots, distinct temporal dietary behaviors emerge across clusters:
Cluster 0 (Flexible or Irregular Eaters) shows a broad dispersion of the largest eating occasions across both the 24-hour day and the fraction of daily caloric intake.
Cluster 1 (Frequent Light Eaters) displays a more evenly distributed eating pattern, where no single eating occasion exceeds 30% of the total daily intake, reflecting frequent but smaller meals throughout the day. This is the cluster that most likely represents “normal eaters”, those who eat three relatively balanced meals spread throughout the day, because of the low variance in timing and fraction per eating event.
Cluster 2 (Early Heavy Eaters) is defined by a very distinct and consistent pattern: individuals in this group consume almost their entire daily caloric intake (close to 100%) in a single meal, predominantly during the early hours of the day (midnight to noon).
Cluster 3 (Late Night Heavy Eaters) is characterized by individuals who consume nearly all of their daily calories in a single meal during the late evening or night hours (between 6 PM and midnight). Like Cluster 2, this group exhibits a unimodal eating pattern with a very high fractional intake (~1.0), indicating that most members eat once per day, but unlike Cluster 2, their eating window is significantly delayed.
CONCLUSION
In this project, I explored how Modified Dynamic Time Warping (MDTW) can help uncover temporal dietary patterns, focusing not only on what we eat, but when and how much. Using synthetic data to simulate realistic eating behaviors, I demonstrated how MDTW can cluster individuals into distinct profiles like irregular or flexible eaters, frequent light eaters, early heavy eaters, and late night eaters based on the timing and magnitude of their meals.
While the results show that MDTW combined with K-Medoids can reveal meaningful patterns in eating behaviors, this approach isn’t without its challenges. Because the dataset was synthetically generated and clustering was based on a single initialization, there are several caveats worth noting:
- The clusters appear messy, possibly because the synthetic data lacks strong, naturally separable patterns, especially if meal times and calorie distributions are too uniform.
- Some clusters overlap significantly, particularly Cluster 0 and Cluster 1, making it harder to distinguish between truly different behaviors.
- Without labeled data or expected ground truth, evaluating cluster quality is difficult. A possible improvement would be to inject known patterns into the dataset to test whether the clustering algorithm can reliably recover them.
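A possible way to inject known patterns is sketched below: each synthetic person is drawn from one of three archetypes with jittered meal times and calorie shares, and the archetype name is kept as ground truth. The archetype definitions, noise levels, and the function name `generate_labeled_data` are all assumptions made for illustration; the output format mirrors the raw records used earlier.

```python
import numpy as np

# Hypothetical archetypes: (typical meal hours, typical share of daily calories)
ARCHETYPES = {
    "skipper": ([12, 19], [0.35, 0.65]),
    "night_eater": ([9, 14, 22], [0.15, 0.25, 0.60]),
    "snacker": ([8, 11, 14, 17, 21], [0.15, 0.20, 0.25, 0.15, 0.25]),
}

def generate_labeled_data(n_per_type=50, time_jitter=1.0, daily_calories=2000, seed=0):
    """Generate records in the raw-data format plus a ground-truth label per person."""
    rng = np.random.default_rng(seed)
    data, truth = [], []
    for name, (hours, shares) in ARCHETYPES.items():
        for k in range(n_per_type):
            # Shift each meal by Gaussian noise, keeping it inside the day
            times = np.clip(np.array(hours) + rng.normal(0, time_jitter, len(hours)), 0, 23.99)
            # Perturb the calorie shares by up to 20% and rescale to the daily total
            noisy = np.array(shares) * rng.uniform(0.8, 1.2, len(shares))
            calories = daily_calories * noisy / noisy.sum()
            data.append({
                "person_id": f"{name}_{k}",
                "records": [
                    {"time": float(t), "nutrients": [float(c)]}
                    for t, c in zip(times, calories)
                ],
            })
            truth.append(name)
    return data, truth

data, truth = generate_labeled_data(n_per_type=5)
print(len(data), truth[:3])
```

After clustering such data, the predicted labels could be compared with `truth` using a measure such as scikit-learn's `adjusted_rand_score`, which gives a direct read on whether the known patterns were recovered.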
Despite these limitations, this work shows how a nuanced distance metric, designed for irregular, real-life patterns, can surface insights that traditional tools may overlook. The methodology can be extended to personalized health monitoring, or to any domain where when things happen matters just as much as what happens.
I’d love to hear your thoughts on this project, whether it’s feedback, questions, or ideas for where MDTW could be applied next. This is very much a work in progress, and I’m always excited to learn from others.
If you found this useful, have ideas for improvements, or would like to collaborate, feel free to open an issue or send a Pull Request on GitHub. Contributions are more than welcome!
Thanks so much for reading all the way to the end; it really means a lot.
Code on GitHub : https://github.com/YagmurGULEC/mdtw-time-series-clustering
REFERENCES
[1] Khanna, Nitin, et al. “Modified dynamic time warping (MDTW) for estimating temporal dietary patterns.” IEEE, 2017.