Is Your Model Time-Blind? The Case for Cyclical Feature Encoding


The Midnight Paradox

Imagine this. You’re building a model to predict electricity demand or taxi pickups. So, you feed it time as minutes since midnight. Clean and easy. Right?

Now your model sees 23:59 (minute 1439 of the day) and 00:01 (minute 1 of the day). To you, they’re two minutes apart. To your model, they’re 1,438 units apart. That’s the midnight paradox. And yes, your model might be time-blind.

Why does this occur?

Because most machine learning models treat numbers as straight lines, not circles.

Linear regression, KNN, SVMs, and even neural networks treat numbers as magnitudes, assuming higher numbers are “more” than lower ones. They don’t know that time wraps around. Midnight is the edge case they never forgive.
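Here’s a tiny sketch of that distortion in plain Python — no model needed, just the distance a distance-based model would see:

# The midnight paradox in numbers
late_night = 23 * 60 + 59   # 23:59 -> minute 1439
early_morning = 0 * 60 + 1  # 00:01 -> minute 1

# What the model sees on the raw linear feature
linear_distance = abs(late_night - early_morning)

# The true distance on a 1440-minute circle (shorter way around)
diff = abs(late_night - early_morning)
cyclical_distance = min(diff, 1440 - diff)

print(linear_distance)    # 1438
print(cyclical_distance)  # 2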

If you’ve ever added hourly information to your model without success, wondering later why it struggles around day boundaries, this is probably why.

The Failure of Standard Encoding

Let’s talk about the usual approaches. You’ve probably used at least one of them.

You encode hours as numbers from 0 to 23. Now there’s an artificial cliff between hour 23 and hour 0, so the model thinks midnight is the biggest jump of the day. But is midnight really more different from 11 PM than 10 PM is from 9 PM?

Of course not. But your model doesn’t know that.

Here’s what the hours look like in “linear” mode.

# Imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Generate one row per hour of the current day
date_today = pd.to_datetime('today').normalize()
datetime_24_hours = pd.date_range(start=date_today, periods=24, freq='h')
df = pd.DataFrame({'dt': datetime_24_hours})
df['hour'] = df['dt'].dt.hour

# Calculate sine and cosine
df["hour_sin"] = np.sin(2 * np.pi * df["hour"] / 24)
df["hour_cos"] = np.cos(2 * np.pi * df["hour"] / 24)

# Plot the hours in linear mode
plt.figure(figsize=(15, 5))
plt.plot(df['hour'], [1]*24, linewidth=3)
plt.title('Hours in Linear Mode')
plt.xlabel('Hour')
plt.xticks(np.arange(0, 24, 1))
plt.ylabel('Value')
plt.show()
Hours in linear mode. Image by the author.

What if we one-hot encode the hours? Twenty-four binary columns. Problem solved, right? Well… partially. You fixed the artificial gap, but you lost proximity. 2 AM is no longer closer to 3 AM than to 10 PM.
You also exploded dimensionality. For trees, that’s annoying. For linear models, it’s inefficient.
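A quick sketch of that proximity loss: with one-hot columns, every pair of distinct hours sits at exactly the same distance.

import pandas as pd

# One-hot encode the 24 hours
hours = pd.DataFrame({'hour': range(24)})
one_hot = pd.get_dummies(hours['hour'], prefix='hour').astype(int)

# Every pair of distinct hours differs in exactly two columns,
# so the squared Euclidean distance is always 2
d_2am_3am = ((one_hot.iloc[2] - one_hot.iloc[3]) ** 2).sum()
d_2am_10pm = ((one_hot.iloc[2] - one_hot.iloc[22]) ** 2).sum()
print(d_2am_3am, d_2am_10pm)  # 2 2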

So, let’s move on to a better alternative.

The Solution: Trigonometric Mapping

Here’s the mindset shift:

Stop thinking about time as a line. Think of it as a circle.

A 24-hour day loops back to itself. So your encoding should loop too. Each hour is an evenly spaced point on a circle. Now, to represent a point on a circle, you don’t use one number; instead, you use two coordinates: x and y.

That’s where sine and cosine are available in.

The geometry behind it

Every angle on a circle can be mapped to a unique point using sine and cosine. This gives your model a smooth, continuous representation of time.

# Plot the hours on the unit circle
plt.figure(figsize=(5, 5))
plt.scatter(df['hour_sin'], df['hour_cos'], linewidth=3)
plt.title('Hours in Cyclical Mode')
plt.xlabel('hour_sin')
plt.ylabel('hour_cos')
plt.show()
Hours in cyclical mode after sine and cosine. Image by the author.

Here’s the math to calculate the cycle for hours of the day:

hour_sin = sin(2π × hour / 24)
hour_cos = cos(2π × hour / 24)

  • First, 2 * π * hour / 24 converts each hour into an angle. Midnight and 11 PM end up almost at the same position on the circle.
  • Then sine and cosine project that angle into two coordinates.
  • Those two values together uniquely define the hour. Now 23:00 and 00:00 are close in feature space. Exactly what you wanted all along.

The same idea works for minutes, days of the week, or months of the year. Only the period changes, as the helper below shows.
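Since only the period changes between these cases, the transformation is easy to wrap in a small reusable function. This is a minimal sketch; the name add_cyclical_features is my own, not from any library:

import numpy as np
import pandas as pd

def add_cyclical_features(df: pd.DataFrame, col: str, period: float) -> pd.DataFrame:
    """Append <col>_sin and <col>_cos columns for a cyclical feature.

    period is the length of one full cycle: 24 for hour of day,
    7 for day of week, 12 for month, 360 for wind direction in degrees.
    """
    angle = 2 * np.pi * df[col] / period
    df[f'{col}_sin'] = np.sin(angle)
    df[f'{col}_cos'] = np.cos(angle)
    return df

# Example: encode hour of day and day of week
data = pd.DataFrame({'hour': [0, 6, 12, 23], 'weekday': [0, 2, 4, 6]})
data = add_cyclical_features(data, 'hour', 24)
data = add_cyclical_features(data, 'weekday', 7)
print(data.round(3))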

Code

Let’s experiment with this dataset [4]. We’ll try to improve the prediction using a Random Forest Regressor (a tree-based model).

Candanedo, L. (2017). Appliances Energy Prediction [Dataset]. UCI Machine Learning Repository. https://doi.org/10.24432/C5VC8G. Creative Commons 4.0 License.

# Imports
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import root_mean_squared_error
from ucimlrepo import fetch_ucirepo 

Get data.

# Fetch dataset
appliances_energy_prediction = fetch_ucirepo(id=374)

# Data (as pandas DataFrames)
X = appliances_energy_prediction.data.features
y = appliances_energy_prediction.data.targets

# Combine features and target, then extract time components
df = pd.concat([X, y], axis=1)
# Ensure a space separates date and time, then parse to datetime
df['date'] = df['date'].apply(lambda x: x[:10] + ' ' + x[11:])
df['date'] = pd.to_datetime(df['date'])
df['month'] = df['date'].dt.month
df['day'] = df['date'].dt.day
df['hour'] = df['date'].dt.hour
df.head(3)

Let’s create a quick model with the raw time features first, as our baseline for comparison.

# X and y
# Alternative: use every feature with
# X = df.drop(['Appliances', 'rv1', 'rv2', 'date'], axis=1)
X = df[['hour', 'day', 'T1', 'RH_1', 'T_out', 'Press_mm_hg', 'RH_out', 'Windspeed', 'Visibility', 'Tdewpoint']]
y = df['Appliances']

# Train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model
rf = RandomForestRegressor().fit(X_train, y_train)

# Train score (R²)
print(f'Score: {rf.score(X_train, y_train)}')

# Test RMSE
y_pred = rf.predict(X_test)
rmse = root_mean_squared_error(y_test, y_pred)
print(f'RMSE: {rmse}')

Here are the results.

Score: 0.9395797670166536
RMSE: 63.60964667197874

Next, we are going to encode the cyclical time components (day and hour) and retrain the model.

# Add cyclical sine and cosine features for hour and day
df['hour_sin'] = np.sin(2 * np.pi * df['hour'] / 24)
df['hour_cos'] = np.cos(2 * np.pi * df['hour'] / 24)
df['day_sin'] = np.sin(2 * np.pi * df['day'] / 31)
df['day_cos'] = np.cos(2 * np.pi * df['day'] / 31)

# X and y
X = df[['hour_sin', 'hour_cos', 'day_sin', 'day_cos', 'T1', 'RH_1', 'T_out', 'Press_mm_hg', 'RH_out', 'Windspeed', 'Visibility', 'Tdewpoint']]
y = df['Appliances']

# Train test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Fit the model
rf_cycle = RandomForestRegressor().fit(X_train, y_train)

# Train score (R²)
print(f'Score: {rf_cycle.score(X_train, y_train)}')

# Test RMSE
y_pred = rf_cycle.predict(X_test)
rmse = root_mean_squared_error(y_test, y_pred)
print(f'RMSE: {rmse}')

And the results: a modest improvement in the score and almost a full point off the RMSE.

Score: 0.9416365489096074
RMSE: 62.87008070927842

I’m sure this doesn’t look like much, but keep in mind that this toy example uses a simple out-of-the-box model without any data treatment or cleanup. We’re seeing mostly the effect of the sine and cosine transformation.

What’s really happening here is that, in real life, electricity demand doesn’t reset at midnight. And now your model finally sees that continuity.

Why You Need Each Sine and Cosine

Don’t fall into the temptation of using only sine because it feels like enough. One column instead of two. Cleaner, right?

Unfortunately, it breaks symmetry. On a 24-hour clock, sine alone collides: midnight and noon both map to 0, and 3 AM looks identical to 9 AM. Different times with identical encodings are bad because the model can no longer tell them apart. Not ideal unless you enjoy confused predictions.

Using both sine and cosine fixes this. Together, they give each hour a unique fingerprint on the circle. Think of it like latitude and longitude: you need both to know where you are.
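A quick sanity check of the collision, in a few lines of NumPy:

import numpy as np

hours = np.arange(24)
sin_only = np.round(np.sin(2 * np.pi * hours / 24), 6)
cos_vals = np.round(np.cos(2 * np.pi * hours / 24), 6)

# Sine alone collides: midnight vs noon, 3 AM vs 9 AM
print(sin_only[0] == sin_only[12])  # True (both 0.0)
print(sin_only[3] == sin_only[9])   # True (both ~0.707107)

# The (sin, cos) pair is unique for every hour
pairs = set(zip(sin_only, cos_vals))
print(len(pairs))  # 24 distinct fingerprints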

Real-World Impact & Results

So, does this actually help models? Yes. Especially certain ones.

Distance-based models

KNN and SVMs rely heavily on distance calculations. Cyclical encoding prevents fake “long distances” at boundaries. Your neighbors actually become neighbors again.
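A minimal sketch of why, assuming the Euclidean distance KNN uses by default: in (sin, cos) space every one-hour step has the same length, including the step across midnight.

import numpy as np

def encode(hour):
    """Map an hour to its (sin, cos) point on the unit circle."""
    angle = 2 * np.pi * hour / 24
    return np.array([np.sin(angle), np.cos(angle)])

# Raw feature: 23:00 and 00:00 look 23 units apart
print(abs(23 - 0))  # 23

# Encoded: the midnight step is just another one-hour step
d_23_0 = np.linalg.norm(encode(23) - encode(0))
d_11_12 = np.linalg.norm(encode(11) - encode(12))
print(round(d_23_0, 3), round(d_11_12, 3))  # 0.261 0.261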

Neural networks

Neural networks learn faster with smooth feature spaces. Cyclical encoding removes sharp discontinuities at midnight. That typically means faster convergence and better stability.

Tree-based models

Gradient boosted trees like XGBoost or LightGBM can eventually learn these patterns on their own, but cyclical encoding gives them a head start. If you care about performance and interpretability, it’s worth it.

When Should You Use This?

Always ask yourself the question: does this feature repeat in a cycle? If yes, consider cyclical encoding.

Common examples are:

  • Hour of day
  • Day of week
  • Month of year
  • Wind direction (degrees)

If it loops, you might try encoding it like a loop; a quick sketch using the helper from earlier is shown below.
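This sketch reuses the add_cyclical_features helper defined earlier; the readings DataFrame and its column names are invented for illustration.

import pandas as pd

# Hypothetical sample with three cyclical columns
readings = pd.DataFrame({
    'weekday': [0, 3, 6],        # Monday..Sunday as 0..6
    'month': [1, 6, 12],         # calendar month
    'wind_deg': [350, 10, 180],  # wind direction in degrees
})

# One call per cyclical column, each with its own period
readings = add_cyclical_features(readings, 'weekday', 7)
readings = add_cyclical_features(readings, 'month', 12)
readings = add_cyclical_features(readings, 'wind_deg', 360)
print(readings.round(3))  # 350° and 10° now land close together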

Before You Go

Time is not just a number. It’s a coordinate on a circle.

If you treat it like a straight line, your model can stumble at boundaries and struggle to understand that the variable is a cycle, something that repeats and has a pattern.

Cyclical encoding with sine and cosine fixes this elegantly, preserving proximity, reducing artifacts, and helping models learn faster.

So next time your predictions look weird around day changes, try this new tool you’ve learned, and let it make your model shine as it should.

If you liked this content, find more of my work and my contacts at my website.

https://gustavorsantos.me

GitHub Repository

Here’s the entire code of this exercise.

https://github.com/gurezende/Time-Series/tree/main/Sine%20Cosine%20Time%20Encode

References & Further Reading

[1] Encoding cyclical features (minutes and hours), Cross Validated (Stack Exchange): https://stats.stackexchange.com/questions/451295/encoding-cyclical-feature-minutes-and-hours

[2] NumPy trigonometric functions: https://numpy.org/doc/stable/reference/routines.math.html

[3] Practical discussion on cyclical features (Kaggle): https://www.kaggle.com/code/avanwyk/encoding-cyclical-features-for-deep-learning

[4] Candanedo, L. (2017). Appliances Energy Prediction [Dataset]. UCI Machine Learning Repository: https://archive.ics.uci.edu/dataset/374/appliances+energy+prediction
