Empirical Mode Decomposition: The Most Intuitive Strategy to Decompose Complex Signals and Time Series

Have you ever struggled to analyze your time series as a data scientist?
Have you ever wondered whether signal processing could make your life easier?

If yes — stick with me. This article is for you. 🙂

Working with real-world time series can be… painful. Financial curves, ECG traces, neural signals: they often look like chaotic spikes with no structure at all.

In data science, we tend to rely on classical statistical preprocessing: seasonal decomposition, detrending, smoothing, moving averages… These techniques are useful, but they come with strong assumptions that are rarely valid in practice. And when those assumptions fail, your machine learning model might underperform or fail to generalize.
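
As a reminder of what that classical route looks like, here is a minimal sketch using statsmodels' seasonal_decompose on a toy daily series (the period, the additive model, and the data itself are illustrative assumptions):

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose

# Toy daily series: linear trend + weekly cycle + noise (synthetic, for illustration only)
idx = pd.date_range("2023-01-01", periods=365, freq="D")
y = pd.Series(
    0.05 * np.arange(365)
    + 2.0 * np.sin(2 * np.pi * np.arange(365) / 7)
    + 0.5 * np.random.randn(365),
    index=idx,
)

# Classical additive decomposition: the period and the additive structure are fixed upfront
result = seasonal_decompose(y, model="additive", period=7)
result.plot()
plt.show()

Notice how much is decided in advance: the period, the additive model, a single repeating seasonal pattern. EMD, as we will see, makes none of these assumptions.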

Today, we’ll explore a family of methods that are rarely taught in data-science training, yet can completely transform how you work with time data.


On Today’s Menu 🍔

🍰 Why traditional methods struggle with real-world time series
🍛 How signal-processing tools might help
🍔 How Empirical Mode Decomposition (EMD) works and where it fails


The “classic” preprocessing techniques I mentioned above are good starting points, but as I said, they rely on fixed, predefined assumptions about how a signal should behave.

Most of them assume that the signal is stationary, meaning its statistical properties (mean, variance, spectral content) stay constant over time.

But in reality, most real signals are:

  • non-stationary (their frequency content evolves; see the short check below)
  • non-linear (they cannot be explained by simple additive components)
  • noisy
  • mixed with multiple oscillations at once
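
To make “non-stationary” concrete, here is the short check mentioned above (a sketch, using scipy's chirp as a stand-in for a real signal): estimate the dominant frequency in successive one-second windows. For a stationary signal it would stay roughly constant; here it drifts.

import numpy as np
from scipy.signal import chirp

# A chirp whose instantaneous frequency sweeps from 2 Hz to 20 Hz over 10 seconds
fs = 200
t = np.arange(0, 10, 1 / fs)
x = chirp(t, f0=2, t1=10, f1=20, method="linear")

# Dominant frequency per 1-second window: it drifts, so the signal is non-stationary
win = fs
for start in range(0, len(x) - win + 1, 2 * win):
    segment = x[start:start + win]
    freqs = np.fft.rfftfreq(win, d=1 / fs)
    dominant = freqs[np.argmax(np.abs(np.fft.rfft(segment)))]
    print(f"t = {start / fs:4.1f}s -> dominant frequency ≈ {dominant:.1f} Hz")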

So… what exactly is a “signal”?

A signal is simply any quantity that varies over time (what we usually call a time series in data science).

Some examples:

  • ❤️ ECG or EEG — biomedical/brain signals
  • 🌋 Seismic activity — geophysics
  • 🖥️ CPU usage — system monitoring
  • 💹 Stock prices, volatility, order flow — finance
  • 🌦️ Temperature or humidity — climate science
  • 🎧 Audio waveforms — speech & sound analysis
Figure 1: Example of Magnetoencephalography (MEG) signal data. (Image by author)

Signals are everywhere. And almost all of them violate the assumptions of classical time-series models.

They’re rarely “clean.” What I mean is that a single signal is usually a combination of several processes happening at the same time.

Inside one signal, you can often find:

  • slow trends
  • periodic oscillations
  • short bursts
  • random noise
  • hidden rhythms you can’t see directly

👉 Now imagine you could separate all of these components — without assuming stationarity, without specifying frequency bands, and without forcing the signal into a predefined basis.

That’s the promise of data-driven signal decomposition.

This article is Part 1 of a 3-article series on adaptive decomposition:

  1. EMD — Empirical Mode Decomposition
  2. VMD — Variational Mode Decomposition
  3. MVMD — Multivariate VMD

Each method is more powerful and more stable than the previous one — and by the end of the series, you’ll understand how signal-processing methods can extract clean, interpretable components.

Empirical Mode Decomposition

Empirical Mode Decomposition was introduced by Huang et al. (1998) as a part of the Hilbert–Huang Transform.
Its goal is simple but powerful: take a signal and split it into a set of clean oscillatory components, called Intrinsic Mode Functions (IMFs).

Each IMF corresponds to one oscillation present in your signal, from the fastest rhythms down to the slowest trends.

Take a look at Figure 2 below:
At the top, you see the original signal.
Below it, you see several IMFs — each one capturing a different “layer” of oscillation hidden within the data.

IMF₁ contains the fastest variations
IMF₂ captures a slightly slower rhythm

The last IMF + residual represent the slow trend or baseline

Some IMFs might be useful in your machine learning task; others may correspond to noise, artifacts, or irrelevant oscillations.

Figure 2: Original signal (top) and 5 IMFs (bottom), ordered from high-frequency to low-frequency components. (Image by author)

What’s the Math behind EMD?

Any signal x(t) is decomposed by EMD as:

x(t) = Σᵢ₌₁ᴺ Cᵢ(t) + r(t)

Where:

  • Cᵢ(t) are the Intrinsic Mode Functions (IMFs)
  • IMF₁ captures the fastest oscillations
  • IMF₂ captures a slower oscillation, and so on…
  • r(t) is the residual — the slow trend or baseline
  • Adding all IMFs + the residual reconstructs the original signal exactly.

An IMF is an oscillatory component obtained directly from the data.
It must satisfy two simple properties:

  1. The number of zero crossings ≈ the number of extrema
    → The oscillation is well-behaved.
  2. The mean of the upper and lower envelopes is roughly zero
    → The oscillation is locally symmetric, with no long-term information.

These two rules make IMFs fundamentally data-driven and adaptive, unlike Fourier or wavelets, which force the signal into predetermined shapes.
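
To make these two conditions tangible, here is a rough sketch of how you could test them on a candidate component, using scipy for extrema detection and cubic-spline envelopes (the tolerance and the helper name looks_like_imf are illustrative assumptions, not part of any standard API):

import numpy as np
from scipy.signal import argrelextrema
from scipy.interpolate import CubicSpline

def looks_like_imf(c, tol=0.1):
    """Rough check of the two IMF conditions on a candidate component c."""
    maxima = argrelextrema(c, np.greater)[0]
    minima = argrelextrema(c, np.less)[0]
    if len(maxima) < 2 or len(minima) < 2:
        return False  # not oscillatory enough to form envelopes

    # Condition 1: number of zero crossings ≈ number of extrema (differ by at most one)
    zero_crossings = np.sum(np.diff(np.sign(c)) != 0)
    cond1 = abs(zero_crossings - (len(maxima) + len(minima))) <= 1

    # Condition 2: the mean of the upper and lower envelopes stays close to zero
    idx = np.arange(len(c))
    upper = CubicSpline(maxima, c[maxima])(idx)
    lower = CubicSpline(minima, c[minima])(idx)
    env_mean = (upper + lower) / 2
    cond2 = np.max(np.abs(env_mean)) < tol * np.max(np.abs(c))

    return cond1 and cond2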

The intuition behind the EMD Algorithm

The EMD algorithm is surprisingly intuitive. Here’s the extraction loop:

  1. Start with your signal
  2. Find all local maxima and minima
  3. Interpolate them to form an upper and a lower envelope
    (see Figure 3)
  4. Compute the mean of both envelopes
  5. Subtract this mean from the signal

This gives you a “candidate IMF.”

6. Then check the two IMF conditions:

  • Does it have the same number of zero crossings and extrema?
  • Is the mean of its envelopes roughly zero?

If yes → You have extracted IMF₁.
If no → You repeat the process (called sifting) until it meets the criteria.

7. Once you obtain IMF₁ (the fastest oscillation):

  • You subtract it from the original signal,
  • The remainder becomes the new signal,
  • And you repeat the process to extract IMF₂, IMF₃, …

This continues until there is no meaningful oscillation left.
What remains is the residual trend r(t).

Figure 3: One iteration of the EMD. Top: Original signal (blue). Middle: Upper and lower envelopes (red). Bottom: Local mean (black). (Image by author)
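
To make the extraction loop concrete, here is a minimal sketch of a single sifting iteration (steps 2 to 5 above), using cubic splines for the envelopes; a production implementation such as PyEMD also handles boundary effects and stopping tolerances, which are omitted here:

import numpy as np
from scipy.signal import argrelextrema
from scipy.interpolate import CubicSpline

def sift_once(x):
    """One sifting iteration: return the candidate IMF and the local mean."""
    idx = np.arange(len(x))

    # Step 2: locate local maxima and minima
    maxima = argrelextrema(x, np.greater)[0]
    minima = argrelextrema(x, np.less)[0]

    # Step 3: interpolate them into an upper and a lower envelope
    upper = CubicSpline(maxima, x[maxima])(idx)
    lower = CubicSpline(minima, x[minima])(idx)

    # Step 4: compute the mean of both envelopes (the "local mean")
    local_mean = (upper + lower) / 2

    # Step 5: subtract this mean from the signal to obtain a candidate IMF
    candidate = x - local_mean
    return candidate, local_mean

In practice you would call sift_once repeatedly until the candidate satisfies the two IMF conditions, subtract the accepted IMF from the signal, and restart on the remainder.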

EMD in Practice

To actually understand how EMD works, let’s create our own synthetic signal.

We’ll mix three components:

  • A low-frequency oscillation (around 5 Hz)
  • A high-frequency oscillation (around 30 Hz)
  • A little bit of random white noise

Once everything is summed into one single messy signal, we’ll apply the EMD method.

import numpy as np
import matplotlib.pyplot as plt

# --- Parameters ---
Fs = 500         # Sampling frequency (Hz)
t_end = 2        # Duration in seconds
N = Fs * t_end   # Total number of samples
t = np.linspace(0, t_end, N, endpoint=False)

# --- Components ---
# 1. Low-frequency component (Alpha-band equivalent)
f1 = 5
s1 = 2 * np.sin(2 * np.pi * f1 * t)

# 2. High-frequency component (Gamma-band equivalent)
f2 = 30
s2 = 1.5 * np.sin(2 * np.pi * f2 * t)

# 3. White noise
noise = 0.5 * np.random.randn(N)

# --- Composite Signal ---
signal = s1 + s2 + noise

# Plot the synthetic signal
plt.figure(figsize=(12, 4))
plt.plot(t, signal)
plt.title(f'Synthetic Signal (Components at {f1} Hz and {f2} Hz)')
plt.xlabel('Time (s)')
plt.ylabel('Amplitude')
plt.grid(True)
plt.tight_layout()
plt.show()
Figure 4: A Synthetic Signal Containing Multiple Frequencies. (Image by author)

A crucial detail:

EMD automatically chooses the number of IMFs.
It keeps decomposing the signal until a stopping criterion is reached — typically when:

  • no more oscillatory structure can be extracted
  • or the residual becomes a monotonic trend
  • or the sifting process stabilizes

(You can also set a maximum number of IMFs if needed, but the algorithm naturally stops on its own.)

from PyEMD import EMD


# Initialize EMD and decompose the signal into IMFs (one row per IMF)
emd = EMD()
IMFs = emd.emd(signal, max_imf=10)

# Plot Original Signal and IMFs

fig, axes = plt.subplots(IMFs.shape[0] + 1, 1, figsize=(10, 2 * IMFs.shape[0]))
fig.suptitle('EMD Decomposition Results', fontsize=14)

axes[0].plot(t, signal)
axes[0].set_title('Original Signal')
axes[0].set_xlim(t[0], t[-1])
axes[0].grid(True)

for n, imf in enumerate(IMFs):
    axes[n + 1].plot(t, imf, 'g')
    axes[n + 1].set_title(f"IMF {n+1}")
    axes[n + 1].set_xlim(t[0], t[-1])
    axes[n + 1].grid(True)

plt.tight_layout(rect=[0, 0.03, 1, 0.95])
plt.show()
Figure 5: EMD Decomposition of the Synthetic Signal. (Image by author)
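
As a quick sanity check of the reconstruction property from the formula above, you can sum the rows returned by PyEMD and compare them with the original signal (a sketch; if your PyEMD version keeps the residue separate, emd.get_imfs_and_residue() gives you both parts):

# The IMFs (plus the residual) should add back up to the original signal
reconstructed = IMFs.sum(axis=0)
print("Max reconstruction error:", np.max(np.abs(signal - reconstructed)))

# Alternative, if the residue is not included in IMFs:
# imfs, residue = emd.get_imfs_and_residue()
# reconstructed = imfs.sum(axis=0) + residue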

EMD Limitations

EMD is powerful, but it has several weaknesses:

  • Mode mixing: different frequencies can end up in the same IMF.
  • Oversplitting: EMD decides the number of IMFs on its own and may extract too many.
  • Noise sensitivity: small noise changes can completely alter the IMFs (see the short experiment after this list).
  • No solid mathematical foundation: results are not guaranteed to be stable or unique.
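
Here is the short experiment on noise sensitivity mentioned above (a sketch reusing s1, s2, and N from the synthetic example): decompose the same underlying signal with two different noise realizations and compare how many IMFs you get.

# Same clean signal, two different noise realizations
rng_a = np.random.default_rng(0)
rng_b = np.random.default_rng(1)

signal_a = s1 + s2 + 0.5 * rng_a.standard_normal(N)
signal_b = s1 + s2 + 0.5 * rng_b.standard_normal(N)

imfs_a = EMD().emd(signal_a)
imfs_b = EMD().emd(signal_b)

# The number (and content) of the IMFs can differ even though the underlying signal is identical
print("IMFs for noise realization A:", imfs_a.shape[0])
print("IMFs for noise realization B:", imfs_b.shape[0])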

Due to these limitations, several improved versions exist (EEMD, CEEMDAN), but they remain empirical.
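
If you already use PyEMD, its ensemble variant is exposed under a similar interface. As a minimal sketch (the trials and noise_width values are illustrative), EEMD averages decompositions over many noise-perturbed copies of the signal, which mitigates mode mixing at the cost of extra computation:

from PyEMD import EEMD

# Ensemble EMD: add small noise to the signal many times, decompose each copy, average the IMFs
eemd = EEMD(trials=100, noise_width=0.05)
eIMFs = eemd.eemd(signal, max_imf=10)

print("Number of ensemble IMFs:", eIMFs.shape[0])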

This is precisely why methods like VMD were created — and that’s what we’ll explore in the next article of this series.
