Advanced Time-Series Anomaly Detection with Deep Learning in PowerBI


Time-series data showing an anomalous spike.
Image by the author.

Introduction

With an ever-increasing number of applications and services deployed worldwide, time-series anomaly detection has become a ubiquitous and indispensable tool for capturing metric regressions.

Nevertheless, setting up an anomaly detection system is not straightforward and often requires a good amount of domain expertise. In PowerBI, the set-up can be done in just a few clicks, enabling the implementation of a state-of-the-art anomaly detection system very quickly.

This article describes the innovative algorithm behind PowerBI’s anomaly detection functionality and provides a step-by-step guide on how it can be implemented and configured.

The algorithm: SR-CNN

Under the hood of PowerBI’s anomaly detection functionality lies a mechanism that combines the spectral residual (SR) algorithm with a convolutional neural network (CNN), hence its name: SR-CNN. Lots of big words here, I know. So let’s unpack them.

Inspired by computer vision, the authors of this methodology borrowed the SR algorithm from the visual saliency detection domain. The motivation behind this was the belief that visual saliency detection and time-series anomaly detection are quite similar, as anomalies are generally salient from a visual perspective.

But what exactly is visual saliency? It can be described as the degree to which certain features in an image, such as contrast or edges, stand out and attract the attention of the human visual system. Below is an image that illustrates this idea.

On the left, an image of cars showing brake lights. On the right, the corresponding saliency map.
Left: Photo by Musa Haef on Unsplash. Right: Spectral residual, computed by the author.

As expected, the most salient areas of the image on the left are the cars’ brake lights: they immediately stand out to us and are quickly registered by our visual system. This is also reflected in the SR saliency map on the right.

Briefly, the SR algorithm works as follows: (1) the Fourier Transform is used to obtain the log amplitude spectrum, (2) the SR is then calculated by subtracting the averaged log amplitude spectrum from the log amplitude spectrum, and (3) the Inverse Fourier Transform is applied to transform the sequence back to the spatial domain. The mathematical details are beyond the scope of this article, but can be found in the original paper.

What we have just applied in the visual domain can also be applied in the time-series domain. Using sample data made available by Microsoft, we can implement the SR algorithm and transform the original time-series data into its corresponding saliency map.
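
For a one-dimensional series, the three steps described above can be sketched in a few lines of NumPy. This is a minimal illustration, not PowerBI’s implementation: the function name `spectral_residual`, the averaging window `q`, the small `eps` added before taking the logarithm, and the toy sine-wave input are all assumptions made for this example.

```python
import numpy as np


def spectral_residual(series, q=3, eps=1e-8):
    """Return the spectral-residual saliency map of a 1-D series."""
    x = np.asarray(series, dtype=float)

    # (1) Fourier Transform -> phase and log amplitude spectrum.
    spectrum = np.fft.fft(x)
    phase = np.angle(spectrum)
    log_amplitude = np.log(np.abs(spectrum) + eps)  # eps avoids log(0)

    # (2) Spectral residual = log amplitude minus its moving average.
    kernel = np.ones(q) / q
    averaged = np.convolve(log_amplitude, kernel, mode="same")
    residual = log_amplitude - averaged

    # (3) Inverse Fourier Transform back to the time domain, keeping the phase.
    return np.abs(np.fft.ifft(np.exp(residual + 1j * phase)))


# Toy input (assumed): a smooth sine wave with one injected spike at index 60.
t = np.arange(128)
signal = np.sin(2 * np.pi * 4 * t / 128)
signal[60] += 5.0

saliency = spectral_residual(signal)
print("most salient index:", int(np.argmax(saliency)))
```

On this toy input, the saliency map should peak at the injected spike while the regular sine pattern is largely suppressed. Note that `np.convolve(..., mode="same")` pads the spectrum with zeros at both ends, which is a simplification of the averaging step.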

Top: original time-series data with anomaly. Bottom: spectral residual showing more pronounced anomaly.
Image by the author. License information for data usage: MIT License.

The bottom graph, showing the SR-based saliency map, highlights the anomalous spike more clearly and makes it easier for us and, more importantly, for the anomaly detection algorithm to capture it.

Now on to the deep learning part of SR-CNN. A CNN is applied directly to the output of the SR model. More specifically, the authors trained a discriminative model on synthetic data, which was generated by introducing anomalous data points to a collection of saliency maps. The use of saliency maps here alleviates the problem of the lack of labeled data, which would otherwise be required if the CNN were to be trained on raw inputs.

The CNN consists of two 1-D convolutional layers and two fully connected layers, which are stacked before the sigmoid output. The authors used cross entropy loss and the SGD optimizer during the training process.

Evaluated on F1-score, precision, and recall, experimental results show that this approach mostly outperforms other unsupervised, state-of-the-art baselines such as FFT, Twitter-AD, Luminol, DONUT, SPOT, and DSPOT.
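
To make the layer stack more concrete, here is a rough PyTorch sketch of such a network. Only the overall shape (two 1-D convolutions, two fully connected layers, a sigmoid output, cross entropy loss, and SGD) comes from the description above. The class name `SRCNNSketch`, the window length, channel count, kernel size, hidden width, ReLU activations, per-point output, and learning rate are all assumptions chosen for illustration.

```python
import torch
from torch import nn


class SRCNNSketch(nn.Module):
    """Two 1-D conv layers, two fully connected layers, sigmoid output."""

    def __init__(self, window: int = 64, channels: int = 16):
        super().__init__()
        # Assumed hyperparameters: kernel size 3 with padding keeps the length.
        self.conv = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(channels * window, 2 * window),
            nn.ReLU(),
            nn.Linear(2 * window, window),
        )

    def forward(self, saliency: torch.Tensor) -> torch.Tensor:
        # saliency: (batch, window); Conv1d expects (batch, channels, window).
        hidden = self.conv(saliency.unsqueeze(1))
        # One anomaly probability per point in the window (an assumption).
        return torch.sigmoid(self.fc(hidden))


def train_step(model, optimizer, saliency, labels):
    """Run one SGD update with binary cross entropy and return the loss."""
    optimizer.zero_grad()
    loss = nn.functional.binary_cross_entropy(model(saliency), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


# Random stand-in data: 8 saliency windows with a labeled anomaly at index 32.
model = SRCNNSketch(window=64)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
batch = torch.rand(8, 64)
labels = torch.zeros(8, 64)
labels[:, 32] = 1.0
print("loss after one step:", train_step(model, optimizer, batch, labels))
```

The binary cross entropy used here is the two-class form of cross entropy that pairs naturally with a sigmoid output. The random tensors only demonstrate the shapes involved and do not reproduce the authors' synthetic training data.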
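
For reference, the three metrics can be computed from binary labels as in the sketch below. It uses the standard point-wise definitions. The helper name `precision_recall_f1` and the example labels are made up for illustration, and the exact evaluation protocol used in the paper may differ.

```python
def precision_recall_f1(y_true, y_pred):
    """Point-wise precision, recall, and F1 for binary anomaly labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)      # anomalies correctly flagged
    fp = sum(1 for t, p in pairs if not t and p)  # normal points flagged
    fn = sum(1 for t, p in pairs if t and not p)  # anomalies missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up labels: 1 marks an anomalous point, 0 a normal one.
truth = [0, 1, 1, 0, 1]
predicted = [0, 1, 0, 1, 1]
print(precision_recall_f1(truth, predicted))
```

In this example two of the three true anomalies are detected and one normal point is flagged by mistake, so precision, recall, and F1 all come out at two thirds.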

Implementation in PowerBI

Here comes the easy part. While the algorithm behind this anomaly detection method is fairly sophisticated, its implementation really isn’t. Let’s take a look at the individual steps you need to take in order to apply this method in PowerBI, illustrated with examples:

Load your time-series data into PowerBI.

Create a Line Chart containing your time-series and make sure the X-axis type is set to continuous.

Raw time-series data of revenue over time.
Image by the author.

In the Visualizations pane, navigate to Add further analyses to your visual and activate Find anomalies.

Under Options, fine-tune the main parameter: Sensitivity. The higher this parameter, the narrower the range of minimum and maximum expected values and thus the higher the number of data points that fall outside of this range and, as a consequence, are flagged as anomalies.

Example with sensitivity set to 80%:

Raw time-series data of revenue over time with detected anomaly.
Image by the author.

Example with sensitivity set to 98%:

Raw time-series data of revenue over time with detected anomalies. The number of anomalies detected increases due to the higher sensitivity.
Image by the author.
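
The effect of the sensitivity setting in the two examples above can be mimicked with a toy model. The mapping from Sensitivity to the expected range is not described here, so the sketch below uses an invented one: a band around a rolling median whose width shrinks linearly as the sensitivity grows. The function `expected_range`, the window length, the multiplier formula, and the synthetic data are all hypothetical and should not be read as PowerBI's actual computation.

```python
import numpy as np


def expected_range(values, sensitivity, window=7):
    """Toy expected-value band; NOT PowerBI's actual computation."""
    x = np.asarray(values, dtype=float)
    half = window // 2  # window is assumed to be odd

    # Expected value: centered rolling median (edges padded with end values).
    padded = np.pad(x, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    expected = np.median(windows, axis=1)

    # Robust spread estimate: median absolute deviation from the expectation.
    spread = np.median(np.abs(x - expected))

    # Invented mapping: higher sensitivity (in percent) -> narrower band.
    k = 10.0 * (1.0 - sensitivity / 100.0)
    lower, upper = expected - k * spread, expected + k * spread
    return lower, upper, (x < lower) | (x > upper)


# Synthetic series (assumed): a gentle oscillation with one spike at index 50.
series = 100 + np.sin(np.arange(100))
series[50] += 30

for sensitivity in (80, 98):
    _, _, flags = expected_range(series, sensitivity)
    print(sensitivity, "->", int(flags.sum()), "points flagged")
```

Because both bands share the same center and only differ in width, every point flagged at the lower sensitivity is also flagged at the higher one, which matches the behavior described for the Sensitivity parameter.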

PowerBI also gives you the option to customize the colour, marker, and size of the anomalous points as well as the colour, style, and transparency of the expected range.

Beyond anomaly detection itself, PowerBI also provides possible explanations of why these regressions may have occurred. These are accompanied by a strength measure, which shows the degree to which a value is correlated with the anomaly. This information can be retrieved by simply clicking on the anomaly in the line chart.

Possible explanations of anomaly, such as purchase size, reserved room type code, region, and age group.
Screenshot by the author.

If we click on the first and strongest explanation, ‘Purchase Size’ is $30 - $40, we can see a strong correlation with revenue on the date when the anomaly occurred, which may have contributed to this sudden spike.

Correlation between revenue and purchase size for $30 — $40.
Screenshot by the author.

Conclusion

This article demonstrates how a fairly sophisticated time-series anomaly detection algorithm, inspired by computer vision, can be implemented and customized in PowerBI quickly and easily, in just a few clicks. Through various layers of abstraction, this method requires fine-tuning only a single, intuitive parameter: sensitivity. Lastly, the user can extract explanatory information on the detected anomalies by simply clicking on the anomalous data points in the line chart, which provides guidance on the potential root cause of those unexpected spikes.
