
Time Series Prediction with Transformers


A Complete Guide to Transformers in PyTorch

At the latest since the advent of ChatGPT, Large Language Models (LLMs) have created enormous hype and are known even to those outside the AI community. Even though one should keep in mind that LLMs inherently are "just" sequence prediction models without any form of intelligence or reasoning, the achieved results are certainly extremely impressive, with some even talking of another step in the "AI Revolution".

Essential to the success of LLMs are their core building blocks, transformers. In this post, we will give a complete guide to using them in PyTorch, with a particular focus on time series prediction. Thanks for stopping by, and I hope you enjoy the ride!

Photo by Tim Meyer on Unsplash

One could argue that all problems solved via transformers essentially are time series problems. While that is true, here we will put special focus on continuous series and data, such as predicting the spread of diseases or forecasting the weather. The difference from the prominent application of Natural Language Processing (NLP) is simply (if this word is allowed in this context; developing a model like ChatGPT and making it work naturally does require a multitude of further optimization steps and tricks) the continuous input space, whereas NLP works with discrete tokens. Apart from this, however, the basic building blocks are identical.
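To make this contrast concrete, here is a minimal sketch (my own illustration, not code from this post; all names and dimensions are assumed) of how the two input spaces typically enter a transformer: discrete NLP tokens go through an embedding lookup, while continuous observations are mapped into the model dimension with a linear projection.

```python
import torch
import torch.nn as nn

d_model = 64  # transformer model dimension (assumed value)

# NLP: discrete token ids are looked up in an embedding table.
vocab_size = 10_000
token_embedding = nn.Embedding(vocab_size, d_model)
tokens = torch.randint(0, vocab_size, (8, 32))  # (batch, seq_len)
nlp_input = token_embedding(tokens)             # (8, 32, 64)

# Continuous series: real-valued observations are projected linearly.
input_projection = nn.Linear(1, d_model)
series = torch.randn(8, 32, 1)                  # (batch, seq_len, features)
ts_input = input_projection(series)             # (8, 32, 64)

print(nlp_input.shape, ts_input.shape)
```

Either way, the result is a `(batch, seq_len, d_model)` tensor, which is why the rest of the transformer stack can stay the same.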

In this post, we will start with a (short) theoretical introduction to transformers, and then move on to applying them in PyTorch. For this, we will discuss a concrete example, namely predicting the sine function. We will show how to generate data for this and pre-process it correctly, and then use transformers to learn how to predict this function. Later, we will discuss how to do inference when future tokens are not available, and conclude the post by extending the example to multi-dimensional data.
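As a preview of that data setup, the following sketch (function names and window lengths are my own assumptions) samples the sine function on an even grid and slices it into (source, target) pairs of the kind a sequence-to-sequence transformer trains on: the model sees a window of past values and must predict the values that follow.

```python
import numpy as np

def generate_sine_data(num_points=1000, step=0.1):
    """Sample the sine function on an evenly spaced grid."""
    x = np.arange(num_points) * step
    return np.sin(x)

def make_windows(series, src_len=16, tgt_len=4):
    """Slice a 1-D series into (source, target) training pairs:
    `src_len` past values as input, the next `tgt_len` as target."""
    sources, targets = [], []
    for i in range(len(series) - src_len - tgt_len + 1):
        sources.append(series[i : i + src_len])
        targets.append(series[i + src_len : i + src_len + tgt_len])
    return np.array(sources), np.array(targets)

series = generate_sine_data()
src, tgt = make_windows(series)
print(src.shape, tgt.shape)  # (981, 16) (981, 4)
```

These arrays can then be wrapped in a PyTorch `Dataset` and fed to the model batch by batch.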

The goal of this post is to provide a complete hands-on tutorial on how to use transformers for real-world use cases, not to theoretically introduce and explain these interesting models. For that, I would instead like to refer to this amazing article and the original paper [1] (whose architecture we will follow throughout this…

