PyTorch

Learning Triton One Kernel At a Time: Vector Addition

... a little optimisation goes a long way. Models like GPT-4 cost more than $100 million to train, which makes even a 1% efficiency gain valuable. A powerful way to optimise the efficiency of...

The Channel-Wise Attention | Squeeze and Excitation

When we talk about attention in computer vision, the one that probably comes to mind first is the one used in the Vision Transformer (ViT) architecture. In fact, that’s not the only attention mechanism we have...

The Crucial Role of NUMA Awareness in High-Performance Deep Learning

... world of deep learning training, the role of the ML developer can be likened to that of the conductor of an orchestra. Just as a conductor must time the entry of every instrument...

How to Fine-Tune Small Language Models to Think with Reinforcement Learning

... in fashion. DeepSeek-R1, Gemini-2.5-Pro, OpenAI’s O-series models, Anthropic’s Claude, Magistral, and Qwen3: there's a new one every month. When you ask these models a question, they go into a ...

Pipelining AI/ML Training Workloads with CUDA Streams

... ninth in our series on performance profiling and optimization in PyTorch, aimed at emphasizing the critical role of performance evaluation and optimization in machine learning development. Throughout the series we've reviewed a wide variety of practical...

A Caching Strategy for Identifying Bottlenecks on the Data Input Pipeline

... in the data input pipeline of a machine learning model running on a GPU can be particularly frustrating. In most workloads, the host (CPU) and the device (GPU) work in tandem: the CPU...

Grad-CAM from Scratch with PyTorch Hooks

... car stops suddenly. Worryingly, there is no stop sign in sight. The engineers can only guess why the car’s neural network became confused. It could be a tumbleweed rolling across...

The Art of Noise

In my last several articles I talked about generative deep learning algorithms, mostly related to text generation tasks. So, I think it would be interesting to switch to generative algorithms for image...
