PyTorch

Optimizing Data Transfer in AI/ML Workloads

..., a deep learning model is executed on a dedicated GPU accelerator using input data batches it receives from a CPU host. Ideally, the GPU, the more expensive resource, needs to...

Optimizing PyTorch Model Inference on AWS Graviton

... AI/ML models can be an especially expensive endeavor. Many of our posts have focused on a wide range of tips, tricks, and techniques for analyzing and optimizing the runtime performance of AI/ML workloads....

Optimizing PyTorch Model Inference on CPU

... grows, so does the criticality of optimizing their runtime performance. While the degree to which AI models will outperform human intelligence remains a heated topic of debate, their need for powerful and expensive...

On the Challenge of Converting TensorFlow Models to PyTorch

In the interest of managing reader expectations and preventing disappointment, we would like to begin by stating that this post does not provide a fully satisfactory solution to the problem described in the title. We are...

Overcoming the Hidden Performance Traps of Variable-Shaped Tensors: Efficient Data Sampling in PyTorch

... is part of a series of posts on the subject of analyzing and optimizing PyTorch models. Throughout the series, we have advocated for using the PyTorch Profiler in AI model development and demonstrated the...

PyTorch Tutorial for Beginners: Construct a Multiple Regression Model from Scratch

Before LLMs became hyped, there was a line separating Machine Learning frameworks from Deep Learning frameworks. The discussion centered on Scikit-Learn, XGBoost, and similar for ML, while PyTorch and TensorFlow dominated the scene...

MobileNetV3 Paper Walkthrough: The Tiny Giant Getting Even Smarter

Welcome back to the Tiny Giant series, a series where I share what I learned about MobileNet architectures. In the past two articles I covered MobileNetV1 and MobileNetV2. Take a look at references ...

Learning Triton One Kernel At a Time: Vector Addition

..., a little optimisation goes a long way. Models like GPT4 cost more than $100 million to train, which makes a 1% efficiency gain worthwhile. A powerful way to optimise the efficiency of...
