This post is part of a series on distributed AI across multiple GPUs:
Introduction
In the previous post, we saw how Distributed Data Parallelism (DDP) speeds up training by splitting batches across GPUs. DDP solves the throughput problem, but it...
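To make that teaser concrete, here is a minimal DDP sketch in PyTorch; the toy model, batch size, and `torchrun` launch line are illustrative assumptions, not code from the series:

```python
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    # One process per GPU; launch with: torchrun --nproc_per_node=4 ddp_demo.py
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank)

    # Toy model (assumed for illustration); every rank holds a full replica.
    model = DDP(torch.nn.Linear(1024, 10).cuda(rank), device_ids=[rank])
    opt = torch.optim.SGD(model.parameters(), lr=1e-2)

    # Each rank processes its own slice of the global batch.
    x = torch.randn(32, 1024, device=f"cuda:{rank}")
    y = torch.randint(0, 10, (32,), device=f"cuda:{rank}")

    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()  # DDP all-reduces gradients across GPUs here
    opt.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Each process owns one GPU and one slice of the global batch; the gradient all-reduce during `backward()` is what keeps the replicas in sync.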
This post is part of a series on distributed AI across multiple GPUs:
Introduction
Distributed Data Parallelism (DDP) is the first parallelization method we'll look at. It's the baseline approach that's always used in...
This post is part of a series on distributed AI across multiple GPUs:
Introduction
Before diving into advanced parallelism techniques, we need to understand the key technologies that enable GPUs to communicate with one another.
But why...
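As a taste of what those technologies expose to software, here is a small sketch in PyTorch, assuming a machine with at least two CUDA GPUs; whether the copy travels over NVLink or PCIe depends on the hardware topology:

```python
import torch

# Peer-to-peer (P2P) access lets one GPU read another GPU's memory
# directly over NVLink or PCIe, without staging through host RAM.
if torch.cuda.device_count() >= 2:
    print("GPU 0 -> GPU 1 peer access:",
          torch.cuda.can_device_access_peer(0, 1))

    # Device-to-device copy; takes the direct P2P path when available.
    src = torch.randn(1024, 1024, device="cuda:0")
    dst = src.to("cuda:1")
    print("copied to:", dst.device)
else:
    print("Need at least two GPUs to demonstrate peer access.")
```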
This post is part of a series on distributed AI across multiple GPUs:
Part 1: Understanding the Host and Device Paradigm
Part 2: Point-to-Point and Collective Operations (this article; see the sketch after this list)
Part 3: How GPUs Communicate
Part 4: Gradient Accumulation...
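As a preview of Part 2's topic, here is a minimal sketch of both kinds of operation using `torch.distributed`; the `gloo` CPU backend and the two-process launch are assumptions chosen so the demo runs without GPUs:

```python
import torch
import torch.distributed as dist

def main():
    # Launch with: torchrun --nproc_per_node=2 ops_demo.py
    dist.init_process_group(backend="gloo")  # CPU backend keeps the demo portable
    rank = dist.get_rank()

    # Point-to-point: rank 0 sends a tensor, rank 1 receives it.
    t = torch.tensor([float(rank)])
    if rank == 0:
        dist.send(t, dst=1)
    elif rank == 1:
        dist.recv(t, src=0)

    # Collective: every rank contributes to, and receives, the global sum.
    x = torch.tensor([rank + 1.0])
    dist.all_reduce(x, op=dist.ReduceOp.SUM)
    print(f"rank {rank}: recv={t.item()}, all_reduce sum={x.item()}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

With two ranks, the all-reduce returns 1.0 + 2.0 = 3.0 on every rank, while the send/recv pair moves data between exactly two peers.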
This post is part of a series on distributed AI across multiple GPUs:
Part 1: Understanding the Host and Device Paradigm (this article)
Part 2: Point-to-Point and Collective Operations
Part 3: How GPUs Communicate
Part 4: Gradient...
As deep learning models grow larger and datasets expand, practitioners face an increasingly common bottleneck: GPU memory bandwidth. While cutting-edge hardware offers FP8 precision to speed up training and inference, most data scientists and...
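The bandwidth argument is easy to check with back-of-the-envelope numbers; the 7B parameter count and the ~3.35 TB/s figure (roughly H100 SXM HBM3) below are illustrative assumptions:

```python
# Time to stream a model's weights from GPU memory once, per precision.
# Halving bytes per parameter halves this floor, which is why FP8 helps
# bandwidth-bound inference even before any compute speedup.
params = 7e9          # assumed 7B-parameter model
hbm_bw = 3.35e12      # assumed ~3.35 TB/s memory bandwidth (H100 SXM class)

for name, bytes_per_param in [("FP32", 4), ("FP16/BF16", 2), ("FP8", 1)]:
    gb = params * bytes_per_param / 1e9
    ms = params * bytes_per_param / hbm_bw * 1e3
    print(f"{name:9s} {gb:5.0f} GB of weights  ~{ms:4.1f} ms per full read")
```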
Oracle has invested about $40 billion (roughly 55 trillion won) in a data center dedicated to OpenAI under construction in Abilene, Texas, USA. The funds will be used to purchase 400,000 units of NVIDIA's...