As AI weather and climate prediction models rapidly gain adoption, the NVIDIA Earth-2 platform provides libraries and tools for accelerating solutions with a GPU-optimized software stack. Downscaling, the task of refining coarse-resolution (25 km scale) weather data, enables national meteorological service (NMS) agencies to deliver high-resolution predictions for agriculture, energy, transportation, and disaster preparedness at spatial resolutions fine enough for actionable decision-making and planning.
Traditional dynamical downscaling is prohibitively expensive, especially for large ensembles at high resolution and over extensive spatial domains. CorrDiff, a generative AI downscaling model, sidesteps the computational bottlenecks of traditional numerical methods, achieves state-of-the-art results, and uses a patch-based multidiffusion approach to scale to continental and global domains. This AI-based solution unlocks significant gains in efficiency and scalability compared with traditional numerical methods, while greatly reducing computational costs.
CorrDiff has gained global adoption for various use cases, demonstrating its versatility and impact across domains where fine-scale weather information is crucial:
- The Weather Company (TWC) for supporting the agriculture, energy, and aviation industries.
- G42 for improving smog and dust storm predictions in the Middle East.
- Tomorrow.io for enhancing a variety of storm-scale predictions, including fire weather forecasts and wind gust forecasts that disrupt railway operations.
In this blog post, we present the performance optimizations and enhancements for CorrDiff training and inference that were incorporated into two tools within the Earth-2 stack, NVIDIA PhysicsNeMo and NVIDIA Earth2Studio. Achieving over 50x speedup over the training and inference baselines, these optimizations enable:
- Scaling patch-based training to the entire planet in under 3,000 GPU-hours.
- Reducing most country-scale trainings to O(100) GPU-hours.
- Training over the contiguous United States (CONUS) in under 1,000 GPU-hours.
- Fine-tuning and bespoke training that democratize km-scale AI weather.
- Country-scale inference in GPU-seconds, planetary-scale inference in GPU-minutes.
- Generating large ensembles affordably for high-resolution probabilistic forecasting.
- Interactive exploration of kilometer-scale data.
CorrDiff: Training and inference


Figure 1 illustrates the training and sampling workflow of CorrDiff for generative downscaling. During diffusion training, a pretrained regression model generates the conditional mean, which serves as input for training the diffusion model. For background and details on CorrDiff, refer to the CorrDiff publication, PhysicsNeMo docs, and source code.
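To make the two-stage structure concrete, here is a minimal sketch of the regression-plus-diffusion composition; the function and argument names are placeholders rather than the actual PhysicsNeMo interfaces, and the coarse input is assumed to be already interpolated onto the output grid.

```python
import torch

def downscale(regression_net, diffusion_sampler, x_coarse):
    # Sketch of CorrDiff's two-stage generative downscaling; names are placeholders,
    # and x_coarse is the coarse (25 km) input interpolated onto the 2 km output grid.
    with torch.no_grad():
        # Stage 1: a pretrained regression UNet predicts the conditional mean of the
        # high-resolution fields given the coarse input.
        mean_hr = regression_net(x_coarse)

    # Stage 2: the diffusion model, conditioned on the coarse input and the conditional
    # mean, samples the residual fine-scale detail the deterministic mean cannot capture.
    residual_hr = diffusion_sampler(condition=torch.cat([x_coarse, mean_hr], dim=1))

    # The downscaled sample is the conditional mean plus the generated residual.
    return mean_hr + residual_hr
```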
Why optimize CorrDiff?
Diffusion models are resource-intensive because they depend on iterative sampling, with each denoising step involving multiple neural network evaluations. This makes inference time-consuming and expensive. Training can be even costlier, since the denoiser needs to be trained across the complete range of noise levels. Optimizing their performance requires:
- Streamlining core operations, such as fusing kernels, using mixed precision, and using NVIDIA CUDA graphs (a graph-capture sketch follows this list).
- Improving the sampling process by reducing the number of denoising steps and using optimal time integration schemes.
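As an illustration of the CUDA graph point above, the following is a minimal PyTorch sketch of capturing a denoising forward pass into a graph; the tiny convolution stands in for the CorrDiff denoiser so the example is self-contained, and is not the actual model.

```python
import torch

# A trivial stand-in for the CorrDiff denoiser so the sketch is runnable; the real
# model is a UNet conditioned on coarse inputs and a noise level.
denoiser = torch.nn.Conv2d(4, 4, kernel_size=3, padding=1).cuda().eval()
static_x = torch.randn(1, 4, 448, 448, device="cuda")

# Warm up on a side stream before capture, as required for CUDA graph capture.
s = torch.cuda.Stream()
s.wait_stream(torch.cuda.current_stream())
with torch.cuda.stream(s):
    for _ in range(3):
        _ = denoiser(static_x)
torch.cuda.current_stream().wait_stream(s)

# Capture one denoising step into a graph; replaying it launches all kernels at once,
# removing per-kernel CPU launch overhead inside the sampling loop.
g = torch.cuda.CUDAGraph()
with torch.cuda.graph(g):
    static_out = denoiser(static_x)

static_x.copy_(torch.randn_like(static_x))  # copy fresh data into the captured input buffer
g.replay()                                  # relaunch the captured kernels
result = static_out.clone()
```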
CorrDiff uses the EDM architecture, in which several computationally expensive operations, such as group normalization, activation functions, and convolutions, can be optimized using advanced libraries and kernel fusion.
CorrDiff also uses a two-stage pipeline (regression and correction), offering opportunities to amortize cost across multiple diffusion steps by caching regression outputs and minimizing redundant compute.
Accelerated CorrDiff
In the following figures, we summarize the various optimizations that lead to over a 50x speedup in both training and inference costs over the CONUS domain. Figures 2 and 3 summarize the cumulative speedup factors achieved over the baseline with each successive optimization. Details of each optimization are provided in subsequent sections.




Optimized CorrDiff: How it’s achieved
The baseline performance of CorrDiff on NVIDIA H100 GPUs with FP32 precision, batch size = 1, patch size = 1 (in absolute time) was as follows:
- Regression forward: 1204 ms
  - Domain: CONUS, 1056 × 1792 pixels
  - Input channels: ["u500", "v500", "z500", "t500", "u850", "v850", "z850", "t850", "u10m", "v10m", "t2m", "tcwv"] at 25 km resolution
  - Output channels: ["refc", "2t", "10u", "10v"] at 2 km resolution
- Diffusion forward: 155 ms
  - Domain: spatial patch of size 448 × 448 pixels
  - Input channels: ["u500", "v500", "z500", "t500", "u850", "v850", "z850", "t850", "u10m", "v10m", "t2m", "tcwv"] at 25 km resolution
  - Output channels: ["refc", "2t", "10u", "10v"] at 2 km resolution
- Diffusion backward: 219 ms
While effective, this baseline was limited by expensive regression model forward passes and inefficient data transposes.


Key CorrDiff training optimizations
To achieve substantial acceleration of CorrDiff training, culminating in a 53.86x speedup on NVIDIA B200 and 25.51x on H100, we introduced the series of performance optimizations outlined below.
Optimization 1: Enable AMP-BF16 for training
The original training recipe uses FP32 precision. We enabled Automatic Mixed Precision (AMP) with BF16 for training to reduce memory usage and improve throughput without compromising numerical stability, resulting in a 2.03x speedup over the baseline.
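For reference, a generic AMP-BF16 training step looks like the sketch below; model, loss_fn, optimizer, and dataloader are placeholders, not the exact PhysicsNeMo training loop.

```python
import torch

# BF16 keeps FP32's exponent range, so no GradScaler is required, unlike FP16 AMP.
for inputs, targets in dataloader:
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        loss = loss_fn(model(inputs), targets)  # forward pass runs in BF16 where safe
    loss.backward()   # gradients follow the autocast-selected precisions
    optimizer.step()  # master weights remain in FP32
```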
Optimization 2: Amortizing regression cost using multi-iteration patching
In the original patch-based training workflow, each 448×448 patch sample for diffusion model training required inference of the regression model over the complete 1056×1792 CONUS domain. This caused diffusion model training throughput to be bottlenecked by regression model inference.
We improved efficiency by caching regression outputs and reusing them across 16 diffusion patches per timestamp. This provided broader spatial coverage while spreading regression costs more effectively, yielding a 12.33× speedup over baseline.
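A minimal sketch of this amortization pattern follows; regression_net, diffusion_net, diffusion_loss, optimizer, and the dataloader are hypothetical placeholders, and the actual PhysicsNeMo loop differs in detail.

```python
import torch

PATCH = 448
PATCHES_PER_SAMPLE = 16  # diffusion patches drawn per cached regression output

for x_coarse, y_fine in dataloader:
    # One expensive full-domain regression forward, cached for reuse.
    with torch.no_grad():
        mean_full = regression_net(x_coarse)
    cond_full = torch.cat([x_coarse, mean_full], dim=1)

    # Amortize that cost over 16 diffusion patches from the same timestamp.
    for _ in range(PATCHES_PER_SAMPLE):
        i = torch.randint(0, y_fine.shape[-2] - PATCH + 1, (1,)).item()
        j = torch.randint(0, y_fine.shape[-1] - PATCH + 1, (1,)).item()
        loss = diffusion_loss(
            diffusion_net,
            target=y_fine[..., i:i + PATCH, j:j + PATCH],
            condition=cond_full[..., i:i + PATCH, j:j + PATCH],
        )
        optimizer.zero_grad(set_to_none=True)
        loss.backward()
        optimizer.step()
```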
Optimization 3: Eliminating data transposes with Apex GroupNorm
The training pipeline initially used the default NCHW memory layout, which triggers costly implicit memory transposes before and after convolutions. Switching the model and input tensors to NHWC (channels-last) format aligns them with cuDNN’s preferred layout. However, PyTorch GroupNorm ops don’t support the channels-last format. To avoid transposes and keep data in channels-last format for more efficient normalization kernels, we replaced PyTorch GroupNorm with NVIDIA Apex GroupNorm. This eliminated transpose overhead and yielded a 16.71× speedup over the baseline.
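A sketch of the layout change and the GroupNorm swap is shown below; it assumes the apex.contrib.group_norm.GroupNorm interface (Apex builds vary), and model/x are generic placeholders rather than the CorrDiff modules, which PhysicsNeMo handles via the use_apex_gn option.

```python
import torch
from apex.contrib.group_norm import GroupNorm as ApexGroupNorm  # import path is an assumption

model = model.to(memory_format=torch.channels_last)  # NHWC weights for cuDNN-preferred layout
x = x.to(memory_format=torch.channels_last)          # NHWC activations

# Swap torch.nn.GroupNorm modules for the Apex kernel so normalization no longer forces
# an implicit NHWC -> NCHW -> NHWC round trip around each convolution.
def swap_groupnorm(module):
    for name, child in module.named_children():
        if isinstance(child, torch.nn.GroupNorm):
            gn = ApexGroupNorm(child.num_groups, child.num_channels, eps=child.eps)
            gn.load_state_dict(child.state_dict())  # reuse the trained affine parameters
            setattr(module, name, gn)
        else:
            swap_groupnorm(child)

swap_groupnorm(model)
```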
Optimization 4: Fusing GroupNorm with SiLU
By fusing GroupNorm and the SiLU activation into a single kernel using Apex, we reduced kernel launches and the number of global memory accesses. This increased GPU utilization and delivered a 17.15× speedup over the baseline.
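A brief sketch of the fused form, assuming the act="silu" argument exposed by the Apex contrib GroupNorm in recent builds (an assumption, not verified against every Apex version):

```python
import torch
from apex.contrib.group_norm import GroupNorm as ApexGroupNorm  # import path is an assumption

# One fused kernel performs both the normalization and the SiLU activation, instead of a
# GroupNorm kernel followed by a separate elementwise SiLU kernel.
norm_act = ApexGroupNorm(num_groups=32, num_channels=256, act="silu").cuda().half()
x = torch.randn(1, 256, 56, 56, device="cuda", dtype=torch.float16)
x = x.to(memory_format=torch.channels_last)
y = norm_act(x)  # equivalent to silu(group_norm(x)), in a single kernel launch
```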
Optimization 5: Extended channel dimension support in Apex GroupNorm
Some CorrDiff layers use channel dimensions unsupported by Apex. We extended support for these channel dimensions, unlocking fusion for all layers. This improved performance to a 19.74× speedup over the baseline.
Optimization 6: Kernel fusion through graph compilation
We used torch.compile to fuse the remaining elementwise operations (e.g., addition, multiplication, sqrt, exp). This improved scheduling, reduced global memory accesses, and cut Python overhead, delivering a 25.51× speedup over the baseline.
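In its simplest form this is a one-line change, sketched below with `denoiser` as a placeholder for the CorrDiff UNet.

```python
import torch

# torch.compile lets TorchInductor fuse the remaining elementwise operations into larger
# generated kernels and removes per-op Python dispatch overhead.
denoiser = torch.compile(denoiser)

# Keep NVTX/profiling annotations out of the compiled hot path (profile_mode: False),
# since they introduce graph breaks and force sections back to eager execution.
```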
Optimization 7: Apex GroupNorm V2 on NVIDIA Blackwell
Using Apex GroupNorm V2, optimized for NVIDIA Blackwell GPUs, yielded a 53.86× speedup over the baseline on B200 and a 2.1× speedup over the H100-optimized workflow.


Training throughput
We compare the training throughput of the baseline CorrDiff on NVIDIA Hopper against optimized versions on Hopper and Blackwell in Table 1. The optimized implementation achieves efficiency improvements across both architectures, with Blackwell showing the most significant gains.
Note: Regression refers to the regression forward pass. Diffusion refers to the combined diffusion forward and backward passes. Total is the sum of the two columns (regression forward + diffusion forward + diffusion backward).
| GPU | Version | Precision | Regression (ms/patch) | Diffusion (ms/patch) | Total runtime (ms/patch) | Throughput (patches/s) |
| --- | --- | --- | --- | --- | --- | --- |
| H100 | Baseline | FP32 | 1204.0 | 374.0 | 1578.0 | 0.63 |
| H100 | Optimized | BF16 | 10.609 | 51.25 | 61.859 | 16.2 |
| B200 | Optimized | BF16 | 4.734 | 24.56 | 29.297 | 34.1 |
Speed-of-Light evaluation
To evaluate how close our optimized CorrDiff workflow comes to the hardware performance ceiling, we conducted a Speed-of-Light (SOL) evaluation on H100 and B200 GPUs. This provides an upper-bound estimate of achievable performance by assessing how effectively GPU resources are being used.
Steps to estimate SOL (a bookkeeping sketch follows the list):
- Identify low-utilization kernels: We focus on kernels with both DRAM read/write bandwidth < 60% and Tensor Core utilization < 60%. Such kernels are neither memory-bound nor compute-bound, making them likely performance bottlenecks.
- Estimate per-kernel potential: For each low-utilization kernel, we estimate the potential speedup under ideal conditions, namely full DRAM bandwidth usage or full Tensor Core activity.
- Aggregate overall speedup: We then compute the hypothetical end-to-end speedup if each kernel were optimized to its ideal performance.
- Compute SOL efficiency: Finally, we estimate the fraction of the theoretical maximum SOL as the fraction of peak performance achievable if the top 10 runtime-dominant kernels were individually boosted to their theoretical maximum.
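The sketch below shows the bookkeeping these steps imply. The kernel records would come from a profiler export (for example, Nsight Compute); the field names and values are purely illustrative, not real measurements.

```python
kernels = [  # illustrative values only
    {"time_ms": 12.0, "dram_util": 0.15, "tc_util": 0.10},  # low-utilization kernel
    {"time_ms": 8.0,  "dram_util": 0.85, "tc_util": 0.20},  # memory-bound kernel
    {"time_ms": 5.0,  "dram_util": 0.30, "tc_util": 0.75},  # compute-bound kernel
]

def ideal_time(k):
    # A kernel below 60% on both metrics is neither memory- nor compute-bound; at "speed
    # of light" its dominant resource would be fully utilized, so its runtime shrinks in
    # proportion to that utilization.
    if k["dram_util"] < 0.6 and k["tc_util"] < 0.6:
        return k["time_ms"] * max(k["dram_util"], k["tc_util"])
    return k["time_ms"]  # already close to a hardware roofline

measured = sum(k["time_ms"] for k in kernels)

# Idealize only the top 10 runtime-dominant kernels, per the last step above.
by_time = sorted(kernels, key=lambda k: k["time_ms"], reverse=True)
ideal = sum(ideal_time(k) for k in by_time[:10]) + sum(k["time_ms"] for k in by_time[10:])

print(f"Hypothetical end-to-end speedup: {measured / ideal:.2f}x")
print(f"Estimated fraction of SOL achieved: {ideal / measured:.1%}")
```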
Using this framework, our optimized CorrDiff workflow achieves 63% of the estimated SOL on H100 and 67% on B200. This indicates strong GPU utilization while still leaving headroom for future kernel-level improvements.
To further assess efficiency, we visualize kernel performance as shown in Figures 6 and 7. Each dot represents a kernel, plotted by NVIDIA Tensor Core utilization (x-axis) and combined DRAM read/write bandwidth utilization (y-axis). The dot size reflects the kernel’s share of total runtime, highlighting performance-critical operations.
Kernels near the top or right edges are generally well optimized, as they fully exploit memory or compute resources. In contrast, kernels in the bottom-left quadrant underutilize both and represent the best opportunities for further optimization. This visualization provides a clear picture of the runtime distribution and helps identify where GPU efficiency can be improved.


Figure 6 shows the distribution of kernels in terms of Tensor Core utilization and DRAM bandwidth utilization for the baseline CorrDiff implementation. In the unoptimized workflow with FP32 precision, more than 95% of the time is spent in low-utilization kernels where both DRAM utilization (read + write) and Tensor Core utilization are below 60%.
The majority of runtime-dominant kernels cluster near the origin, showing very low DRAM and Tensor Core utilization. Only a small number of kernels lie near the upper or right boundaries, where kernels become clearly memory-bound or compute-bound. The unoptimized CONUS CorrDiff workflow reached only 1.23% of SOL on B200.


Figure 7 shows the distribution of kernels in the optimized implementation in terms of Tensor Core utilization and DRAM bandwidth utilization. In the optimized workflow with AMP-BF16 training, a higher proportion of kernels lie near the top-left or bottom-right edges, indicating good performance and GPU utilization. Optimized CorrDiff now reaches 67% of SOL on B200. Despite the overall improvements in the optimized workflow, some kernels still have the potential to be accelerated further.
CorrDiff inference optimizations
Many of the training optimizations can also be applied to inference. We introduced several more inference-specific optimizations to maximize performance.
Optimized multi-diffusion
CorrDiff uses a patch-based multi-diffusion approach, where overlapping spatial patches are denoised and then aggregated. Initially, 27.1% of the total runtime was spent in im2col folding/unfolding operations. Precomputing overlap counts for each pixel and using torch.compile() to speed up the remaining folding/unfolding steps eliminates the im2col bottleneck entirely, resulting in a 7.86x speedup.
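A sketch of the overlap-count idea follows, with illustrative CONUS-like dimensions; the stride value, the `blend` helper, and the `denoised_patches` iterator are hypothetical, not the PhysicsNeMo implementation.

```python
import torch

H, W, PATCH, STRIDE = 1056, 1792, 448, 392  # illustrative dimensions and stride

def starts(full, patch, stride):
    # Patch start offsets along one axis, making sure the domain edge is covered.
    s = list(range(0, full - patch + 1, stride))
    if s[-1] != full - patch:
        s.append(full - patch)
    return s

# Precompute, once, how many overlapping patches cover each output pixel.
overlap = torch.zeros(1, 1, H, W)
for i in starts(H, PATCH, STRIDE):
    for j in starts(W, PATCH, STRIDE):
        overlap[..., i:i + PATCH, j:j + PATCH] += 1

@torch.compile  # fuse the remaining elementwise aggregation work
def blend(canvas, overlap):
    return canvas / overlap.clamp(min=1)

def aggregate(denoised_patches, overlap):
    # denoised_patches: hypothetical iterable of ((i, j), patch_tensor) results.
    canvas = torch.zeros(1, 4, H, W)
    for (i, j), patch_out in denoised_patches:
        canvas[..., i:i + PATCH, j:j + PATCH] += patch_out
    return blend(canvas, overlap)
```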
Deterministic Euler sampler (12 steps)
The original stochastic sampler used 18 denoising steps with the Heun solver and second-order correction. By switching to a deterministic sampler using the Euler solver (with no second-order correction), we reduced the number of denoising steps to 12 without impacting output quality. This change delivered an additional ~2.8× speedup on both Hopper and Blackwell. The overall speedup with the 12-step deterministic sampler is 21.94x on H100 and 54.87x on B200.
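For orientation, a minimal deterministic EDM-style Euler sampler looks like the sketch below; denoiser and cond are placeholders, and the schedule constants follow the standard EDM defaults rather than CorrDiff's exact configuration.

```python
import torch

@torch.no_grad()
def euler_sample(denoiser, cond, steps=12, sigma_max=80.0, sigma_min=0.002, rho=7.0):
    # Deterministic sampler: no stochastic churn and no Heun (second-order) correction.
    t = torch.linspace(0, 1, steps, device=cond.device)
    sigmas = (sigma_max ** (1 / rho)
              + t * (sigma_min ** (1 / rho) - sigma_max ** (1 / rho))) ** rho
    sigmas = torch.cat([sigmas, torch.zeros(1, device=cond.device)])

    # Start from pure noise at sigma_max; 4 output channels (refc, 2t, 10u, 10v).
    x = torch.randn(cond.shape[0], 4, *cond.shape[-2:], device=cond.device) * sigmas[0]
    for i in range(steps):
        denoised = denoiser(x, sigmas[i], cond)  # D(x; sigma) from the EDM preconditioning
        d = (x - denoised) / sigmas[i]           # probability-flow ODE derivative
        x = x + (sigmas[i + 1] - sigmas[i]) * d  # single Euler step per noise level
    return x
```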
Several of the optimizations described in this blog post also apply to diffusion models in general, and some are specific to patch-based approaches. As such, they can be ported to other models in PhysicsNeMo and used in the development of solutions beyond weather downscaling.
Getting began
Train/inference CorrDiff in PhysicsNeMo: PhysicsNeMo CorrDiff documentation
- To train with the optimized codebase, follow the instructions in the CorrDiff repo readme, and set the following options in the training.perf section of your chosen training YAML config:
fp_optimizations: amp-bf16
use_apex_gn: True
torch_compile: True
profile_mode: False
- To run inference with the optimized codebase, follow the instructions in the CorrDiff repo readme, and set the following options in the generation.perf section of your chosen generation config:
use_fp16: True
use_apex_gn: True
use_torch_compile: True
profile_mode: False
io_syncronous: True
- Set profile_mode to False for optimized performance, because the NVTX annotations would introduce graph breaks in the torch.compile workflow.
- To use the latest Apex GroupNorm kernels, either build Apex GroupNorm in the PhysicsNeMo container Dockerfile or build it locally after loading the PhysicsNeMo container.
- Clone the Apex repo and build using:
CFLAGS="-g0" NVCC_APPEND_FLAGS="--threads 8" pip install --no-build-isolation --no-cache-dir --disable-pip-version-check --config-settings "--build-option=--group_norm" .
Learn more about optimized CorrDiff training in PhysicsNeMo and run optimized workflows in Earth2Studio.
