Scientists from NVIDIA, in collaboration with Lawrence Berkeley National Laboratory (Berkeley Lab), released a machine learning tool called Huge Ensembles (HENS) for extreme-weather prediction that delivers supercomputer-class forecasting at significantly lower computational power and cost. Available as open source code or as a ready-to-run model, it forecasts low-likelihood, high-impact events, from prolonged heat waves to 100-year hurricanes. The technology could help climate scientists, city officials, and emergency managers quickly test scenarios and update response plans with minimal computing resources.
The two-part study, published in the journal Geoscientific Model Development, introduces HENS, a technique used to produce 27,000 years of simulated data. The result is one of the largest and most reliable ensembles of weather and climate simulations available.
Using NVIDIA PhysicsNeMo, an open source Python framework for building, training, and fine-tuning physics AI models at scale, and the open source Makani framework, the researchers trained global weather models to refine the HENS methodology.
“Twenty-seven thousand years of simulations is a goldmine for studying the statistics and drivers of extreme weather events,” said Ankur Mahesh, co-author on the study and a graduate student researcher in Berkeley Lab’s Earth and Environmental Sciences Area. “This massive sample size is actually at a scale that has not been seen before.”
According to the study, HENS can predict weather faster than other methods, taking minutes instead of hours. It also extends the forecast window, predicting extreme weather events from six hours to 14 days into the future at a resolution of 15 miles (25 kilometers). It can also help researchers study weather patterns at high resolution over many decades to discover new clues leading up to an extreme event.
“With HENS, we now have the luxury of going after low-likelihood, high-impact extreme events predicted over years and decades instead of single near-term events,” said senior co-author Bill Collins, a faculty senior scientist in Berkeley Lab’s Earth and Environmental Sciences Area and a professor at UC Berkeley.
This new approach also requires far less energy and far fewer person-hours than other methods, Collins added. It saves further energy because models can be retrained on new data, a step that helps ensure accuracy, more quickly than with other methods.
Training HENS: PhysicsNeMo and 40 years of climate data
HENS employs an AI model trained using PhysicsNeMo on 40 years of ERA5 data, one of the best sources of historical atmospheric state data. Once trained, the model offers a far cheaper computational approach for forecasts, said Shashank Subramanian, a machine learning engineer at the Department of Energy’s National Energy Research Scientific Computing Center (NERSC) at Berkeley Lab and study co-author who helped Mahesh develop and test the training and evaluation workflows.
“HENS is a game changer. Until today, generating 1,000- or 10,000-member ensembles of simulations was simply impractical due to prohibitive compute and data storage costs,” said co-author Michael Pritchard, director of climate simulation research at NVIDIA and a professor at UC Irvine. “Due to this team’s careful work calibrating novel AI simulation technology, it’s now fit for purpose to generate massive ensembles including realistic heat wave counterfactuals at orders-of-magnitude faster completion than traditional numerical simulations.”
How can you improve weather prediction accuracy using HENS?
To capture the range of possible future weather outcomes, national weather services run multiple different simulations, or “ensemble members,” each with small changes to the initial conditions. These numerical models are based on laws of physics such as the conservation of mass, the conservation of momentum, and the conservation of energy. There is a lot of trust in these physics-based simulations, but they are also very computationally expensive because they require a supercomputer.
Because of this expense, traditional weather models can have only 50 ensemble members. To find extreme weather, the initial conditions of a model have to be perturbed hundreds of times, which requires many supercomputing hours.
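The idea of an ensemble built from perturbed initial conditions can be illustrated with a toy model. The sketch below is not the HENS code and uses no weather data: it assumes the classic Lorenz-63 system as a stand-in for a forecast model, and the function names, noise level, step count, and member count are all hypothetical choices made for illustration.

```python
import random


def lorenz_step(state, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Advance the Lorenz-63 toy system by one forward-Euler step."""
    x, y, z = state
    dx = sigma * (y - x)
    dy = x * (rho - z) - y
    dz = x * y - beta * z
    return (x + dt * dx, y + dt * dy, z + dt * dz)


def run_member(initial_state, n_steps):
    """Integrate one ensemble member forward from its initial state."""
    state = initial_state
    for _ in range(n_steps):
        state = lorenz_step(state)
    return state


def run_ensemble(base_state, n_members, n_steps, noise=1e-3, seed=0):
    """Run members that differ only by small random perturbations
    to the shared initial state (illustrative noise level)."""
    rng = random.Random(seed)
    finals = []
    for _ in range(n_members):
        perturbed = tuple(v + rng.gauss(0.0, noise) for v in base_state)
        finals.append(run_member(perturbed, n_steps))
    return finals


# 50 members, matching the ensemble size quoted for traditional models.
finals = run_ensemble((1.0, 1.0, 1.0), n_members=50, n_steps=2000)
xs = [f[0] for f in finals]
print(f"members: {len(xs)}, spread in x: {min(xs):.2f} to {max(xs):.2f}")
```

Even with perturbations this small, the members drift apart over time, and the spread of final states is what an ensemble uses to represent the range of possible outcomes. Each member requires its own full model run, which is why ensemble size is tied so directly to compute cost.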
The researchers used HENS to create 7,424 ensemble members based on initial weather conditions from each day of summer 2023, the hottest on record at the time. That is nearly 150x more members than is possible with conventional models. Each ensemble member represents an alternate weather trajectory, or a different way the weather could have unfolded.
“This allowed us to get a better estimate of the tail of the distribution and to understand extreme events that could have occurred that summer,” Mahesh said.
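A rough back-of-the-envelope calculation shows why ensemble size matters for the tail. The sketch below assumes, purely for illustration, that members are statistically independent and that the event of interest has a 1-in-1,000 chance of appearing in any single member. Neither assumption comes from the study, and real ensemble members that share initial conditions are correlated.

```python
def prob_at_least_one(p_event, n_members):
    """Chance that at least one of n independent members contains an
    event that occurs with probability p_event in a single member."""
    return 1.0 - (1.0 - p_event) ** n_members


# Ensemble sizes quoted in the article.
traditional_members = 50
hens_members = 7424

# Ratio behind the "nearly 150x" figure.
ratio = hens_members / traditional_members

# Hypothetical rare event: 1-in-1,000 chance per member (assumption).
p_rare = 0.001
p_traditional = prob_at_least_one(p_rare, traditional_members)
p_hens = prob_at_least_one(p_rare, hens_members)

print(f"size ratio: {ratio:.2f}")
print(f"P(at least one rare event), 50 members:    {p_traditional:.3f}")
print(f"P(at least one rare event), 7,424 members: {p_hens:.4f}")
```

Under these assumptions, a 50-member ensemble would usually contain no example of such an event at all, while a 7,424-member ensemble would almost always contain several, enough to start estimating how often and how severely it occurs.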
The predictions made by HENS have uncertainties that are over 10 times smaller than those from traditional models. It is able to catch 96% of rare but severe extreme weather events that other models usually miss. Together, these strengths have allowed the team to create a vast dataset, about 27,000 years’ worth of climate data (20 petabytes).
During rigorous validation experiments at NERSC, Mahesh and team evaluated the ensemble predictions on a wide range of diagnostic metrics, showing that HENS is very close to the gold standard.
What’s next?
In future work, Mahesh said, the team plans to study the 27,000-year simulations with the hope of uncovering new insight into the drivers behind low-likelihood, high-impact events, such as catastrophic heat waves, hurricanes, and atmospheric rivers, which have devastated communities in recent years. They also aim to further reduce the computational requirements for running HENS.
NERSC is a DOE Office of Science user facility at Berkeley Lab. This work was supported by the DOE Office of Science.
