From a user perspective, some video game enthusiasts have built their own PCs equipped with high-performance GPUs such as the NVIDIA GeForce RTX 4090. Interestingly, this GPU is capable of handling small-scale deep-learning tasks. The RTX 4090 draws up to 450 W, with a recommended total power supply of 850 W (typically you don't need that much, since the system will not always run under full load). If your task runs continuously for a week, that translates to 0.85 kW × 24 hours × 7 days = 142.8 kWh per week. In California, PG&E charges as high as 50 cents per kWh for residential customers, meaning you'd spend around $70 per week on electricity. Moreover, you'll need a CPU and other components to work alongside your GPU, which will further increase the electricity consumption. This means the overall electricity cost could be even higher.
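If you want to sanity-check this arithmetic yourself, here is a minimal Python sketch of the same back-of-the-envelope calculation. The 0.85 kW draw and $0.50/kWh rate are the assumptions stated above, not measured values:

```python
# Rough weekly electricity estimate for running an RTX 4090 rig nonstop.
# Assumed inputs (from the text): 850 W recommended power supply at full
# load and a high-end PG&E residential rate of $0.50/kWh.

PSU_DRAW_KW = 0.85          # assumed full-system draw in kW
HOURS_PER_WEEK = 24 * 7
RATE_USD_PER_KWH = 0.50     # assumed residential rate

energy_kwh = PSU_DRAW_KW * HOURS_PER_WEEK          # 142.8 kWh
cost_usd = energy_kwh * RATE_USD_PER_KWH           # ~$71

print(f"Energy: {energy_kwh:.1f} kWh/week, cost: ${cost_usd:.2f}/week")
```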
Now, suppose your AI business is about to take off. According to the manufacturer, an H100 Tensor Core GPU has a maximum thermal design power (TDP) of around 700 W, depending on the specific version; this is roughly the heat the cooling system must dissipate when the GPU runs under a full working load. A reliable power supply unit for this high-performance deep-learning tool is typically around 1,600 W. If you use the NVIDIA DGX platform for your deep-learning tasks, a single DGX H100 system, equipped with 8 H100 GPUs, consumes roughly 10.2 kW. For even greater performance, an NVIDIA DGX SuperPOD can include anywhere from 24 to 128 DGX nodes. With 64 nodes, the system could conservatively consume about 652.8 kW. While your startup might aspire to buy this multi-million-dollar equipment, the costs for both the cluster and the required facilities would be substantial. Usually, it makes more sense to rent GPU clusters from cloud computing providers. Focusing on energy costs, commercial and industrial users typically benefit from lower electricity rates. If your average rate is around 20 cents per kWh, running 64 DGX nodes at 652.8 kW for 24 hours a day, 7 days a week, works out to 109.7 MWh per week, which would cost you roughly $21,934.
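The same kind of estimate is easy to script for the cluster case. This sketch uses the per-node draw and the commercial rate assumed above:

```python
# Weekly energy and cost estimate for a 64-node DGX H100 SuperPOD,
# using the per-node draw and commercial electricity rate assumed above.

KW_PER_DGX_NODE = 10.2      # approximate draw of one DGX H100 (8x H100)
NUM_NODES = 64
RATE_USD_PER_KWH = 0.20     # assumed commercial/industrial rate

cluster_kw = KW_PER_DGX_NODE * NUM_NODES           # 652.8 kW
energy_mwh_per_week = cluster_kw * 24 * 7 / 1000   # ~109.7 MWh
cost_usd_per_week = energy_mwh_per_week * 1000 * RATE_USD_PER_KWH

print(f"{energy_mwh_per_week:.1f} MWh/week, ~${cost_usd_per_week:,.0f}/week")
```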
Based on rough estimates, a typical family in California uses around 150 kWh of electricity per week. Interestingly, that is roughly the same amount of energy (and cost) you'd incur if you were to run a model-training task at home on a high-performance GPU like the RTX 4090.
From these numbers, we can observe that running a SuperPOD with 64 nodes could consume as much energy in a week as a small community: about 109.7 MWh per week, roughly the usage of 730 typical California households.
Training AI models
Now, let's dive into some numbers related to modern AI models. OpenAI has never disclosed the exact number of GPUs used to train ChatGPT, but a rough estimate suggests it could involve thousands of GPUs running continuously for several weeks to months, depending on the release date of each ChatGPT model. The power consumption for such a task would easily be on the megawatt scale, leading to energy consumption on the order of thousands of MWh.
Recently, Meta released LLaMA 3.1, described as their "most capable model to date." According to Meta, this is their largest model yet, trained on over 16,000 H100 GPUs, the first LLaMA model trained at this scale.
Let's break down the numbers: LLaMA 2 was released in July 2023, so it's reasonable to assume that LLaMA 3.1 took at least a year to train. While it's unlikely that all GPUs were running 24/7, we can estimate the energy consumption assuming a 50% utilization rate:
1.6 kW × 16,000 GPUs × 24 hours/day × 365 days/year × 50% ≈ 112,128 MWh
At an estimated cost of $0.20 per kWh, this translates to around $22.4 million in energy costs. This figure only accounts for the GPUs, excluding additional energy consumption related to data storage, networking, and other infrastructure.
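For completeness, here is the same training-energy estimate as a short script; every input is one of the assumptions stated above (1.6 kW per GPU at the power-supply level, one year of wall-clock time, 50% utilization, $0.20/kWh), not a figure disclosed by Meta:

```python
# Training-energy estimate for LLaMA 3.1 under the stated assumptions.

KW_PER_GPU = 1.6            # assumed per-GPU draw (power-supply level)
NUM_GPUS = 16_000
HOURS = 24 * 365            # assumed one year of wall-clock training
UTILIZATION = 0.5           # assumed average utilization
RATE_USD_PER_KWH = 0.20

energy_kwh = KW_PER_GPU * NUM_GPUS * HOURS * UTILIZATION
cost_usd = energy_kwh * RATE_USD_PER_KWH

print(f"Energy: {energy_kwh / 1000:,.0f} MWh")       # ~112,128 MWh
print(f"Cost:   ${cost_usd / 1e6:,.1f} million")     # ~$22.4 million
```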
Training modern large language models (LLMs) requires power on a megawatt scale and represents a multi-million-dollar investment. This is why modern AI development often excludes smaller players.
Operating AI models
Running AI models also incurs significant energy costs, as each query and response requires computational power. Although the energy cost per interaction is small compared with training the model, the cumulative impact can be substantial, especially if your AI business achieves large-scale success with billions of users interacting with your advanced LLM daily. Many insightful articles discuss this issue, including comparisons of energy costs among companies operating chatbots. The conclusion is that, since each query can cost from 0.002 to 0.004 kWh, popular companies currently spend hundreds to thousands of MWh per year. And this number is still increasing.
Imagine for a moment that one billion people use a chatbot frequently, averaging around 100 queries per day. The energy cost of this usage can be estimated as follows:
0.002 kWh × 100 queries/day × 1e9 people × 365 days/year ≈ 7.3e7 MWh/year
This would require an 8,000 MW power supply and could result in an energy cost of roughly $14.6 billion annually, assuming an electricity rate of $0.20 per kWh.
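A short script makes it easy to vary these assumptions (per-query energy, user count, query volume) and see how the totals move:

```python
# Inference-energy estimate: 1 billion users x 100 queries/day at an
# assumed 0.002 kWh per query and $0.20/kWh.

KWH_PER_QUERY = 0.002       # assumed low end of the per-query range
QUERIES_PER_USER_PER_DAY = 100
USERS = 1e9
RATE_USD_PER_KWH = 0.20

kwh_per_year = KWH_PER_QUERY * QUERIES_PER_USER_PER_DAY * USERS * 365
avg_power_mw = kwh_per_year / (365 * 24) / 1000     # average draw in MW
cost_usd = kwh_per_year * RATE_USD_PER_KWH

print(f"Energy: {kwh_per_year / 1e6:,.0f} GWh/year")  # ~73,000 GWh = 7.3e7 MWh
print(f"Average power: {avg_power_mw:,.0f} MW")       # ~8,300 MW
print(f"Cost: ${cost_usd / 1e9:,.1f} billion/year")   # ~$14.6 billion
```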
The largest power plant in the U.S. is the Grand Coulee Dam in Washington State, with a capacity of 6,809 MW. The largest solar farm in the U.S. is Solar Star in California, with a capacity of 579 MW. In this context, no single power plant is capable of supplying all the electricity required for a large-scale AI service. This becomes evident when considering the annual electricity generation statistics provided by the EIA (Energy Information Administration).
The 73 billion kWh calculated above would account for about 1.8% of the total electricity generated annually in the U.S. However, it's reasonable to believe that the real figure could be much higher. According to some media reports, when all energy consumption related to AI and data processing is considered, the impact could be around 4% of total U.S. electricity generation.
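To reproduce that percentage, the sketch below assumes total annual U.S. generation of roughly 4 trillion kWh, an approximation broadly in line with recent EIA statistics:

```python
# Share of annual U.S. electricity generation consumed by the hypothetical
# chatbot service above. The total-generation figure is an approximation.

CHATBOT_KWH_PER_YEAR = 7.3e10     # 73 billion kWh from the estimate above
US_GENERATION_KWH = 4.0e12        # assumed ~4 trillion kWh generated per year

share = CHATBOT_KWH_PER_YEAR / US_GENERATION_KWH
print(f"Share of U.S. generation: {share:.1%}")   # ~1.8%
```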
However, this is only today's energy usage.
Currently, chatbots primarily generate text-based responses, but they are increasingly capable of producing two-dimensional images, "three-dimensional" videos, and other forms of media. The next generation of AI will extend far beyond simple chatbots: it may provide high-resolution images for spherical screens (e.g., the Las Vegas Sphere), 3D modeling, and interactive robots capable of performing complex tasks and executing complex logistical operations. As a result, the energy demands for both model training and deployment are expected to increase dramatically, far exceeding current levels. Whether our existing power infrastructure can support such advancements remains an open question.
On the sustainability front, the carbon emissions from industries with high energy demands are significant. One approach to mitigating this impact is to use renewable energy sources to power energy-intensive facilities such as data centers and computational hubs. A notable example is the collaboration between Fervo Energy and Google, in which geothermal power is being used to supply energy to a data center. However, the scale of these initiatives remains small compared with the overall energy needs anticipated in the coming AI era. There is still much work to be done to address the sustainability challenges in this context.
Please correct me if you find any of these numbers unreasonable.