What Is a GPU, and Why Does AI Run on One?

This is general information, not investment advice.

The hardware powering ChatGPT and nearly every major AI model started life rendering polygons in video games. Here's how that happened.

A chip born for pixels

A GPU (graphics processing unit) is a chip designed to do the math that draws images: color, light and texture for millions of pixels at once. To do that, GPU makers packed thousands of small calculating cores onto one chip, all working simultaneously. Nvidia shipped the GeForce 256, which it marketed as the world's first GPU, in 1999 to make games run more smoothly. Two decades later, that same design turned out to be exactly what AI needed.

CPU vs. GPU

Its counterpart is the CPU (central processing unit), the chip at the heart of every laptop. A CPU has a few very powerful cores optimized for sequential work: running an operating system, a browser, anything with complex, branching logic. A GPU trades that per-core power for throughput: thousands of weaker cores doing many identical calculations in parallel. When a problem splits into thousands of identical sub-tasks, a GPU finishes in a fraction of a CPU's time. Nvidia's data-center H100 chip has up to 16,896 such cores, and benchmarks cited by infrastructure providers such as Civo show speedups of 40x or more over CPUs on the math AI relies on.
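To make the trade-off concrete, here is a back-of-the-envelope sketch in Python. It is an idealized model, not a benchmark: the function name, the CPU core count and both per-task timings are made-up assumptions for illustration, and only the 16,896 figure comes from the text above.

```python
import math


def ideal_time(tasks: int, cores: int, seconds_per_task: float) -> float:
    """Idealized wall-clock time for identical, independent tasks.

    Assumes perfect load balancing and no memory or scheduling overhead,
    which real chips never achieve.
    """
    rounds = math.ceil(tasks / cores)  # each core handles one task per round
    return rounds * seconds_per_task


# All figures are hypothetical except the H100 core count quoted in the article.
CPU_CORES, CPU_SECONDS_PER_TASK = 16, 1e-6
GPU_CORES, GPU_SECONDS_PER_TASK = 16_896, 2e-5  # each GPU core assumed 20x slower

# A million identical sub-tasks: the GPU's core count wins.
parallel_speedup = (
    ideal_time(1_000_000, CPU_CORES, CPU_SECONDS_PER_TASK)
    / ideal_time(1_000_000, GPU_CORES, GPU_SECONDS_PER_TASK)
)

# One task that cannot be split: the faster CPU core wins.
single_task_cpu = ideal_time(1, CPU_CORES, CPU_SECONDS_PER_TASK)
single_task_gpu = ideal_time(1, GPU_CORES, GPU_SECONDS_PER_TASK)

print(f"parallel speedup: about {parallel_speedup:.0f}x")
print(f"single task: CPU {single_task_cpu:.6f}s vs GPU {single_task_gpu:.6f}s")
```

Under these invented inputs the GPU comes out roughly 52 times faster on the parallel job and 20 times slower on the lone task. Real-world ratios depend on memory speed, the workload and the software, so treat the output as a picture of the shape of the trade-off rather than a measurement.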

Why AI is, underneath, matrix math

Modern AI (the neural networks behind chatbots and image generators) reduces largely to one operation done over and over: matrix multiplication, or multiplying big grids of numbers. Training a model runs this forward and backward across billions of examples to tune billions of "weights." That work is highly parallel: each number in the output grid can be computed independently of the others, so thousands of them can be computed at the same time. That maps neatly onto a GPU's thousands of cores, and that architectural fit is why GPUs, not CPUs, became the workhorse of AI.
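A tiny sketch of matrix multiplication in plain Python shows why the work splits so cleanly. The helper names and the 2x2 numbers are invented for illustration; real frameworks run the same arithmetic on the GPU with far larger grids.

```python
def output_cell(a, b, i, j):
    """One number in the result: row i of a times column j of b, summed."""
    return sum(a[i][k] * b[k][j] for k in range(len(b)))


def matmul(a, b):
    """Multiply two grids of numbers given as lists of rows."""
    rows, cols = len(a), len(b[0])
    return [[output_cell(a, b, i, j) for j in range(cols)] for i in range(rows)]


# Hypothetical toy values: two inputs, a 2x2 grid of "weights".
inputs = [[1.0, 2.0], [3.0, 4.0]]
weights = [[0.5, -1.0], [0.25, 2.0]]

outputs = matmul(inputs, weights)
print(outputs)  # [[1.0, 3.0], [2.5, 5.0]]

# No cell needs any other cell's result, so the order does not matter.
# Here the cells are computed back to front and still match.
cells = [(i, j) for i in range(2) for j in range(2)]
scrambled = {cell: output_cell(inputs, weights, *cell) for cell in reversed(cells)}
print(all(scrambled[(i, j)] == outputs[i][j] for i, j in cells))  # True
```

Because each call to output_cell reads only one row and one column, a chip with enough cores could hand every cell to a different core and compute them all at once. That independence is the property the paragraph above describes.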

CUDA: the real moat

Nvidia's hardware lead is real, but its deeper edge is software. In 2007 it released CUDA, a toolkit that let developers program its GPUs in familiar languages such as C. Over nearly two decades, the entire AI software stack, including the PyTorch and TensorFlow frameworks researchers use daily, grew up on CUDA. Switching to a rival's chip means revisiting thousands of small engineering decisions, so the lock-in compounds quietly, analysts note. It's why competitors with comparable hardware still struggle to win share.

Nvidia's grip — and the price

The result is dominance: estimates put Nvidia's share of the AI-accelerator market at 80% or more. A single top AI GPU (the H100) sells for roughly $25,000–$40,000, and AI data centers pack tens of thousands of them, which is what makes the AI build-out so capital-intensive. Microsoft, Google, Amazon and Meta together guided to more than $300 billion of 2025 capex, spending that is now under market scrutiny in the AI selloff.
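The arithmetic behind "capital-intensive" is simple enough to sketch. The price range below is the one quoted above; the 20,000-GPU cluster size is a hypothetical stand-in for "tens of thousands," and the function name is ours.

```python
H100_PRICE_LOW = 25_000   # dollars, low end of the range quoted in the article
H100_PRICE_HIGH = 40_000  # dollars, high end of the range quoted in the article
GPU_COUNT = 20_000        # hypothetical cluster size, for illustration only


def gpu_bill(count: int, unit_price: int) -> int:
    """Cost of the GPUs alone, excluding servers, networking, buildings and power."""
    return count * unit_price


low = gpu_bill(GPU_COUNT, H100_PRICE_LOW)
high = gpu_bill(GPU_COUNT, H100_PRICE_HIGH)
print(f"${low / 1e6:,.0f} million to ${high / 1e6:,.0f} million")
```

On those assumptions the GPUs for a single cluster cost $500 million to $800 million before anything else is bought, which gives a sense of how the big spenders' budgets add up.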

The supply chain — and the competition

Nvidia designs GPUs but doesn't manufacture them. That job falls to TSMC in Taiwan, which ties GPUs to the chip-concentration risk we've covered. The chips also need high-bandwidth memory from SK Hynix, Samsung and Micron, which we've tracked through the memory squeeze. Rivals are pushing hard: AMD with its Instinct line, and the cloud giants with custom chips (Google's TPUs, Amazon's Trainium, Microsoft's Maia) designed to be cheaper and more efficient for specific tasks. Custom chips are growing fast, but none has yet dented Nvidia's lead in general-purpose AI training, where CUDA still rules.

The power bill

GPUs are power-hungry. At scale, an AI data center can draw hundreds of megawatts — enough for a small city — and global data-center electricity use rose about 17% in 2025, per the IEA, driven largely by AI. For many new sites, the binding constraint isn't buying GPUs but getting a grid connection — the same tension behind our energy-and-data-center coverage.
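A rough sketch shows how a site's draw climbs toward that scale. None of these inputs come from the article: 700 watts is the maximum board power Nvidia lists for the H100 SXM, while the GPU count and the overhead factor (for CPUs, networking, cooling and power losses) are hypothetical values chosen for illustration.

```python
H100_MAX_WATTS = 700     # H100 SXM maximum board power per Nvidia's spec sheet
GPU_COUNT = 100_000      # hypothetical site size
OVERHEAD_FACTOR = 1.5    # hypothetical multiplier for everything besides the GPUs


def site_megawatts(gpu_count: int, watts_per_gpu: float, overhead_factor: float) -> float:
    """Estimated site draw in megawatts with every GPU at full power."""
    return gpu_count * watts_per_gpu * overhead_factor / 1_000_000


estimate = site_megawatts(GPU_COUNT, H100_MAX_WATTS, OVERHEAD_FACTOR)
print(f"about {estimate:.0f} MW")
```

With these assumed inputs the estimate is about 105 megawatts, and it scales linearly: double the GPU count and the draw doubles. Actual sites vary with chip generation, utilization and cooling design.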

What it means

A chip designed to shade polygons turned out, by accident of architecture, to be the right tool for the matrix math at the heart of machine intelligence. Nvidia's head start in silicon — and especially software — made it the default infrastructure of the AI economy. Rivals and the cloud giants are spending to break that grip, but for now nearly every major AI model in the world runs on a chip descended from a late-1990s graphics card.