Jensen Huang: AGI Is Here and Intelligence Is a Commodity

Key insights
- Jensen calls AGI already achieved, but qualifies it sharply: a single AI agent could build a billion-dollar company today, but 100,000 agents building NVIDIA has zero percent odds. The debate is about definition, not capability.
- The warehouse-to-factory shift is Jensen's core argument for why the computing industry will be orders of magnitude larger. Computers no longer store value, they generate it. Every token produced is a unit of revenue.
- CUDA's real moat is not the technology itself but 20 years of install base, velocity, and trust. Building a technically better alternative would not be enough to displace it.
- Jensen separates intelligence from humanity. He calls intelligence a commodity already, and argues that character, determination, and compassion are the human qualities AI cannot replicate and should be the words we elevate.
This is an AI-generated summary. The source video may include demos, visuals and additional context.
In Brief
In a 2.5-hour conversation on the Lex Fridman Podcast, Jensen Huang, co-founder and Chief Executive Officer (CEO) of NVIDIA, covered the full arc of how the world's most valuable company thinks about AI, compute, and the future of human work. He declared that AGI (Artificial General Intelligence, the point where AI can perform any intellectual task a human can) has already been achieved by one reasonable definition. He outlined four scaling laws that make compute the single bottleneck for intelligence. He explained why he now thinks of NVIDIA not as a chip company but as a factory that generates tokens, and he made a sharp distinction between intelligence and humanity, calling intelligence a commodity and arguing that human character is the real superpower.
AGI: already here, with a catch
The conversation started with a deceptively simple question about timelines. Lex Fridman asked when an AI system could start, grow, and run a successful technology company worth a billion dollars. Jensen answered without hesitation: "It's now. We've achieved AGI." No caveats, no timeline ranges.
But Jensen immediately put a hard limit on that claim. He described a scenario where an AI agent creates a web service that a few billion people use briefly, makes money, and then disappears. Several companies during the internet boom followed exactly that arc, he noted, and most of those websites were not more sophisticated than what an AI agent could build today. So a single agent achieving a billion-dollar moment? Possible. Probable, even, for some unknown application.
But 100,000 agents collectively building NVIDIA? "Zero percent." Building a company like NVIDIA requires not just intelligence but 34 years of accumulated decisions, trusted relationships, organizational culture, and the kind of judgment that only comes from surviving repeated crises. Jensen is the longest-running tech CEO in the world, and he was not being modest: he was drawing a real line between what AI can and cannot do.
The lesson is that the AGI debate is mostly about definitions. If AGI means completing a defined task brilliantly, it has arrived. If it means replacing complex, long-term, context-dependent human institutions, we are still far from it.
Four scaling laws and one conclusion
Jensen has been one of the most consistent voices arguing that AI will keep getting better as long as you keep throwing compute at it. At a time when some researchers worried that scaling was hitting a wall, Jensen described four distinct axes along which AI continues to improve — and each one demands more compute.
Pre-training is what most people picture: feed a model more data and it learns more. Critics said this would hit a ceiling when high-quality human data ran out. Jensen's response is that the data problem has been solved by making AI generate its own training data synthetically. Pre-training scaling continues.
Post-training is the refinement phase: taking a pre-trained model and making it better at specific tasks through techniques like reinforcement learning. As Jensen put it, you take ground truth, augment it, and synthetically generate more data. This also continues to scale.
Test-time compute is the newer insight that gave rise to reasoning models like o3 and DeepSeek R1. Instead of answering instantly, the AI spends extra processing time thinking through the problem before responding. Some researchers dismissed inference, the phase where a trained model produces answers, as lightweight. Jensen disagreed: "inference is thinking, and thinking is hard." Thinking is more compute-intensive than reading, not less, so test-time scaling turns out to be intensely demanding.
Agentic scaling is the fourth law, and the newest. A single AI agent can spawn sub-agents to work in parallel, the same way a manager hires a team instead of doing everything alone. Jensen compared it to scaling NVIDIA by hiring more employees, which is far more effective than trying to clone himself. Each sub-agent requires its own compute, multiplying the total demand.
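As a rough illustration of why fanning out to sub-agents multiplies compute demand, the Python sketch below has a coordinator split a job into subtasks and run them in parallel. The names `run_sub_agent` and `coordinate` and the fixed token cost per subtask are hypothetical, chosen only for illustration. Nothing here comes from the conversation or from any real agent framework.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical cost: tokens one sub-agent spends on one subtask.
TOKENS_PER_SUBTASK = 2_000


def run_sub_agent(subtask: str) -> dict:
    """Stand-in for a sub-agent; a real one would call a model here."""
    return {"subtask": subtask, "tokens_used": TOKENS_PER_SUBTASK}


def coordinate(subtasks: list[str]) -> dict:
    """Fan subtasks out to parallel sub-agents and total their token usage."""
    with ThreadPoolExecutor(max_workers=max(1, len(subtasks))) as pool:
        results = list(pool.map(run_sub_agent, subtasks))
    return {
        "results": results,
        "total_tokens": sum(r["tokens_used"] for r in results),
    }


report = coordinate(["research", "draft", "review", "test"])
print(report["total_tokens"])  # 4 sub-agents x 2,000 tokens = 8000
```

Under these assumptions total demand grows linearly with the number of sub-agents. If sub-agents spawn their own sub-agents, it grows faster still.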
The four laws form a continuous loop: agents generate new experiences, the best ones feed back into pre-training, post-training refines the result, test-time thinking improves it further, and agents put it back into the world. "Intelligence is gonna scale by one thing, and that's compute."
From warehouses to token factories
One of Jensen's most striking framings is what he calls the shift from retrieval-based to generative-based computing. For most of computing history, a computer was essentially a very fast filing cabinet. You pre-wrote something, saved it, and retrieved it later. Almost everything was a file.
"We went from a retrieval-based computing system to a generative-based computing system." The new kind of computer does not look up an answer. It generates one in real time, processing and producing tokens (the basic units of text an AI works with, roughly a word or part of a word) with every request.
This changes the economics of computing entirely. A warehouse stores value. A factory creates it. Under the old model, the question was how much data you could hold. Under the new model, the question is how many tokens you can produce per watt per dollar. Every token is a unit of output, and the right tokens are worth money.
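One possible reading of "tokens per watt per dollar" can be sketched in Python: divide throughput by power draw to get tokens per joule, then convert with an electricity price to get tokens per dollar of energy. The function names and every number in the example are assumptions made for illustration, not figures from the conversation. The sketch also ignores hardware and staffing costs.

```python
JOULES_PER_KWH = 3_600_000  # 1 kWh = 3.6 million joules


def tokens_per_joule(tokens_per_second: float, power_watts: float) -> float:
    """Throughput divided by power draw (1 watt = 1 joule per second)."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return tokens_per_second / power_watts


def tokens_per_energy_dollar(
    tokens_per_second: float, power_watts: float, dollars_per_kwh: float
) -> float:
    """Tokens produced per dollar spent on electricity alone."""
    if dollars_per_kwh <= 0:
        raise ValueError("dollars_per_kwh must be positive")
    joules_per_dollar = JOULES_PER_KWH / dollars_per_kwh
    return tokens_per_joule(tokens_per_second, power_watts) * joules_per_dollar


# Hypothetical system: 1,000,000 tokens/s at 100 kW, power at $0.125 per kWh.
print(tokens_per_joule(1_000_000, 100_000))                 # 10.0
print(tokens_per_energy_dollar(1_000_000, 100_000, 0.125))  # 288000000.0
```

Doubling throughput at the same power draw doubles both figures, which is why efficiency gains translate directly into factory output.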
Jensen pointed to premium tokens as evidence. High-intelligence AI outputs, the kind used for specialized tasks in medicine, law, or advanced research, command a price. "$1000 per million tokens is just around the corner. It's not if, it's only when." When Lex asked whether NVIDIA could be worth $10 trillion, Jensen framed the answer around this logic: the world's computing infrastructure is being rebuilt from the ground up as a revenue-generating machine. The market this creates is not a larger slice of an existing pie; it is a new kind of pie altogether.
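To make the premium-token price concrete, here is a small worked calculation in Python. The rate of $1,000 per million tokens is the figure Jensen quoted. The 50,000-token job size and the `output_cost` helper are hypothetical.

```python
def output_cost(tokens: int, dollars_per_million_tokens: float) -> float:
    """Price of a job billed per million generated tokens."""
    return tokens * dollars_per_million_tokens / 1_000_000


# A hypothetical 50,000-token research report at the quoted premium rate.
print(output_cost(50_000, 1000.0))  # 50.0
```

At that rate a single long report would cost about $50, which is the sense in which every token becomes a unit of revenue.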
A single NVIDIA Vera Rubin rack now contains 1.3 million components sourced from 200 suppliers. Jensen's mental model of what NVIDIA builds has changed accordingly: he no longer pictures a chip when he thinks about a product launch. He pictures a gigawatt factory connected to the power grid, with thousands of engineers bringing it online.
CUDA: three reasons no one can copy it
NVIDIA is the most valuable company in the world, and the natural question is what stops a competitor from building something better. Jensen's answer is specific: CUDA, the programming platform that lets software run on NVIDIA's graphics chips (GPUs), has a moat built from three things that are very hard to replicate.
First, install base. NVIDIA made a bet two decades ago that cost enormous amounts of the company's gross profit: they put CUDA on every GeForce gaming card, whether or not the buyer needed it. The goal was not immediate return but cultivating an install base. They went to universities, wrote books, taught classes, and put CUDA in the hands of researchers who would eventually find things to do with it. Today, developers write CUDA first because it reaches hundreds of millions of computers in every cloud, every industry, and every country.
Second, velocity. NVIDIA ships a new generation of hardware on roughly a one-year cycle. From a developer's perspective, that means if you invest in supporting CUDA today, it will be ten times better in six months. No alternative platform offers that combination of reach and improvement speed. Jensen noted that 43,000 employees are focused on making NVIDIA hardware faster, more efficient, and more capable every year.
Third, trust. Jensen articulated this simply: developers trust that NVIDIA will maintain and improve CUDA for as long as the company exists. You can take that to the bank, he said. That trust, built over 20 years, is not something a competitor can replicate with a better technical specification.
A technically superior alternative to CUDA would not be enough. The install base, the velocity of improvement, and the accumulated trust are the real barriers. As Jensen put it: install base is not just important, it is the single most important property NVIDIA has today.
Intelligence vs. humanity
The most philosophically direct section of the conversation came near the end, when Jensen drew a line between two words people often treat as interchangeable.
Jensen does not romanticize intelligence. He is surrounded by people more intelligent than himself in each area they work in, and he told Lex so without embarrassment. What gives him a role in that circle is not intelligence but something else. "Intelligence is a commodity," he said. It is functional, it is scalable, and it will become cheaper as AI continues to improve.
What cannot be commoditized, in his view, is humanity. Determination. Tolerance for pain. The willingness to pursue a goal when evidence says it will not work. The ability to enter a situation fresh-minded and forget the setbacks. These are the qualities that define human contribution in a world where intelligence is abundant.
Lex pushed further: if intelligence is everywhere, what word should we elevate instead? "The word we should really elevate is humanity," Jensen answered. Character, compassion, generosity. All of those things.
For someone running the company that more than anyone else has made AI intelligence cheap and available, this is a striking position. Jensen is not arguing against AI. He is arguing that the human qualities that matter most are exactly the ones that are hardest to fake — and that the proliferation of AI should make those qualities more valuable, not less.
The future of coding, and a closing note on anxiety
Jensen closed with a prediction about programming that captures his broader view of how AI changes work. Today, roughly 30 million people know how to write code. Writing code means giving a computer exact, formal instructions. AI has changed this: you can now describe what you want in plain language, and the system figures out the formal instructions.
"We went from 30 million to probably 1 billion" people who can now direct a computer to build things. Every carpenter, Jensen said, can now also be an architect. Their value to customers has gone up, not down.
The same logic applies across jobs. The tools change. The purpose of the job stays the same. Jensen has been doing his job for 34 years, and the tools he uses have changed dramatically, sometimes in the span of two or three years. He did not lose his job. The job evolved.
For people anxious about AI taking their work, Jensen's message was practical: the job and the tasks are related but not the same thing. Radiology was the first profession AI researchers predicted would disappear. Instead, radiologists gained a tool that exceeded human visual accuracy on certain diagnoses, and the profession adapted. The pattern will repeat.
Glossary
| Term | Definition |
|---|---|
| AGI (Artificial General Intelligence) | The point where an AI system can perform any intellectual task a human can. There is no agreed definition, which is why the debate continues. |
| CUDA | NVIDIA's programming platform that lets software run on its GPUs. Think of it as the language that connects software to NVIDIA hardware. |
| Scaling laws | The observed pattern that AI gets smarter when you give it more data, compute, or time to think. |
| Test-time compute | When an AI spends extra processing time thinking through a problem before answering, instead of responding instantly. Used in reasoning models like o3. |
| Agentic scaling | Making AI smarter by having it spin off sub-agents that work in parallel, like a manager building a team. |
| Token | The basic unit of text that AI processes, roughly a word or part of a word. AI models generate tokens one at a time. |
| Install base | The total number of devices and users already running a particular technology. The bigger it is, the harder it is to displace. |
| HBM (High Bandwidth Memory) | Specialized fast memory stacked in layers, designed for AI chips that need to move massive amounts of data quickly. |