Box CEO: Why Enterprises Keep Failing at AI

This is an AI-generated summary. The source video may include demos, visuals and additional context.
In Brief
Most large companies are still terrible at AI, and the standard explanation is that they are slow. Aaron Levie, CEO of Box, thinks that gets it wrong. On a recent a16z panel with board partner Steven Sinofsky and general partner Martin Casado, Levie laid out a different read: the technology fundamentally doesn't fit how large organizations work yet, and the gap will take years to close.
The conversation cuts through the hype on three fronts: why top-down "more AI" mandates collapse, why agents hit the same integration wall humans do, and why the people predicting mass job loss are repeating a forecast that has been wrong for sixty years.
The Silicon Valley vs. enterprise gap
Levie spends his weeks visiting customers, and he describes his job as "bringing reality to the valley and the valley to reality". The gap is bigger than most people in tech realize.
A Silicon Valley engineer has near-ideal conditions for AI. High technical aptitude. Modern tools they pick themselves. The freedom to debug a broken pipeline and fix it. And the work itself is verifiable, which is exactly what models are best at. None of that holds in the rest of knowledge work. Workflows are different. Users are less technical. Data is fragmented. Systems are old. So when Silicon Valley promises that AI agents will run your company by next quarter, people in actual enterprises just look confused.
Casado adds an angle worth paying attention to. The deepest secular trends in technology, like the early internet, start with individuals and only later move into organizations. ChatGPT is already inside every big company through people who use it on their own. What is failing is the centralized, top-down deployment, not AI itself. The MIT figure that "95 percent of enterprise AI efforts fail" is misread, in his view, because it is measuring the corporate program and not the actual usage by employees.
The board, the consultant, and the doomed AI project
Sinofsky and Casado have both sat on enterprise boards, and they describe a now-familiar cycle. The board tells the CEO they need more AI. The CEO hires consultants. The consultants build a centralized project that no one on the operations side understands. It quietly fails. Then everyone blames AI.
Two things make this worse. The first is the token-counting incentive. Several large companies are now measuring employee productivity by how many AI tokens (the units of AI usage you pay for) they burn. The predictable result is engineers asking agents to do useless work just to inflate the metric. Casado quotes someone who works at one of these companies admitting it directly. You get whatever you measure.
The second is architectural paralysis. Three years ago, the answer to "how do you deploy agents" was completely different from today. Should the agent run in your cloud or theirs? In a browser or as a process? With which tools? Companies that picked early often got burned. Levie says he now sees CIOs stuck mid-debate between two or three frameworks, unwilling to commit, because the tech is moving faster than enterprise architecture cycles can absorb. Standing still feels safer than betting wrong.
Treat AI like a user, not like software
This is the architectural shift Casado wants people to internalize, and it might be the most useful idea in the entire conversation. Stop trying to integrate AI into your software. Treat the AI as a user of that software.
Six months ago, every product company was bolting AI features into the existing user interface, the chat-with-your-product pattern. That hybrid is collapsing. The new approach: take your product, expose it as a CLI (a text-only interface a program can drive), or as a set of clean APIs, and let the agent use it the same way a person would. The agent runs in a separate harness like Claude Code or Codex, and your product becomes something it consumes.
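The pattern is easy to sketch. Imagine a hypothetical product, say an invoice tracker, exposing its actions as plain-text subcommands that an agent harness invokes exactly as a person would at a terminal. All the names here (`invoicectl`, the commands, the data) are illustrative, not anything discussed on the panel:

```python
import argparse
import json

# Toy in-memory "product" state; a real product would back this with its database.
INVOICES = {"inv-001": {"amount": 120.0, "status": "open"}}

def build_parser():
    parser = argparse.ArgumentParser(prog="invoicectl")
    sub = parser.add_subparsers(dest="command", required=True)

    show = sub.add_parser("show", help="print one invoice as JSON")
    show.add_argument("invoice_id")

    close = sub.add_parser("close", help="mark an invoice as paid")
    close.add_argument("invoice_id")
    return parser

def run(argv):
    args = build_parser().parse_args(argv)
    invoice = INVOICES[args.invoice_id]
    if args.command == "close":
        invoice["status"] = "paid"
    # JSON on stdout is trivial for an agent to parse, unlike a rendered UI.
    return json.dumps({args.invoice_id: invoice})

if __name__ == "__main__":
    import sys
    print(run(sys.argv[1:]))
```

The design choice is the point: nothing here is AI-specific. A text interface with structured output serves a shell script, a human at a terminal, and an agent harness identically.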
Why does this work better? Because LLMs (large language models) are non-deterministic and they handle the long tail of messy real-world cases. Those are properties of humans, not of software. Casado's point lands hard: we have spent forty years building access controls, processes, and design patterns to deal with messy humans. If you treat the agent as a new hire, give it an email address, a license, an identity, and the same access rights as a peer at its level, you get to draft on all that infrastructure. If you treat it like software, you fight it.
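One way to picture "drafting on that infrastructure": the same role-based access check that governs a human employee governs the agent, because the agent is just another principal with its own identity. A minimal sketch, with invented roles and email addresses:

```python
from dataclasses import dataclass

# Invented role table; a real org would pull this from its identity provider.
ROLE_PERMISSIONS = {
    "analyst": {"crm:read"},
    "manager": {"crm:read", "crm:write"},
}

@dataclass(frozen=True)
class Principal:
    identity: str  # an email address; human or agent, it makes no difference
    role: str

def can(principal, permission):
    return permission in ROLE_PERMISSIONS.get(principal.role, set())

alice = Principal("alice@example.com", "manager")
agent = Principal("crm-agent@example.com", "analyst")  # the agent gets its own seat
```

The access-control code never branches on "is this an AI". `can(agent, "crm:read")` passes and `can(agent, "crm:write")` fails for the same reason they would for a human analyst, which is exactly the forty years of infrastructure the agent inherits for free.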
It is also why he is bearish on what some call the "SaaS apocalypse," the theory that agents will replace the seat licenses companies pay for. An agent is another seat. There is no way around that. It needs an identity. It can't share a human's credentials without breaking security. The pricing model may bend, but the seat doesn't go away.
The integration wall AI doesn't fix
Sinofsky pushes back on the optimistic part of this. "Treat the agent as a user" sounds clean, but real users hit walls all the time. Any company with more than a thousand employees, or older than ten years, is a mass of stuff sitting there waiting to be integrated, and AI does not actually help integrate anything.
He describes the human version. You call customer service. The agent on the line can't help you, so they bounce you to another human. The next person can't help either because that's a different department's system. You eventually find the person with the right access. An AI agent has no instinct for any of that. It will hit a permissions wall and get stuck, because nobody told it to call Sally in HR or Bob in finance.
This is also why the announcement that OpenAI is partnering with Accenture and Deloitte to roll out enterprise AI was, to Sinofsky, the most obvious news of the year. Big companies will need armies of system integrators just to wire AI into their existing data and processes. The "agents will replace the consultants" headline got it backwards. You need consultants to make agents work in the first place.
Salesforce goes headless: the bellwether moment
The biggest enterprise news Levie wanted to highlight: Salesforce announced it is going fully headless. A "headless" product has no user interface; it is meant to be driven by other software. Salesforce is openly conceding that the most important user of CRM data going forward is not a salesperson, it is an agent acting on behalf of one.
That decision opens a different scale of use. A human queries a CRM a few times a day. An agent fans out 500 parallel queries instantly to map every account before a meeting. Suddenly the bottleneck is not how fast the human types, it is how much the SaaS backend can handle. Levie warns that many SaaS products will collapse the first time they are hit at agent scale, the same way ERP APIs broke when business intelligence tools first started snapshotting them every night.
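The scale difference is easy to make concrete. Where a person issues a handful of sequential lookups, an agent can fan out hundreds of concurrent requests against the same backend. A sketch with a stand-in query function (the function and names are illustrative, not a real CRM API):

```python
from concurrent.futures import ThreadPoolExecutor

def query_crm(account_id):
    # Placeholder for a real CRM API call; each invocation is one backend request.
    return {"account": account_id, "status": "ok"}

def agent_briefing(account_ids, workers=50):
    # A human issues a few of these per day; an agent issues them all in one burst.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(query_crm, account_ids))

briefing = agent_briefing([f"acct-{i}" for i in range(500)])
```

A backend sized for human typing speed sees 500 requests arrive in the time it used to see one, which is the load profile Levie warns many SaaS products have never been tested against.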
It also forces a new pricing question. Are agents seats? An API tax? A separate identity tier? Nobody has the answer yet. But as Salesforce goes, so does most of enterprise software. The race is on to expose every SaaS product as something an agent can drive directly, and the companies that get the architecture right early will have a structural advantage over the ones still building chat boxes.
AI coding makes systems more complex, not less
Levie is one of the loudest CEO voices on AI coding, and he says something his peers rarely admit on stage. Box gets a 2 to 3x productivity gain from AI coding, not 10x. The reason is not the model. The model writes 80 to 90 percent of new feature code at Box already. The bottleneck is everything around it: code review, security review, the deploy pipeline. You can ship faster, but only as fast as the constraints allow.
Casado pushes the point further. AI-written code degrades over time more aggressively than human-written code, and the industry has not figured out how to manage that yet. Vibe coding (the style where you let AI generate code with little manual review, named after the loose, intuitive feel of the workflow) works fine for one-off prototypes. It does not work for systems that have to keep running for ten years.
In Levie's words, the funniest concept in the discourse is that more code means fewer engineers. It is the opposite. Every new system is more complex than the one it replaced. More software means more upgrades, more downtime, more security incidents, more humans needed to keep the whole thing standing. Big companies survive precisely because they wrap engineers in constraints (reviews, audits, slow deploys) that AI accelerates but never removes.
The jobs misread: a sixty-year track record of being wrong
The closing argument is the most optimistic part of the panel, and it leans on history. Sinofsky pulls examples that map cleanly to today's predictions:
| Year | Prediction | What actually happened |
|---|---|---|
| 1965 | IBM's pitch: computers will replace accountants | Accounting expanded massively because companies could afford more analysis |
| 1981 | Time magazine cover: computers will automate paper out of offices | Paper consumption rose for two more decades; new categories of office work appeared |
| 1995 | "The End of Work" predicts mass joblessness from automation | The internet boom created millions of new categories of jobs |
| 2026 | AI will eliminate knowledge work | Same mistake, new tools |
Levie's argument is straightforward. A company is an information processor, and the binding constraint has never been how fast information gets created, only how effectively it gets consumed and acted on. AI accelerates creation. It does not relieve the consumption side, which still requires judgment, context, and someone willing to be accountable.
That is why the AI-native companies are hiring fastest, why infrastructure firms are growing not shrinking, and why engineering jobs will spread far beyond Silicon Valley. The next generation of engineers won't be at Google. They will be at John Deere automating tractors, at Caterpillar building AI for heavy equipment, and at Eli Lilly designing drugs.
What this is really about
The panel's diagnosis is consistent across all three speakers, even when they disagree on details. Enterprise AI is hard, but not because enterprises are slow or stupid. It is hard because:
- The technology is changing faster than architecture decisions can settle
- AI does not solve integration, which is where most of the real work has always been
- Productivity gains are real but always bounded by the review and trust processes that keep big companies from imploding
- Treating agents as users, not as software, is the unlock most companies haven't made yet
The diffusion will take years. In the meantime, the people getting AI to work in enterprises right now are individual employees, not centralized programs. That is the real signal, and it is the one most boardrooms keep missing.
Glossary
| Term | Definition |
|---|---|
| Agent | An AI that performs tasks on its own, opening apps, calling APIs, and acting across multiple steps, instead of just answering questions |
| LLM (large language model) | The kind of AI behind ChatGPT, Claude, and Gemini. Trained on huge text datasets, returns probabilistic answers |
| Non-deterministic | The same input can give different outputs, like a person reasoning through a problem, not a database lookup |
| Headless | A product with no user interface, meant to be driven by other software (or agents) through APIs |
| CLI (command line interface) | A text-only way to interact with software, easier for programs and agents to drive than a graphical interface |
| Token | The base unit of AI usage, billed per input and per output. Engineers competing on token volume is a recent enterprise antipattern |
| MCP (Model Context Protocol) | An emerging standard for letting agents call into a software product's data and actions |
| Vibe coding | A loose, AI-led coding style where the developer prompts an agent and accepts most of what it writes without deep review |
Sources and resources
- a16z: Box CEO: Why Big Companies Are Falling Behind on AI — The original 58-minute panel
- a16z — The venture firm hosting the conversation
- Box — Aaron Levie's company
- Salesforce — The company referenced for going headless
- Aaron Levie on X — Box CEO, posts regularly on enterprise AI
- Steven Sinofsky at a16z — Board partner, former Microsoft Office and Windows lead
- Martin Casado at a16z — General partner leading the firm's infrastructure practice
Want to go deeper? Watch the full video on YouTube.