Claude Opus 4.8 and dynamic workflows: AI as a work crew

In Brief

Anthropic has launched Claude Opus 4.8, a new version of its most advanced Claude model. At the same time, it is launching dynamic workflows in Claude Code, a feature that lets Claude break large programming tasks into many smaller ones and run them with many parallel subagents.

Put simply:

Claude Opus 4.8 is a smarter and more careful model. Dynamic workflows is a new way of working where Claude Code doesn't just answer once, but plans, distributes work, checks the results, and pulls everything together into a single coordinated deliverable.

This points to an important shift: AI tools become less like a single assistant chatting back to you, and more like a small digital team that can work systematically over time.

Anthropic says Opus 4.8 is available from May 28, 2026, at the same standard pricing as Opus 4.7: 5 dollars per million input tokens and 25 dollars per million output tokens. Fast mode is priced at 10 dollars per million input tokens and 50 dollars per million output tokens.
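
To make the pricing concrete, here is a small Python sketch that turns token counts into dollars using the per-million-token prices quoted above. The helper name and the example token counts are made up for illustration. Only the four prices come from the announcement.

```python
# Prices in US dollars per million tokens, as quoted above.
PRICES = {
    "standard": {"input": 5.0, "output": 25.0},
    "fast": {"input": 10.0, "output": 50.0},
}


def estimate_cost(input_tokens: int, output_tokens: int, mode: str = "standard") -> float:
    """Return the estimated cost in dollars for one request (hypothetical helper)."""
    price = PRICES[mode]
    return (input_tokens * price["input"] + output_tokens * price["output"]) / 1_000_000


# Illustrative numbers only: 200,000 input tokens and 40,000 output tokens.
print(estimate_cost(200_000, 40_000))          # standard mode
print(estimate_cost(200_000, 40_000, "fast"))  # fast mode
```

With these example numbers, the request comes to 2 dollars in standard mode and 4 dollars in fast mode.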

Why this matters

This matters because AI tools are now moving into work that used to require a combination of developers, architects, testers, and project managers.

For ordinary users, it means Claude can get better at saying things like "I don't know this for sure", "This should be double-checked", or "I found a mistake in my own work".

For developers, it means Claude Code can become more useful on large, messy codebases. Not just small code suggestions, but bigger jobs like:

finding bugs across an entire repo
migrating from one framework to another
checking security patterns across many files
modernizing legacy code
running several independent attempts and comparing the results

For businesses, it means the AI agent starts to look more like a workflow engine than a chatbot. The important question is no longer just "can the model write code?", but:

Can the model plan, delegate, check, and deliver reliable work?

What is Claude Opus 4.8?

Before this chapter

To understand Opus 4.8, we first need to separate three things:

The model: the AI system itself, the part that thinks and writes.
The product: Claude in the browser, Claude Code, the Claude API, and other places the model is used.
The way of working: how the model plans, uses tools, and solves tasks.

Claude Opus 4.8 is first and foremost a new model version, but the launch is also about new ways to use the model.

Words you'll meet

Claude: Anthropic's AI assistant and model family.
Opus: Anthropic's most advanced Claude model line.
Benchmark: a test used to compare models.
Agentic task: a task where the model doesn't just answer, but plans and carries out multiple steps.
Token: a small piece of text. AI models read and write tokens, not "words" the way humans do.

Main explanation

Anthropic describes Claude Opus 4.8 as an upgrade from Opus 4.7. It's supposed to be better at coding, agentic capabilities, reasoning, and practical knowledge work. But the most interesting thing isn't just that it scores higher on tests. It's that Anthropic highlights honesty as one of the most important improvements.

When people talk about AI, they often focus on intelligence: How hard a math problem can it solve? How good can the code be? How fast can it answer?

But in practical work, another question is at least as important:

Does the model know when it doesn't know?

A model that always sounds confident, but gets it wrong, is dangerous in a work setting. It can give you false security. A model that says "I'm not sure here" or "this should be tested" can actually be more useful, even if the answer feels less impressive.

Anthropic says that, compared with its predecessor, Opus 4.8 lets roughly a quarter as many mistakes in its own code slip through without comment. The Verge also highlights this as a central part of the launch: the model is supposed to admit uncertainty more often and make fewer claims it can't support.

Explained simply

Think of two assistants.

The first one always says: "Done! Everything works."

The second one says: "It looks like this works, but I found two places that should be tested more carefully."

In real work, the second one is often more valuable.

Claude Opus 4.8 is trying to be more like the second assistant.

A bit more technical

In AI terms, this is about calibration: how well the confidence a model expresses matches how likely its answer is to be correct. A well-calibrated model shouldn't just generate a plausible answer, but also assess how strongly that answer is supported by the evidence it has available.

For coding, this means things like:

noticing that a test doesn't actually cover the bug
speaking up about edge cases
flagging assumptions
distinguishing between "I implemented this" and "I implemented this, but didn't run the tests"
avoiding presenting half-finished work as finished

This matters because AI agents often work across many steps. The more steps they take, the bigger the risk that small mistakes compound into large ones.
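
A back-of-the-envelope calculation shows why. Assume, purely for illustration, that every step succeeds independently with the same probability. Both the independence assumption and the 99 percent figure are simplifications chosen for this sketch, not numbers from Anthropic.

```python
def chance_all_steps_succeed(per_step_success: float, steps: int) -> float:
    """Probability that every step succeeds, assuming steps are independent."""
    return per_step_success ** steps


for steps in (1, 10, 100):
    p = chance_all_steps_succeed(0.99, steps)
    print(f"{steps:>3} steps: {p:.1%} chance of no mistake")
```

Under these assumptions, a process that is 99 percent reliable per step gets through 100 steps without a single mistake only about 37 percent of the time. That is why catching and flagging errors along the way matters more as tasks get longer.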

What this means in practice

For a developer, Opus 4.8 can be useful when you want a model that doesn't just write code, but also assesses the quality of the work.

For a manager, this may mean AI tools become a bit more suitable for serious workflows, because the model isn't just supposed to produce more, but also to check more carefully.

For an ordinary user, it means answers can become more nuanced. Not always shorter or more confident, but hopefully more honest.

Quick summary

Claude Opus 4.8 is a new model version from Anthropic.
It builds on Opus 4.7.
Anthropic highlights better coding, reasoning, and agentic capabilities.
The most important improvement may be better honesty and clearer flagging of uncertainty.
In practice, this makes the model more useful in work where mistakes have a cost.

What is effort control?

Before this chapter

A common misconception about AI is that a model always "thinks the same amount". It doesn't. Modern AI systems can use more or less computation, time, and tokens depending on the task and the settings.

Anthropic is now launching effort control, a way to control how much effort Claude puts into a response.

Words you'll meet

Effort: how much work the model puts into the answer.
High effort: the model uses more time and more tokens for better quality.
Low effort: the model answers faster and uses less of the rate limit.
Rate limit: a cap on how much you can use an AI system within a given period.
xhigh / max: higher effort levels for more demanding work.

Main explanation

Anthropic says users on claude.ai can now control how much effort Claude uses on a task. Higher effort means Claude can think more often and more deeply, while lower effort gives faster answers and uses less of your rate limit.

You don't use the same mental effort to answer "what time is it?" as you do to write a business strategy. In the same way, it makes sense that an AI model shouldn't always run at maximum effort.

Explained simply

Effort control is like choosing a driving mode in a car.

Low effort: quick and light, for small tasks.
High effort: more power, for steep climbs.
Max effort: for when the task is genuinely hard.

You don't need a Formula 1 engine to fetch milk. But you also wouldn't try to tow a truck with a scooter.

A bit more technical

When Claude uses higher effort, the model can spend more tokens working through the problem, both in its reasoning and in its final answer. Anthropic says Opus 4.8 uses high effort by default, because that gives the best balance between quality and user experience. On coding tasks, this should use roughly the same number of tokens as Opus 4.7's default level, but with better performance.

In Claude Code, there are also higher levels such as xhigh (extra high) and max, depending on context. Anthropic recommends xhigh for difficult tasks and long-running asynchronous workflows.

What this means in practice

This gives the user more control over cost, speed, and quality.

Examples:

You want a quick explanation of a concept: use lower effort.
You want code review of a critical function: use higher effort.
You want to migrate a large codebase: use dynamic workflows or higher Claude Code effort.
You want to save rate limit: pick lower effort when the task is simple.
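
The examples above can be written down as a simple lookup. This is only a sketch of the decision, not a real Claude setting or API: the task categories and the helper name are invented, and the effort names are the ones used in this article.

```python
# Effort level names follow the article; the task categories are invented.
EFFORT_BY_TASK = {
    "quick_explanation": "low",
    "simple_question": "low",
    "critical_code_review": "high",
    "large_migration": "xhigh",
}


def pick_effort(task_kind: str, default: str = "high") -> str:
    """Return a suggested effort level for a task kind (hypothetical helper)."""
    return EFFORT_BY_TASK.get(task_kind, default)


print(pick_effort("quick_explanation"))
print(pick_effort("something_unlisted"))  # falls back to the default
```

Unknown task kinds fall back to high here, mirroring the default the article describes for Opus 4.8.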

This is a sign that AI products are becoming more like professional tools. Instead of one magic button, you get controls that let you pick the right working mode.

Quick summary

Effort control lets you tune how hard Claude works.
Higher effort can give better answers, but uses more tokens.
Lower effort gives faster and cheaper answers.
Opus 4.8 uses high effort by default.
This makes Claude more flexible for both small and large tasks.

What are dynamic workflows in Claude Code?

Before this chapter

This isn't just about a better model. It's about a new way of organizing AI work.

Words you'll meet

Claude Code: Anthropic's developer tool for using Claude directly in coding work.
Workflow: a series of steps for solving a task.
Dynamic workflow: a workflow Claude plans and adjusts on its own as it goes.
Subagent: a smaller agent that takes on a sub-task.
Parallel execution: several tasks run at the same time.
Verification: checking results before they're presented.

Main explanation

Dynamic workflows let Claude Code take large tasks and split them into many smaller parts. Claude can write its own orchestration scripts, spin up tens or hundreds of parallel subagents in a single session, check the work, and gather the result before the user sees an answer.

This is a big shift.

The old chatbot model is simple: the user asks a question, the AI generates one answer, the user evaluates it.

Dynamic workflows are something different: the user gives a large task, Claude makes a plan, splits the task into smaller parts, runs many subagents in parallel, checks the work, and pulls everything together into one coordinated result.

Think of it as the difference between one person trying to renovate a whole house alone, and a project manager who distributes the work between an electrician, a plumber, a painter, and an inspector.

Claude isn't just "the person doing the job" anymore. Claude is also the coordinator.

Explained simply

Dynamic workflows means:

Claude makes a plan, sends parts of the job to several smaller AI agents, checks their answers, and puts it all back together.

It's like handing one project manager a big job, and the project manager assembles a temporary team to get it done.

A bit more technical

When a dynamic workflow starts, Claude does roughly this:

Interprets the user's goal.
Builds a plan.
Splits the work into sub-tasks.
Runs subagents in parallel.
Has agents investigate the same problem from different angles.
Has other agents try to disprove or stress-test the findings.
Iterates until the answers converge.
Returns a single coordinated result.
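
The core of these steps (plan, split, run in parallel, check, combine) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not how Claude Code is implemented: every function here is a hypothetical stand-in, the "subagents" are plain functions instead of model calls, and the iterate-until-convergence step is left out to keep it short.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_subtasks(goal: str, parts: list[str]) -> list[str]:
    """Plan step: one sub-task per part of the codebase."""
    return [f"{goal}: {part}" for part in parts]


def run_subagent(subtask: str) -> dict:
    """Stand-in for a subagent. A real one would call a model and use tools."""
    return {"subtask": subtask, "finding": f"checked {subtask}"}


def verify(result: dict) -> bool:
    """Stand-in for a review agent that tries to disprove a finding."""
    return result["finding"].startswith("checked")


def run_workflow(goal: str, parts: list[str], max_workers: int = 4) -> list[dict]:
    subtasks = split_into_subtasks(goal, parts)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_subagent, subtasks))  # parallel fan-out
        verdicts = list(pool.map(verify, results))        # independent checks
    # Only verified findings make it into the combined result.
    return [r for r, ok in zip(results, verdicts) if ok]


report = run_workflow("find unused imports", ["api/", "core/", "tests/"])
print(len(report), "verified findings")
```

The design point is the separation of roles: the code that produces a finding is not the code that approves it.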

Anthropic says this is especially well-suited to parallel, long-running tasks that may last hours or days. Progress is saved along the way, so an interrupted job can pick up where it left off instead of starting from scratch.
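
The idea of resuming from saved progress can be illustrated with a small checkpoint file. How Claude Code actually stores progress isn't described in the announcement, so the JSON file, the function name, and the fake results below are all assumptions made for this sketch.

```python
import json
from pathlib import Path


def run_with_checkpoints(subtasks: list[str], checkpoint: Path) -> dict:
    """Process sub-tasks, saving progress after each one so a rerun can resume."""
    done = json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    for task in subtasks:
        if task in done:
            continue  # finished in an earlier run, skip it
        done[task] = f"result for {task}"  # stand-in for real subagent work
        checkpoint.write_text(json.dumps(done))
    return done
```

If the process is interrupted halfway, calling the function again with the same checkpoint path skips the sub-tasks already recorded and continues with the rest.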

What this means in practice

Dynamic workflows can be used for things like:

bug hunts across large codebases
security reviews
profiling and optimization analysis
migrations between frameworks or languages
modernizing legacy code
reviewing critical plans
independent verification before code is shipped

Anthropic says dynamic workflows are available in research preview in the Claude Code CLI, the desktop app, and the VS Code extension for Max, Team, and Enterprise plans, when enabled by an admin. They're also available via the Claude API, Amazon Bedrock, Vertex AI, and Microsoft Foundry.

Quick summary

Dynamic workflows is a new way of working in Claude Code.
Claude can split large tasks into many smaller ones.
Many subagents can work in parallel.
The results are checked before they're delivered.
This is best suited to large, complex, long-running coding work.

The Bun example: why everyone's talking about it

Before this chapter

Anthropic uses a powerful example in the blog post: a rewrite of Bun from Zig to Rust.

This isn't a small demo. It's an example of the kind of task that would normally be enormously demanding.

Words you'll meet

Bun: a modern JavaScript runtime that also bundles a package manager, bundler, and test runner.
Zig: a low-level programming language focused on explicit control, performance, and simple cross-compilation.
Rust: a programming language known for memory safety and performance.
Porting: moving software from one language or system to another.
Test suite: a collection of tests that check whether software works.

Main explanation

Anthropic writes that Jarred Sumner used dynamic workflows to port Bun from Zig to Rust. According to the blog post, the result was roughly 750,000 lines of Rust, with 99.8 percent of the existing test suite passing, merged eleven days after the first commit. Anthropic stresses that the port isn't in production yet, but uses the example to show what dynamic workflows can open up at scale.

This matters because language porting is hard.

It isn't just about translating syntax. It's about preserving behavior.

A simple analogy:

Porting a codebase isn't like translating a sentence from Norwegian to English. It's more like moving an entire factory from one country to another, with new machines, new power outlets, and new safety rules, while the product that comes out of the factory should be identical.

Explained simply

The Bun example shows that Claude Code with dynamic workflows can be used on extremely large coding jobs, not just small suggestions inside a single file.

A bit more technical

Anthropic describes several workflows in the Bun work:

one workflow mapped out the correct Rust lifetimes for struct fields in the Zig codebase
another wrote .rs files that were supposed to be behaviorally identical to the .zig files
hundreds of agents ran in parallel
two review agents checked every file
a fix loop ran build and tests until things worked
a later workflow found unnecessary data copies and opened PRs for review

This is interesting because it combines several AI patterns in one flow: analyzing the old code, planning the port, generating new code in parallel, review, build and test, a fix-loop that runs until things work, and a final human PR review.
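
The fix loop is the easiest of these patterns to sketch. The version below is a generic illustration, not the workflow used in the Bun port: the function names are hypothetical, and the checks and fixes are passed in as plain callables so the loop itself stays visible.

```python
from typing import Callable


def fix_loop(
    run_checks: Callable[[], list[str]],
    apply_fix: Callable[[str], None],
    max_rounds: int = 5,
) -> bool:
    """Run checks, fix each reported failure, repeat until clean or out of rounds."""
    for _ in range(max_rounds):
        failures = run_checks()
        if not failures:
            return True
        for failure in failures:
            apply_fix(failure)
    # Final check after the last round of fixes.
    return not run_checks()


# Toy stand-ins: each "fix" removes one failure from a shared list.
broken = ["test_parse", "test_io"]
ok = fix_loop(lambda: list(broken), broken.remove)
print(ok, broken)
```

The round limit matters. Without it, a failure the agent cannot fix would keep the loop running, and consuming tokens, indefinitely. Returning False is the signal to hand the problem to a human.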

The important point isn't that AI "replaces developers". It's that AI can handle the grunt work, the mapping, the systematic checking, and the repetitive changes at a scale that used to be extremely expensive.

What this means in practice

For developers, this means large technical-debt projects can become more realistic.

Many companies have these tasks sitting around:

"We should upgrade the framework."
"We should remove legacy code."
"We should swap out the API."
"We should modernize the tests."
"We should review security patterns across the whole repo."

The problem is that these tasks are often too large, too boring, and too risky. Dynamic workflows try to make tasks like these more manageable.

But: this doesn't mean you should merge AI-generated code without thinking. Anthropic itself says Claude checks the work before reporting back, but in professional development you still need tests, code review, CI/CD, and humans with accountability.

Quick summary

The Bun example shows dynamic workflows at scale.
The job was porting from Zig to Rust.
Anthropic reports about 750,000 lines of Rust, with 99.8 percent of the existing tests passing.
The example is impressive, but it is not proof of production readiness.
Human review and tests are still essential.

What's the difference between one agent and many subagents?

Before this chapter

This is where many readers lose track, so we'll keep it simple.

An AI agent isn't magic. It's a system that can use the model, tools, files, and instructions to work toward a goal. A subagent is a smaller agent that takes on a sub-task.

Words you'll meet

Agent: an AI-driven process that can take multiple steps.
Subagent: an agent responsible for a defined slice of the task.
Orchestration: coordinating many work processes.
Adversarial checking: one agent trying to find mistakes in another agent's work.

Main explanation

One agent can be good at a lot, but it has limits:

It can miss things.
It can get stuck in one line of thought.
It can lose track in large codebases.
It can make an early mistake and then build on it.

Many subagents can reduce some of these problems because they can work independently. One agent can hunt for security issues. Another can check test coverage. A third can try to disprove the findings.

It's a bit like an editorial meeting:

one journalist writes the story
one fact-checker verifies the numbers
one editor checks the structure
one lawyer considers the risk
one proofreader catches small errors

The story comes out better because several roles attack the problem from different angles.

Explained simply

One agent is one smart assistant. Many subagents are like a small team of specialized assistants.

A bit more technical

Dynamic workflows uses parallelization. That means many sub-tasks can run at the same time. This is especially useful when the task naturally splits up, for example:

check every folder in a repo
analyze each module separately
port each file
evaluate each security rule
run several alternative solutions
have review agents check the output from implementation agents

The technical point is that large problems often don't just need "more intelligence". They need better organization.
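
A task that splits naturally, such as analyzing each module separately, maps onto a parallel fan-out with a cap on how many workers run at once. The sketch below uses Python's asyncio for this. The module names and the fake analysis are placeholders, and nothing here reflects how Claude Code schedules its subagents.

```python
import asyncio


async def analyze_module(name: str, limit: asyncio.Semaphore) -> tuple[str, int]:
    """Stand-in for one subagent analyzing one module."""
    async with limit:              # cap how many run at once
        await asyncio.sleep(0.01)  # pretend to do some work
        return name, len(name)     # fake "number of findings"


async def analyze_all(modules: list[str], max_parallel: int = 3) -> dict[str, int]:
    limit = asyncio.Semaphore(max_parallel)
    results = await asyncio.gather(*(analyze_module(m, limit) for m in modules))
    return dict(results)


findings = asyncio.run(analyze_all(["auth", "billing", "search", "ui"]))
print(findings)
```

The cap is part of the organization: it lets you trade speed against resource use without changing the work each subagent does.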

What this means in practice

This makes AI more relevant for large work.

But it has a cost: more agents means more token usage. Anthropic warns explicitly that dynamic workflows can use significantly more resources than a regular Claude Code session, and recommends starting with a bounded task to understand the consumption.
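
A rough estimate shows how quickly usage scales with the number of agents. The sketch uses the standard Opus 4.8 prices quoted earlier as a yardstick. The per-agent token counts are invented, and real subagents will not all use the same amount.

```python
def workflow_cost(
    agents: int,
    input_tokens_per_agent: int,
    output_tokens_per_agent: int,
    input_price: float = 5.0,    # dollars per million input tokens
    output_price: float = 25.0,  # dollars per million output tokens
) -> float:
    """Rough dollar cost when every subagent uses the same number of tokens."""
    per_agent = (
        input_tokens_per_agent * input_price
        + output_tokens_per_agent * output_price
    ) / 1_000_000
    return agents * per_agent


single = workflow_cost(1, 100_000, 20_000)
crew = workflow_cost(200, 100_000, 20_000)
print(f"one agent: ${single:.2f}, 200 agents: ${crew:.2f}")
```

With these made-up numbers, one agent costs about 1 dollar and a crew of 200 costs about 200 dollars. The scaling is linear, which is exactly why a bounded first run is a sensible way to measure before committing to a large job.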

Don't use dynamic workflows for everything. Use it when the task really is large enough to deserve a whole team.

Quick summary

Subagents let Claude split the work into smaller parts.
Parallel work can give better coverage of large problems.
Review and verification agents can catch mistakes.
This is best suited to large codebases and complex tasks.
The cost can go up, so scope matters.

What does this mean for product development and businesses?

Before this chapter

Now we're shifting our focus from the technology to working life.

This isn't just about Claude. It's about how product development can change once AI can do more than write standalone answers.

Main explanation

Many companies have a long list of technical debt sitting around. This is work everyone knows should be done, but that rarely gets prioritized because it takes too long.

Examples:

cleaning up old code
upgrading dependencies
swapping out outdated APIs
writing missing tests
documenting systems
finding duplicated code
improving performance
standardizing patterns across repositories

Dynamic workflows can make tasks like these cheaper to start. Not necessarily risk-free, but more doable.

This can change product development in three ways.

1. From "we should" to "let's try"

A major migration used to be a quarterly project. Now the team may be able to start by asking Claude for an analysis, a proposal, a test run, or a proof of concept.

2. From manual searching to systematic review

AI agents can comb through large codebases and build overviews. That means humans can spend more time on judgment and decisions.

3. From single prompt to workflow

The most important shift may be mental: you stop thinking "what should I ask AI?" and start thinking "what workflow do I need?"

Explained simply

This moves AI from being a writing assistant to being a work assistant.

Not just: "Write this function."

But: "Investigate the whole codebase, find patterns, suggest changes, test them, and give me a report."

A bit more technical

For organizations, the key questions become:

How do we limit the agent's access?
Which repos can it read?
Can it write code directly?
Does it have to open pull requests?
Which tests have to pass?
Who approves changes?
How do we log decisions?
How do we handle cost and token usage?

This means AI adoption isn't just a model choice. It's also a question of governance, security, developer flow, and accountability.
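
Several of these questions can be answered in writing before any agent runs. The sketch below models such a policy as a small Python data class. It is a hypothetical illustration of the idea, not a Claude Code configuration format, and every field name and default value is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical guardrails for one agent run; not a real Claude Code config."""
    readable_repos: frozenset[str]
    may_push_directly: bool = False          # False: changes go through pull requests
    required_checks: tuple[str, ...] = ("unit-tests",)
    max_tokens: int = 5_000_000              # budget cap for the whole run

    def can_read(self, repo: str) -> bool:
        return repo in self.readable_repos

    def within_budget(self, tokens_used: int) -> bool:
        return tokens_used <= self.max_tokens


policy = AgentPolicy(readable_repos=frozenset({"web-frontend"}))
print(policy.can_read("web-frontend"), policy.can_read("payments"))
```

Making the policy immutable and defaulting to the restrictive option (no direct pushes) means a forgotten setting fails safe rather than open.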

What this means in practice

Companies shouldn't start with "let AI do everything". They should start with bounded workflows:

Find dead code in one repo.
Suggest test improvements.
Check one type of security pattern.
Modernize one module.
Make a report on technical debt.

Once the team trusts the process, they can expand the scope.

Quick summary

Dynamic workflows can make technical debt more manageable.
AI becomes more useful as a workflow, not just a chatbot.
Businesses need to think about access, testing, cost, and governance.
Start with bounded tasks.
Humans still need to own the decisions.

What should we be critical of?

Before this chapter

New AI launches often come with big claims. It's easy to be impressed. But good tech literacy also requires critical reading.

1. Research preview doesn't mean finished product

Dynamic workflows is available as a research preview. That means the feature is early and still under development. It can be powerful, but also have unpredictable sides.

2. Token usage can get high

When hundreds of subagents work in parallel, cost can climb quickly. Anthropic itself warns that dynamic workflows can use far more tokens than a typical Claude Code session.

3. Verification isn't the same as truth

Claude checking its own work is good. But AI verification can still miss things. Independent tests, human review, and production monitoring are still necessary.

4. Big benchmark improvements should be read carefully

Benchmarks are useful, but they aren't reality. A model can do well on a test and still fail on your codebase, your requirements, or your edge cases.

5. More autonomy demands more responsibility

The more AI can do on its own, the more important control mechanisms become. That's especially true when the agent can change code, run commands, or affect production-adjacent systems.

Explained simply

Dynamic workflows is powerful, but you should treat it like an extremely capable junior team: fast, hard-working, and useful, but not infallible.

A bit more technical

The risk with AI agents grows with:

more steps
more tool access
larger codebases
weaker tests
unclear goals
missing sandboxing
automatic merges without review

That's why good workflows need clear checkpoints:

AI makes a proposal.
Automated tests must pass. If they fail, the change is fixed and tested again.
A human reviews the code. If it isn't approved, it goes back for revision.
The change is merged.
The system is monitored after deploy.
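
The checkpoint idea can be sketched as a chain of gates that a change must pass in order. This is a simplified illustration with invented names: it shows only the blocking behavior, and leaves out the revise-and-retry loops and the monitoring after deploy.

```python
from typing import Callable

# Each gate returns True when the change may move on to the next stage.
Gate = Callable[[dict], bool]


def run_gates(change: dict, gates: list[tuple[str, Gate]]) -> str:
    """Return 'merged' if every gate passes, else report the first gate that blocks."""
    for name, gate in gates:
        if not gate(change):
            return f"blocked at {name}"
    return "merged"


gates = [
    ("automated tests", lambda c: c["tests_passed"]),
    ("human review", lambda c: c["approved_by"] is not None),
]

print(run_gates({"tests_passed": True, "approved_by": None}, gates))
print(run_gates({"tests_passed": True, "approved_by": "reviewer"}, gates))
```

The first change is blocked at human review and the second is merged. The order of the gates is deliberate: cheap automated checks run first, so human attention is spent only on changes that already pass them.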

Quick summary

Research preview means the feature should be used carefully.
Costs can climb quickly.
AI verification doesn't replace human accountability.
Benchmarks are useful, but not the final word.
Large agentic systems need solid guardrails.

Closing

The most important thing to understand is this:

Claude Opus 4.8 isn't just about a slightly better model. It's about more reliable AI work.

The improvement Anthropic highlights most isn't raw intelligence but honesty: the model is supposed to flag uncertainty more often, and less often present weak results as strong.

Dynamic workflows is the bigger workflow shift. Claude Code goes from being an assistant that helps with one task, to being a coordinator that can plan, split, delegate, check, and gather work.

It's powerful. But it also requires mature use.

The best way to think about this isn't "now AI can do everything alone". It's more like "now AI can take on bigger parts of the work process, but we still have to give it clear goals, good tests, the right limits, and human accountability".

For developers, product teams, and tech leaders, this is a clear signal of where AI tools are heading: from chat, to agent, to coordinated workflow.

Glossary

Term	Definition
Agent	An AI system that can work toward a goal across multiple steps, often by using tools, files, or commands.
Agentic task	A task where the AI has to plan and act across multiple steps, not just answer a single question.
Anthropic	The company behind Claude.
Benchmark	A standardized test for comparing models or systems.
Claude	Anthropic's AI assistant and model family.
Claude Code	Anthropic's coding tool where Claude is used directly in development work.
Dynamic workflows	A feature in Claude Code where Claude can plan large tasks, split them into sub-tasks, run many subagents in parallel, and verify the results.
Effort control	A setting that lets the user tune how much effort Claude puts into a response.
Fast mode	A faster mode for Opus 4.8. Anthropic says fast mode for Opus 4.8 can run at 2.5x the speed and is now three times cheaper than it was for earlier models.
Opus	Anthropic's most advanced Claude model line.
Prompt cache	A technique where parts of a prompt can be reused, so the system doesn't have to reprocess the same information.
Research preview	An early-access version of a feature that is still under development.
Subagent	A smaller AI agent responsible for a sub-task.
Token	A small text unit that the model reads or writes.
Ultracode	A Claude Code-specific setting that pins effort to xhigh and lets Claude automatically decide when a workflow should be used.
Verification	Checking results before they're handed on.

Sources and resources

Anthropic — Introducing Claude Opus 4.8 — Anthropic's official launch post for Opus 4.8.
Claude blog — Introducing dynamic workflows in Claude Code — Anthropic's official introduction of dynamic workflows, including the Bun example.
Claude Code changelog — Ongoing updates to the Claude Code documentation.
Anthropic — Claude Opus 4.8 System Card (PDF) — Official system card documenting the model's behavior, calibration, and honesty.
Reuters on Opus 4.8 — Mythos and the Anthropic launch.
The Verge on Opus 4.8 — Honesty, effort control, and dynamic workflows.
Axios on Opus 4.8 — Anthropic's model strategy.