Claude Opus 4.8: The Complete Guide to What's New and How to Use It

Claude Opus 4.8 is Anthropic's latest flagship model, released on May 28, 2026. Its model ID is claude-opus-4-8, and it is currently the most capable Claude available to general users. This guide focuses on what changed from 4.7 and how to use it well in practice, not just launch news. Every spec and figure below is based on Anthropic's official announcement and documentation.

Key specs at a glance

Claude Opus 4.8 is an upgrade aimed at stronger performance across coding, agentic tasks, and professional knowledge work compared with 4.7. Foundational specs such as the 1M-token context window and 128K max output stay the same as 4.7, and pricing is unchanged.

Model ID: claude-opus-4-8
Context window: 1M tokens (default on the Claude API, Amazon Bedrock, Vertex AI / 200K on Microsoft Foundry)
Max output: 128K tokens (synchronous Messages API; up to 300K on the Batch API with a beta header)
Standard pricing: $5 per 1M input tokens, $25 per 1M output tokens (same as 4.7)
Released: May 28, 2026 (about six weeks after Opus 4.7)

4.7 → 4.8: what changed

Anthropic described 4.8 as a "modest but tangible" improvement. The main differences are summarized below.

Item	Opus 4.7	Opus 4.8
Context window	1M tokens	1M tokens (same)
Max output	128K tokens	128K tokens (same)
Standard price (in/out, per 1M)	$5 / $25	$5 / $25 (same)
Effort control (claude.ai·Cowork)	None	Added (all plans)
Mid-conversation system messages	None	Supported (no beta header)
Unremarked flaws in its own code	Baseline	~4× fewer
Fast mode cost	Baseline	~3× cheaper

The biggest change: honesty

The improvement Anthropic emphasized most in this release is honesty. AI models often jump to conclusions, confidently claiming progress on thin evidence. Early testers reported that Opus 4.8 is more likely to flag uncertainty about its own work and less likely to make unsupported claims. In Anthropic's own evaluations, Opus 4.8 is about four times less likely than its predecessor to let flaws in code it wrote pass unremarked.

On alignment, Anthropic's alignment team concluded that Opus 4.8 reaches new highs on prosocial traits such as supporting user autonomy, while rates of misaligned behavior (such as deception or cooperation with misuse) are substantially lower than 4.7 and similar to its best-aligned model, Claude Mythos Preview. Full details are in the official System Card.

Effort control — now on every plan

Alongside Opus 4.8, claude.ai and Cowork gained an effort control. Next to the model selector, you can choose how much effort Claude puts into a response. At higher effort Claude thinks more often and more deeply for better quality; at lower effort it responds faster and consumes rate limits more slowly. This control is available on all plans.

The default is high across all surfaces. In Claude Code you can pick xhigh ("extra") or "max" for harder tasks and long-running async work. Note that the token allocation behind each level changed in 4.8 (medium allows somewhat more thinking, high somewhat less, and xhigh substantially more), so if you tuned cost or latency against 4.7, re-baseline at the same level. Opus 4.8 also uses adaptive thinking, triggering reasoning only when a turn needs it, reducing wasted thinking tokens versus 4.7 at the same effort level.

Fast mode: faster and cheaper

Opus 4.8's fast mode runs the same model at up to 2.5× the output speed. According to Anthropic, fast mode for 4.8 is three times cheaper than fast mode on previous models. Pricing is $10 per 1M input tokens and $50 per 1M output tokens, and it is currently a research preview on the Claude API.

What changed for developers

There are no breaking API changes when moving from 4.7 to 4.8. It is designed to perform well out of the box on existing 4.7 prompts and evals; just change the model name from claude-opus-4-7 to claude-opus-4-8 (or update aliases). Do re-check effort, which was recalibrated as described above.

Mid-conversation system messages: send role: "system" messages after a user turn during a long session (subject to placement rules). You can update instructions without restating the full system prompt, preserving prompt-cache hits on earlier turns and lowering input cost. No beta header required.
Lower prompt-cache minimum: the minimum cacheable prompt length is now 1,024 tokens, lower than on 4.7. Prompts too short to cache on 4.7 can now create cache entries with no code changes.
Documented refusal stop details: refusal responses now include a category (cyber, bio, or null) and a human-readable explanation, so you can route different classes of refusal to the right next step.
High-resolution image input: up to 2,576 pixels on the long edge (same as 4.7).

Also launching alongside

Dynamic workflows: a research-preview feature in Claude Code where Claude plans the work, runs hundreds of parallel subagents in one session, then verifies outputs before reporting back. Available in Claude Code for Enterprise, Team, and Max plans.
Effort control: the claude.ai and Cowork effort control described above, available on all plans.
Messages API system entries: update instructions mid-task without breaking the prompt cache.

Availability and pricing

Claude Opus 4.8 is available everywhere from launch day. On claude.ai it is available to Pro, Max, Team, and Enterprise users; developers can use it via the Claude API and on Amazon Web Services, Google Cloud (Vertex AI), and Microsoft Foundry.

Standard pricing: $5 input / $25 output (per 1M tokens), unchanged from 4.7
Up to 90% savings with prompt caching, 50% savings with batch processing
The full 1M context is offered at standard pricing (a 900K-token request is billed at the same per-token rate as a 9K-token request)
US-only inference (inference_geo) carries a 1.1× multiplier on input and output

Practical tips

Coding and high-autonomy work: the default high is usually enough in Claude Code, but set xhigh explicitly for hard tasks and long-running async work.
Upgrading from 4.7: change the model name, then re-measure cost and latency at the same effort level before adjusting.
Long agent loops: use mid-conversation system messages to update permissions, token budgets, or environment context mid-run while keeping the cache.
Lean on honesty: 4.8 tends to flag problems in inputs and outputs first. In analysis and review work, treat the uncertainties and warnings it raises as checkpoints rather than ignoring them.

What's next

Anthropic frames Opus 4.8 as a modest but tangible step over its predecessor, and says it is working on models that deliver Opus-class capabilities at lower cost, as well as a new class of model with intelligence beyond Opus. The latter is the Claude Mythos Preview family, which a small number of organizations currently use for cybersecurity work as part of Project Glasswing; Anthropic says it expects to bring Mythos-class models to all customers "in the coming weeks" once stronger cyber safeguards are in place. (Timelines are per Anthropic's statements and may change.)

Sources: Anthropic, "Introducing Claude Opus 4.8" (May 28, 2026); Claude Platform official docs (models overview, migration guide, pricing, context windows). Some benchmark figures cited above come from early-tester comments quoted in Anthropic's announcement.

Claude Opus 4.8: The Complete Guide to What's New and How to Use It

Key specs at a glance

4.7 → 4.8: what changed

The biggest change: honesty

Effort control — now on every plan

Fast mode: faster and cheaper

What changed for developers

Also launching alongside

Availability and pricing

Practical tips

What's next

Keep reading

Claude Models Compared: Haiku vs Sonnet vs Opus, and Which to Use

Claude vs Gemini: Key Differences and Which to Choose

Claude vs ChatGPT: Key Differences and When to Use Each (2026)

Claude Opus 4.7: The Complete Guide to New Features and Specs

Have a question or want to share how you use Claude?