Claude Opus 4.8 is Anthropic's latest flagship model, released on May 28, 2026. Its model ID is claude-opus-4-8, and it is currently the most capable Claude available to general users. This guide focuses on what changed from 4.7 and how to use it well in practice, not just launch news. Every spec and figure below is based on Anthropic's official announcement and documentation.
Key specs at a glance
Claude Opus 4.8 is an upgrade aimed at stronger performance across coding, agentic tasks, and professional knowledge work compared with 4.7. Foundational specs such as the 1M-token context window and 128K max output stay the same as 4.7, and pricing is unchanged.
- Model ID:
claude-opus-4-8 - Context window: 1M tokens (default on the Claude API, Amazon Bedrock, Vertex AI / 200K on Microsoft Foundry)
- Max output: 128K tokens (synchronous Messages API; up to 300K on the Batch API with a beta header)
- Standard pricing: $5 per 1M input tokens, $25 per 1M output tokens (same as 4.7)
- Released: May 28, 2026 (about six weeks after Opus 4.7)
4.7 → 4.8: what changed
Anthropic described 4.8 as a "modest but tangible" improvement. The main differences are summarized below.
| Item | Opus 4.7 | Opus 4.8 |
|---|---|---|
| Context window | 1M tokens | 1M tokens (same) |
| Max output | 128K tokens | 128K tokens (same) |
| Standard price (in/out, per 1M) | $5 / $25 | $5 / $25 (same) |
| Effort control (claude.ai·Cowork) | None | Added (all plans) |
| Mid-conversation system messages | None | Supported (no beta header) |
| Unremarked flaws in its own code | Baseline | ~4× fewer |
| Fast mode cost | Baseline | ~3× cheaper |
The biggest change: honesty
The improvement Anthropic emphasized most in this release is honesty. AI models often jump to conclusions, confidently claiming progress on thin evidence. Early testers reported that Opus 4.8 is more likely to flag uncertainty about its own work and less likely to make unsupported claims. In Anthropic's own evaluations, Opus 4.8 is about four times less likely than its predecessor to let flaws in code it wrote pass unremarked.
On alignment, Anthropic's alignment team concluded that Opus 4.8 reaches new highs on prosocial traits such as supporting user autonomy, while rates of misaligned behavior (such as deception or cooperation with misuse) are substantially lower than 4.7 and similar to its best-aligned model, Claude Mythos Preview. Full details are in the official System Card.
Effort control — now on every plan
Alongside Opus 4.8, claude.ai and Cowork gained an effort control. Next to the model selector, you can choose how much effort Claude puts into a response. At higher effort Claude thinks more often and more deeply for better quality; at lower effort it responds faster and consumes rate limits more slowly. This control is available on all plans.
The default is high across all surfaces. In Claude Code you can pick xhigh ("extra") or "max" for harder tasks and long-running async work. Note that the token allocation behind each level changed in 4.8 (medium allows somewhat more thinking, high somewhat less, and xhigh substantially more), so if you tuned cost or latency against 4.7, re-baseline at the same level. Opus 4.8 also uses adaptive thinking, triggering reasoning only when a turn needs it, reducing wasted thinking tokens versus 4.7 at the same effort level.
Fast mode: faster and cheaper
Opus 4.8's fast mode runs the same model at up to 2.5× the output speed. According to Anthropic, fast mode for 4.8 is three times cheaper than fast mode on previous models. Pricing is $10 per 1M input tokens and $50 per 1M output tokens, and it is currently a research preview on the Claude API.
What changed for developers
There are no breaking API changes when moving from 4.7 to 4.8. It is designed to perform well out of the box on existing 4.7 prompts and evals; just change the model name from claude-opus-4-7 to claude-opus-4-8 (or update aliases). Do re-check effort, which was recalibrated as described above.
- Mid-conversation system messages: send
role: "system"messages after a user turn during a long session (subject to placement rules). You can update instructions without restating the full system prompt, preserving prompt-cache hits on earlier turns and lowering input cost. No beta header required. - Lower prompt-cache minimum: the minimum cacheable prompt length is now 1,024 tokens, lower than on 4.7. Prompts too short to cache on 4.7 can now create cache entries with no code changes.
- Documented refusal stop details: refusal responses now include a category (cyber, bio, or null) and a human-readable explanation, so you can route different classes of refusal to the right next step.
- High-resolution image input: up to 2,576 pixels on the long edge (same as 4.7).
Also launching alongside
- Dynamic workflows: a research-preview feature in Claude Code where Claude plans the work, runs hundreds of parallel subagents in one session, then verifies outputs before reporting back. Available in Claude Code for Enterprise, Team, and Max plans.
- Effort control: the claude.ai and Cowork effort control described above, available on all plans.
- Messages API system entries: update instructions mid-task without breaking the prompt cache.
Availability and pricing
Claude Opus 4.8 is available everywhere from launch day. On claude.ai it is available to Pro, Max, Team, and Enterprise users; developers can use it via the Claude API and on Amazon Web Services, Google Cloud (Vertex AI), and Microsoft Foundry.
- Standard pricing: $5 input / $25 output (per 1M tokens), unchanged from 4.7
- Up to 90% savings with prompt caching, 50% savings with batch processing
- The full 1M context is offered at standard pricing (a 900K-token request is billed at the same per-token rate as a 9K-token request)
- US-only inference (inference_geo) carries a 1.1× multiplier on input and output
Practical tips
- Coding and high-autonomy work: the default high is usually enough in Claude Code, but set
xhighexplicitly for hard tasks and long-running async work. - Upgrading from 4.7: change the model name, then re-measure cost and latency at the same effort level before adjusting.
- Long agent loops: use mid-conversation system messages to update permissions, token budgets, or environment context mid-run while keeping the cache.
- Lean on honesty: 4.8 tends to flag problems in inputs and outputs first. In analysis and review work, treat the uncertainties and warnings it raises as checkpoints rather than ignoring them.
What's next
Anthropic frames Opus 4.8 as a modest but tangible step over its predecessor, and says it is working on models that deliver Opus-class capabilities at lower cost, as well as a new class of model with intelligence beyond Opus. The latter is the Claude Mythos Preview family, which a small number of organizations currently use for cybersecurity work as part of Project Glasswing; Anthropic says it expects to bring Mythos-class models to all customers "in the coming weeks" once stronger cyber safeguards are in place. (Timelines are per Anthropic's statements and may change.)
Sources: Anthropic, "Introducing Claude Opus 4.8" (May 28, 2026); Claude Platform official docs (models overview, migration guide, pricing, context windows). Some benchmark figures cited above come from early-tester comments quoted in Anthropic's announcement.