speedy_devv · koen_salo

Claude Opus 4.6

Anthropic's upgraded Opus flagship ships with 1M context GA, 128K output, and the same $5/$25 pricing.

Opus 4.6 is the upgraded flagship from Anthropic. Planning takes more thought. Agent runs stay coherent for longer. Large codebases feel less hostile. And Claude catches its own bugs before you do. It is also the first Opus-class release to ship with a generally available 1M token window, and response output now reaches 128K tokens.

Coding is the headline change, and the price tag stays flat at $5/$25 per million tokens while scores on the hardest public evals moved up across the board. The raw numbers sit in the benchmark section below.

Key Specs

| Spec | Details |
| --- | --- |
| API ID | claude-opus-4-6 |
| Release Date | February 5, 2026 |
| Context Window | 1M tokens (GA as of March 2026) |
| Max Output | 128,000 tokens |
| Pricing | $5 input / $25 output per 1M tokens |
| Status | Active, current recommended Opus |

What Changed: The Coding Improvements

Anthropic dogfoods Claude. Every Anthropic engineer lives in Claude Code day to day, and nothing ships until it survives the internal use case first. The 4.6 gains are concrete and practical.

Planning is more careful. Before committing to an approach, the model sits with the problem longer, loops back through its own reasoning, notices logic errors sooner, and lands a stronger first draft on hard tasks.

Agent runs stay coherent. Older models drifted after a while. Here, focus holds across long sessions. A workflow that fires off tool call after tool call, dozens deep, reaches the finish line more often now.

Large codebases feel less hostile. Navigating big projects, reading them, and changing them all improved. Claude keeps a clearer picture of layout and house conventions across a long session.

Review and debugging hit harder. Self-correction is noticeably better, and code reviews are more thorough. Tracing a bug through a chain of dependencies now needs far less hand-holding from you.

Easy work moves fast. The deeper reasoning gets saved for the hard steps, and Opus 4.6 no longer dwells on the obvious ones. Catch it overthinking something simple? Drop the default from high to medium with /effort.

Benchmark Results

New records landed in several categories.

| Benchmark | Score | Notable Comparison |
| --- | --- | --- |
| Terminal-Bench 2.0 | 65.4% | GPT-5.2: 64.7% |
| GDPval-AA Elo | 1,606 | 144 Elo above GPT-5.2, 190 above Opus 4.5 |
| Humanity's Last Exam | Leading | Highest among all frontier models |
| BrowseComp | Leading | Best at finding hard-to-locate information online |
| OSWorld | 72.7% | State-of-the-art for computer use |
| MRCR v2 (8-needle) | 78.3% | Highest among frontier models at 1M context |

Inside Claude Code, the benchmark to watch is Terminal-Bench 2.0. It scores real terminal work across coding, sysadmin tasks, and file handling. Top slot here means Opus 4.6 is the strongest pick for what a developer actually does at the command line all day.

GDPval-AA sits on the opposite end of the eval spectrum. It measures knowledge work that drives real economic value, in finance, legal, and the rest of the white-collar stack. The lead over the next-best industry model is wide.
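Elo gaps translate directly into expected head-to-head win rates. Assuming GDPval-AA uses the conventional 400-point Elo scale (an assumption; the eval's exact scaling is not stated here), the standard expectation formula puts a 144-point lead at roughly a 70% win probability:

```python
# Expected head-to-head win probability from an Elo gap,
# using the standard logistic Elo formula on the 400-point scale.
def elo_win_probability(elo_gap: float) -> float:
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / 400.0))

print(f"{elo_win_probability(144):.3f}")  # lead over GPT-5.2
print(f"{elo_win_probability(190):.3f}")  # lead over Opus 4.5
```

A 190-point gap over Opus 4.5 works out to roughly a 75% expected win rate under the same assumption.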

The MRCR v2 number matters for a different reason. "Context rot" is the usual complaint, where answers degrade as the chat stretches on. That drift shrinks here. Across very long windows, Opus 4.6 keeps its grip on small details and pulls up buried facts the prior version missed. The 78.3% score is a real change in how much of the window Claude can put to work.

Humanity's Last Exam tests broad multidisciplinary reasoning, and no frontier model tops Opus 4.6 on it. BrowseComp scores how well the model digs up information that is genuinely hard to find online. OSWorld grades real desktop computer use. The new release takes the crown on all three.

1M Token Context Window and 128K Output

As of March 2026, the full 1M window is generally available, and token pricing is flat end to end. The per-token rate on a 900K-token call matches the rate on a 9K one. No beta header is needed. Any legacy beta headers get silently dropped.

Media limits grew 6x at the GA launch. The per-request ceiling is now 600 images or PDF pages, versus 100 before. Rate limits stay at their full values no matter how long the context gets.

Output also jumped. The ceiling moved from 16K tokens to 128K, which lets Claude finish larger-output jobs in a single call. Whole modules or long analyses can now come back in one response instead of being chopped across many.

Inside Claude Code, the full 1M window ships on by default for Max, Team, and Enterprise plans. Anthropic reports a 15% drop in compaction events, so long conversations now survive end to end without lossy summarization kicking in. Whatever context-management workflow you already use still works. You just bump into the ceiling less often.
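A 1M window makes whole-repo prompts practical, but you still want to budget input against the space reserved for output. A minimal sketch, using the rough ~4-characters-per-token heuristic (the helper and constants below are illustrative, not part of any official SDK; real counts need a tokenizer):

```python
# Rough token budgeting for a single 1M-context request.
# ~4 chars/token is a common heuristic; real counts require a tokenizer.
CONTEXT_WINDOW = 1_000_000
RESERVED_FOR_OUTPUT = 128_000  # leave room for the max response

def estimate_tokens(text: str) -> int:
    return len(text) // 4 + 1

def pack_files(files: dict[str, str]) -> list[str]:
    """Greedily select files that fit the remaining input budget."""
    budget = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT
    packed = []
    for path, source in files.items():
        cost = estimate_tokens(source)
        if cost <= budget:
            packed.append(path)
            budget -= cost
    return packed
```

Greedy packing in file order is the simplest policy; sorting by relevance first would usually serve an agent better.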

Safety Profile

Smarter does not mean less safe. Anthropic runs an automated behavioral audit, and Opus 4.6 scored low on the behaviors that matter: deception, sycophancy, reinforcing user delusions, and going along with misuse. Its alignment sits level with Opus 4.5, the previous record holder for best-aligned frontier release.

Legitimate prompts also come through more often. Opus 4.6 posts the lowest rate of over-refusals in any recent Claude release. Real requests get blocked less.

The cybersecurity number is the headline. On one internal run, the model surfaced 500+ previously unknown high-severity zero-day flaws sitting in open-source libraries. Anthropic is pushing this harder, aiming the model at OSS projects to hunt down and patch the flaws buried inside. Security teams can drop Opus 4.6 into code review as a first-pass vulnerability scanner.

New API and Product Features

The model upgrade landed alongside several new features.

Adaptive thinking. Extended thinking used to be a binary toggle. Claude now picks its own moments to think harder. With effort set to high (the default), extended thinking kicks in wherever it helps. Four tiers are available to developers: low, medium, high (default), and max.

Context compaction (beta). When a long chat drifts toward the context ceiling, Claude now summarizes and compacts it on its own. Long-running tasks keep going instead of running out of room.

Agent teams (Claude Code research preview). Multiple Claude instances can now run in parallel as one coordinated team. Read-heavy jobs that fan out into independent pieces, like codebase reviews, are the sweet spot. Everything else lives in the agent teams guide.

Claude in PowerPoint (research preview). Layouts, fonts, and slide masters all get parsed by Claude so the output stays on brand, whether it is filling a template or spinning a deck up from scratch. Available on the Max, Team, and Enterprise plans.

Pricing

No price bump. The 1M window ships with unified pricing across the whole context length. The old 200K+ premium tier has been retired.

| Tier | Cost |
| --- | --- |
| All contexts | $5 input / $25 output per 1M tokens |
| Pro plan | $20/month |
| Max plan | $100/month |

Been on Opus 4.5 with your spend dialed in? The jump to 4.6 is free upside at the old price. And if long-context calls were paying the premium tier, the bill just dropped.
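Flat pricing makes the cost arithmetic trivial. A quick sketch at the $5/$25 per-million rates:

```python
# API cost at Opus 4.6's flat rates: $5 per 1M input tokens,
# $25 per 1M output tokens, regardless of context length.
INPUT_RATE = 5.00 / 1_000_000
OUTPUT_RATE = 25.00 / 1_000_000

def call_cost(input_tokens: int, output_tokens: int) -> float:
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# A near-full-context call: 900K in, 20K out.
print(f"${call_cost(900_000, 20_000):.2f}")  # → $5.00
```

Under the retired premium tier, that same 900K-token call would have billed at a higher long-context rate; now it prices identically to a short one.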

How to Use Opus 4.6 in Claude Code

One command changes the default model:

claude config set model claude-opus-4-6

For a single session, override it without touching the default:

claude --model claude-opus-4-6

The model ships everywhere: claude.ai, the Messages API, AWS Bedrock, and Google Vertex AI. On the API, the identifier to use is claude-opus-4-6.
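On the API side, the request is the standard Messages API body with the new model id. The sketch below only constructs the payload — sending it requires an API key and the official `anthropic` SDK or an HTTP client:

```python
# Minimal Messages API request body targeting Opus 4.6.
# Construction only; wire it to the anthropic SDK or a raw POST to send.
def opus_request(prompt: str, max_tokens: int = 8192) -> dict:
    return {
        "model": "claude-opus-4-6",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = opus_request("Summarize this repository's architecture.")
```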

Opus 4.6 vs Opus 4.5: What Changed

| Feature | Opus 4.5 | Opus 4.6 |
| --- | --- | --- |
| Context window | 200K (standard), 1M (API beta) | 1M (GA, unified pricing) |
| Max output tokens | 16,384 | 128,000 |
| Terminal-Bench 2.0 | Not tested on v2.0 | 65.4% (highest) |
| GDPval-AA Elo | 1,416 | 1,606 (+190 points) |
| MRCR v2 | Not tested | 78.3% |
| Over-refusals | Low | Lowest of any recent model |
| Adaptive thinking | Not available | Built in |
| Context compaction | Auto at 95% | Configurable threshold (beta) |
| Standard pricing | $5/$25 per 1M | $5/$25 per 1M (unchanged) |

Coding quality and longer agent runs are the headline gains. Everything 4.5 already did well carries over too: multi-agent delegation, token efficiency, the effort parameter. Day to day, the practical wins in Claude Code are the bigger output ceiling and adaptive thinking.

Model selection is simple. Reach for Opus 4.6 when reasoning depth is what the job needs. Sonnet is the right call on smaller tasks that want speed over depth. Pricing is now at parity, so there's no reason left on the bill to stick with the older flagship.

More in this guide

  • Every Claude Model
    One page, every Claude release. Specs, prices, benchmarks, and when to actually use each model from Claude 3 through Sonnet 4.6.
  • Claude 3.5 Sonnet v2 and Claude 3.5 Haiku
    October 2024 refresh shipped an upgraded Sonnet, a budget Haiku, and the first Claude model that could drive a desktop cursor.
  • Claude 3.5 Sonnet
    Released June 20, 2024, Anthropic's mid-tier model beat its own larger flagship on most benchmarks at a fifth of the cost.
  • Claude 3.7 Sonnet
    February 2025 release added hybrid reasoning. Claude could now pause, think through a hard problem step by step, then answer.
  • Claude 3
    The March 2024 lineup that split Opus, Sonnet, and Haiku into three tiers and set the template the rest of the industry copied.

Stop configuring. Start building.

SaaS builder templates with AI orchestration.

Get Build This Now


