
Claude Opus 4.7

Claude Opus 4.7 is Anthropic's April 16, 2026 flagship for Claude Code: stronger on hard coding, cyber-adjacent workflows, document reasoning, and long-running agentic tasks at the same $5/$25 pricing as Opus 4.6.

Claude Opus 4.7 is the first Claude release in a while that feels bigger than a point upgrade. The price is unchanged. The 1M context window is unchanged. The 128K output ceiling is unchanged. What changed is the part people actually feel inside Claude Code: the model is better at hard, ambiguous, long-running work that used to need constant supervision.

That shows up in three places.

  • It catches more of its own mistakes before acting.
  • It stays coherent for longer inside multi-step agent loops.
  • It lands better across domains that are not just "write code": cyber workflows, code review, dense screenshots, enterprise docs, contracts, diagrams, and other ambiguous source-heavy work.

If you already live in Claude Code, the short answer is simple: Opus 4.7 is the new default upgrade for high-stakes engineering sessions. If you want the workflow advice, read the dedicated Opus 4.7 best practices guide. If you want concrete examples by domain, read the companion Opus 4.7 use cases page.

Quick Verdict

Use Opus 4.7 when the work is expensive to get wrong:

  • complex refactors across many files
  • debugging with incomplete or conflicting evidence
  • code review where subtle bugs matter
  • cyber-defense, vulnerability research, or security auditing
  • document-heavy work in legal, finance, and operations
  • multimodal tasks with dense screenshots, diagrams, or UI mocks

Stay on Sonnet for smaller daily edits where speed and cost matter more than maximum reasoning depth.
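The Opus-vs-Sonnet split above can be expressed as a tiny routing rule. This is an illustrative sketch, not an official API: the helper name and thresholds are made up, and the Sonnet model ID is a hypothetical placeholder (only `claude-opus-4-7` appears on this page).

```python
# Illustrative model-routing sketch based on this page's advice: send
# expensive-to-get-wrong work to Opus 4.7, everyday edits to Sonnet.
# Helper name, thresholds, and the Sonnet ID are hypothetical.

def pick_model(files_touched: int, high_stakes: bool, dense_inputs: bool) -> str:
    """Route work by the traits this page uses to split Opus from Sonnet."""
    if high_stakes or dense_inputs or files_touched > 5:
        return "claude-opus-4-7"    # refactors, review, security, dense docs
    return "claude-sonnet-4-6"      # hypothetical ID: fast, cheap daily edits
```

The point is not the exact thresholds; it is that the routing decision should be explicit rather than "Opus for everything."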

Key Specs

| Spec | Details |
| --- | --- |
| API ID | claude-opus-4-7 |
| Release date | April 16, 2026 |
| Context window | 1M tokens |
| Max output | 128,000 tokens |
| Pricing | $5 input / $25 output per 1M tokens |
| Thinking mode | Adaptive thinking |
| Effort levels | low, medium, high, xhigh, max |
| Claude Code default effort | xhigh |
| Knowledge cutoff | January 2026 |
| Status | Current Opus flagship |

Claude Opus 4.7 vs Opus 4.6

The basic story is not "a bit smarter." It is "more reliable on the hard slice of work."

| Area | Opus 4.6 | Opus 4.7 |
| --- | --- | --- |
| CursorBench | 58% | 70% |
| Rakuten-SWE-Bench | baseline | 3x more production tasks resolved |
| XBOW visual-acuity | 54.5% | 98.5% |
| OfficeQA Pro | baseline | 21% fewer errors |
| BigLaw Bench | lower | 90.9% at high effort |
| Notion Agent tool errors | baseline | about one third as many |
| Resolution support | 1568px / 1.15MP | 2576px / 3.75MP |
| Default Claude Code effort | high | xhigh |
| Thinking control | adaptive, older migration path | adaptive only, fixed-budget thinking removed |
| Tool use style | more tool-happy | more selective, more reasoning-first |
| Subagent behavior | delegates more freely | delegates more selectively |

The important part is behavioral, not just numerical. Anthropic and launch partners describe Opus 4.7 as more literal about instructions, more willing to verify assumptions, better at pushing through long tasks, and less likely to loop or fail silently halfway through.

What Actually Improved in Practice

1. Self-verification shows up more often

One of the clearest launch signals is that Opus 4.7 does more checking before it commits to an answer or a code change.

Anthropic's launch page includes Vercel describing a new behavior: the model does proofs on systems code before starting work. Hex says it is better at admitting when data is missing instead of inventing plausible fallback logic. That matters because a lot of real engineering pain is not syntax failure. It is confident-but-wrong reasoning on incomplete context.

Inside Claude Code, this tends to look like:

  • reading one more file before editing
  • checking a call site before changing a type
  • confirming an assumption about state shape or schema
  • pausing to validate a concurrency or migration path

That extra step is often the difference between a clean first pass and a 40-minute loop.

2. Long-running agentic work derails less

Devin reported that Opus 4.7 works coherently for hours and pushes through difficult tasks instead of giving up early. Notion reported a 14% gain on complex multi-step workflows with roughly one third of the tool errors of Opus 4.6. Genspark called out loop resistance, consistency, and graceful recovery as the three production traits that matter most.

That makes Opus 4.7 a better fit for:

  • longer refactors
  • async coding agents
  • CI and automation workflows
  • service-wide review passes
  • investigations where the model has to read, compare, and revise repeatedly

3. Hard coding moved, not just easy coding

CursorBench climbing from 58% to 70% matters because it is closer to the vague, messy, real prompts developers actually hand to coding agents. Rakuten's 3x improvement on production SWE tasks matters because it suggests the gain is not limited to toy examples or benchmark-friendly problems.

CodeRabbit reported over 10% better recall on review workloads while keeping precision stable. Warp and Qodo both called out harder bug classes that 4.7 now catches or resolves. Factory reported a 10-15% lift in task success for Droids with fewer tool errors and more reliable follow-through.

The pattern is consistent: Opus 4.7 is not just "more eloquent." It clears a harder class of engineering work.

4. Dense vision inputs are finally first-class

The resolution jump is one of the most underrated changes in the release. Moving from 1568px / 1.15MP to 2576px / 3.75MP is not cosmetic. It changes what you can trust the model to read without cropping.

That especially helps when the input is:

  • a packed dashboard screenshot
  • a terminal capture with small text
  • a technical diagram
  • a design mockup with dense labels
  • a scanned contract table or document excerpt
  • a chemistry or life-sciences figure

XBOW's visual-acuity jump from 54.5% to 98.5% is the sharpest proof that the added pixels translate into real utility.

5. It is stronger outside pure coding

Anthropic's release positioned Opus 4.7 as stronger on coding, enterprise workflows, and long-running agentic tasks. The partner examples back that up:

  • Cybersecurity: XBOW says their biggest visual pain point in autonomous pentesting effectively disappeared.
  • Legal: Harvey reports 90.9% on BigLaw Bench at high effort, with better reasoning on ambiguous edits and review tables.
  • Docs and enterprise reasoning: Databricks reported 21% fewer errors on OfficeQA Pro.
  • Finance and research: Applied AI testers highlighted stronger disclosure discipline and better long-context performance.
  • Life sciences: Solve Intelligence called out gains on chemical structures and technical diagrams.
  • Design and UI: Lovable said the design taste is strong enough that the model makes choices they would actually ship.

That makes Opus 4.7 a broader "high-stakes knowledge work" model, not just a coding model.

Benchmark Results That Matter

The full benchmark wall is useful for launch day, but only some numbers map cleanly to user value.

| Benchmark | Why it matters |
| --- | --- |
| CursorBench: 70% | Closer to real coding-agent prompts than narrow coding evals |
| Rakuten-SWE-Bench: 3x more resolved | Signals movement on production engineering tasks, not just toy repos |
| XBOW visual-acuity: 98.5% | Proves dense image understanding is materially better |
| BigLaw Bench: 90.9% | Strong signal for contract and legal-review use cases |
| OfficeQA Pro: 21% fewer errors | Useful proxy for enterprise docs and document reasoning |
| Notion Agent: +14%, fewer tool errors | Good indicator for multi-step agent reliability |
| CodeRabbit: recall +10% | Strong signal for review and bug-finding workflows |

If you are choosing a model for Claude Code, CursorBench, Rakuten, Notion, CodeRabbit, and XBOW are the most actionable signals in this release.

Where Opus 4.7 Lands Hardest

Claude Code engineering sessions

This is the obvious one. Opus 4.7 is better when the task is vague, multi-file, or expensive to redo. API migrations, cross-cutting refactors, concurrency bugs, architecture reviews, and codebase-wide cleanups all benefit from the model being more literal, more patient, and more verification-heavy.

Security and cyber-defense workflows

Opus 4.7 matters in security because coding capability and cyber capability are now tightly linked. Project Glasswing, announced on April 7, 2026, is about Mythos Preview, not Opus 4.7. But Anthropic explicitly references Glasswing in the April 16, 2026 Opus 4.7 launch to explain why new cyber safeguards matter here: Opus 4.7 is the first public model where they are testing some of those safeguards in the real world.

That gives you two conclusions:

  • the model is strong enough to be useful for serious defensive security work
  • the model is strong enough that Anthropic is actively constraining risky misuse

If you do legitimate vulnerability research, penetration testing, or red-teaming, Anthropic points professionals toward the Cyber Verification Program.

Legal, finance, and enterprise operations

Opus 4.7 is a strong fit when the work is: compare, verify, summarize, and avoid hallucinating the missing pieces. Contracts, audit trails, review tables, financial memos, policy docs, and internal operating documents all benefit from the model's stronger calibration and document reasoning.

Multimodal product, design, and R&D work

Better screenshot reading and diagram handling make it more useful for design critique, product QA, life sciences workflows, patents, and technical documentation. If the source material used to require manual zooming or cropping, Opus 4.7 is much more usable.

For more concrete domain examples and prompt ideas, see Claude Opus 4.7 use cases.

Cyber, Risk, and Safety: Why This Release Is Different

Anthropic's launch messaging around Opus 4.7 is unusual because it does not just celebrate capability. It places the release inside a live cyber-risk story.

Anthropic says Opus 4.7 is less capable than Mythos Preview, but still strong enough that they experimented during training with differentially reducing cyber capabilities relative to Mythos. They also shipped automated safeguards that detect and block requests indicating prohibited or high-risk cybersecurity use.

That matters for anyone writing about the model because it changes the angle:

  • Opus 4.7 is not just a faster copilot.
  • It sits in the category where cyber benefit and cyber risk now move together.
  • Defensive workflows are a legitimate strength area.
  • Unsafe or disallowed offensive workflows are an explicit deployment concern.

In practical terms, that means you should position Opus 4.7 as strong for:

  • secure code review
  • defensive audit passes
  • threat modeling
  • vulnerability triage
  • pentest support inside approved programs
  • security documentation and remediation planning

Not as a generic "do anything cyber" engine.

Vision: The 3x Resolution Upgrade

Opus 4.7 is the first Claude release where the image pipeline deserves its own buying decision.

The new resolution ceiling means:

  • less cropping before sending screenshots
  • better reliability on small text and dense UIs
  • stronger interpretation of technical diagrams
  • cleaner mapping from returned coordinates to real pixels

The trade-off is token cost. Anthropic notes that a full-resolution image can consume roughly 4,784 tokens instead of the roughly 1,600-token range people were used to. For image-heavy workflows, downsampling is now part of cost control.
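The token figures above make the trade-off concrete. The arithmetic below uses only numbers quoted on this page (~4,784 tokens full-resolution vs the ~1,600-token range, at $5 per 1M input tokens); the helper itself is just a back-of-envelope calculator.

```python
# Back-of-envelope image cost check using the figures quoted above.
# Token counts and the $5/1M input price are from this page; the helper
# is plain arithmetic, not an SDK call.

INPUT_PRICE_PER_TOKEN = 5.0 / 1_000_000   # $5 per 1M input tokens

def image_cost(tokens_per_image: int, images: int) -> float:
    """Dollar cost of sending `images` screenshots at a given token size."""
    return tokens_per_image * images * INPUT_PRICE_PER_TOKEN

full_res    = image_cost(4_784, 1_000)   # 1,000 full-resolution screenshots
downsampled = image_cost(1_600, 1_000)   # same batch, downsampled first
# full_res ≈ $23.92 vs downsampled = $8.00: a ~3x swing per batch
```

At screenshot-heavy QA volumes, that roughly 3x per-image difference is why downsampling is now a deliberate cost lever rather than an afterthought.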

Best Practices for Opus 4.7 in Claude Code

Anthropic's own guidance for Opus 4.7 inside Claude Code is more behavioral than technical. The theme is: delegate better, batch context earlier, and reduce unnecessary back-and-forth.

The high-signal habits are:

  • put the real task in the first turn: intent, constraints, file paths, acceptance criteria
  • reduce user turns where possible, because interactive back-and-forth adds reasoning overhead
  • keep xhigh as the default for serious coding work
  • drop to high when you need to control spend across many parallel sessions
  • reserve max for very hard work and eval-style ceiling testing
  • tell the model explicitly when to use tools and when to fan out to subagents
  • use auto mode when the task is well-scoped and you trust the overall direction
  • start a fresh session when the task changes, instead of dragging stale context forward

The full workflow version of that is in Claude Opus 4.7 best practices.

Migration Notes from Opus 4.6

If you are moving API workloads from 4.6 to 4.7, do not just swap the model name and ship.

Adaptive thinking replaces fixed-budget thinking

The older thinking: { type: "enabled", budget_tokens: N } flow is gone for Opus 4.7. Use adaptive thinking and effort levels instead.
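A shape-only sketch of that migration, written as plain dicts. The old fixed-budget block matches the flow quoted above; the new request follows this page's description of adaptive thinking plus effort levels, so treat the exact field layout as illustrative rather than SDK-verified.

```python
# Shape-only migration sketch. The fixed-budget thinking block is removed;
# an effort level replaces it. Field layout is illustrative, based on this
# page's description, not a verified SDK request schema.

old_request = {
    "model": "claude-opus-4-6",
    "thinking": {"type": "enabled", "budget_tokens": 8_000},  # rejected on 4.7
}

new_request = {
    "model": "claude-opus-4-7",
    # no thinking budget: adaptive thinking paces itself
    "effort": "xhigh",   # low | medium | high | xhigh | max, per the spec table
}
```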

Non-default sampling parameters are gone

If your code still sets temperature, top_p, or top_k away from default values, Opus 4.7 returns a 400. Remove those knobs and shape behavior through prompting and effort.
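A defensive cleanup pass for that case can be pure dict surgery, no SDK assumed: strip the knobs this page says now trigger a 400 before swapping the model ID.

```python
# Drop the sampling knobs that this page says return a 400 on Opus 4.7.
# Pure dict manipulation; no SDK or network call involved.

REMOVED_KNOBS = ("temperature", "top_p", "top_k")

def strip_sampling_params(request: dict) -> dict:
    """Return a copy of the request without non-default sampling parameters."""
    return {k: v for k, v in request.items() if k not in REMOVED_KNOBS}

req = {"model": "claude-opus-4-7", "temperature": 0.2, "top_p": 0.9}
clean = strip_sampling_params(req)   # {"model": "claude-opus-4-7"}
```

Running every outbound request through a filter like this is safer than hunting for stray `temperature` assignments across a codebase.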

Thinking display changed

Thinking blocks are empty by default unless you explicitly opt in to summarized display. If your UI depended on visible thinking text, you need to update it.

The tokenizer changed

Anthropic says the same input can map to roughly 1.0x to 1.35x the prior token count depending on content. Re-baseline cost and token estimates before assuming old budgets still apply.
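Re-baselining against that 1.0x-1.35x range is simple arithmetic. The multipliers below are the ones quoted above; the helper name is made up.

```python
# Re-baselining helper for the 1.0x-1.35x tokenizer shift described above.
# Multipliers come from this page; the function name is illustrative.

def rebaseline(old_tokens: int, low: float = 1.0, high: float = 1.35) -> tuple[int, int]:
    """Token range to budget on the new tokenizer for a known old count."""
    return round(old_tokens * low), round(old_tokens * high)

rebaseline(200_000)   # (200000, 270000): budget to the top of the range
```

Planning to the top of the range avoids silent truncation or overruns on workloads whose content happens to tokenize badly.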

High-resolution images cost more

If you were previously sending screenshots casually, 4.7 makes image quality much better and image token cost materially higher. Treat downsampling as a conscious lever.

Task budgets are worth testing

Anthropic introduced task budgets as a public beta so models can self-pace across a full agentic run. If you run longer loops, test them now rather than waiting until a runaway session bites you.
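This page does not show the beta's request shape, so rather than guess at parameter names, here is a client-side stand-in that captures the same idea: track cumulative spend across an agent loop and stop once a ceiling you set is crossed. Class and method names are entirely hypothetical.

```python
# Client-side approximation of a task budget: a tiny tracker that halts an
# agent loop once cumulative output tokens cross a self-imposed ceiling.
# This is a stand-in sketch, not the beta feature's actual API.

class TaskBudget:
    def __init__(self, max_output_tokens: int):
        self.max = max_output_tokens
        self.spent = 0

    def record(self, output_tokens: int) -> None:
        """Add one step's output-token usage to the running total."""
        self.spent += output_tokens

    def exhausted(self) -> bool:
        return self.spent >= self.max

budget = TaskBudget(max_output_tokens=500_000)
budget.record(120_000)
budget.exhausted()   # False: the loop may continue
```

Even once the native feature lands, an outer guard like this is cheap insurance against a runaway session.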

Pricing and Cost

Opus 4.7 kept the same headline pricing as Opus 4.6:

| Tier | Cost |
| --- | --- |
| Input | $5 per 1M tokens |
| Output | $25 per 1M tokens |

That does not mean cost is identical in practice.

Your real bill is shaped by:

  • the new tokenizer
  • higher reasoning spend at higher effort levels
  • more expensive full-resolution images
  • whether you run interactive multi-turn sessions or one-shot delegated tasks

The optimistic reading comes from launch partners like Hex and Replit: better quality at lower effort can offset a chunk of the raw token increase. The correct move is not to assume. Measure on real workloads.

Should You Upgrade to Claude Opus 4.7?

Yes, if your pain points are:

  • agents that stop halfway through
  • models that sound plausible but guess too much
  • hard code review and debugging work
  • dense visual or document inputs
  • multi-step workflows with tools

Maybe not immediately, or not as your default, if your workload is mostly:

  • small edit cycles
  • cheap bulk automation
  • low-risk content generation
  • quick Q&A where Sonnet already lands

For most serious Claude Code users, the right strategy is simple: keep Sonnet as the fast everyday option, and use Opus 4.7 as the flagship for intelligence-sensitive work.

Frequently Asked Questions

Is Claude Opus 4.7 worth it over Opus 4.6?

For hard engineering, review, document-heavy, and long-running agentic work, yes. The most important gains are not the raw benchmark numbers. They are the better calibration, stronger self-verification, lower tool-error rate, and better behavior on ambiguous tasks.

What is the best Claude Code effort setting for Opus 4.7?

xhigh is the default in Claude Code and the right starting point for most serious coding sessions. Use high when you need better cost control across many sessions. Use max deliberately for the hardest work, not as a blanket default.

Is Claude Opus 4.7 better for cybersecurity?

It is better for legitimate defensive security workflows, code review, vulnerability triage, and cyber-adjacent analysis. Anthropic also shipped explicit cyber safeguards with the model, which is part of why the release matters.

Does Opus 4.7 cost more than Opus 4.6?

List price is unchanged, but practical cost can rise because of the new tokenizer, higher reasoning spend at higher effort, and more expensive image inputs. Measure against your actual workloads.

When should I still use Sonnet instead of Opus 4.7?

Use Sonnet for fast daily coding, smaller edits, cheaper bulk work, and sessions where speed matters more than frontier-level reasoning.

Sources

  • Introducing Claude Opus 4.7
  • Best practices for using Claude Opus 4.7 with Claude Code
  • Using Claude Code: session management and 1M context
  • Project Glasswing
  • Claude Code best practices docs

Related Pages

  • Claude Opus 4.7 use cases
  • Claude Opus 4.7 best practices
  • Claude Opus 4.6
  • Claude Code Models
  • Claude Code Pricing and Token Usage

