Build This Now
speedy_devvkoen_salo

Claude Opus 4.5 in Claude Code

Opus 4.5 shipped at $5/$25 per million input/output tokens and uses up to 76% fewer output tokens than Sonnet 4.5. Set it as your default in two commands.

Your Claude Code bill is mostly output tokens. Opus 4.5 is priced 67% below Opus 4.1 ($5/$25 per million input/output tokens, down from $15/$75), uses fewer tokens per task, and writes cleaner code while doing it. Here is how to turn it on and what changes once you do.

Quick Win: set Opus 4.5 as the default model and open a session:

claude config set model claude-opus-4-5-20251101
claude

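You are now running Opus 4.5, the most token-efficient Claude coding model at the time of its release. If your Claude Code version rejects claude config, start the session with claude --model claude-opus-4-5-20251101 or switch with /model inside a session.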

Token Efficiency

This is not marketing copy. GitHub reports Opus 4.5 "surpasses internal coding benchmarks while cutting token usage in half." Replit says it "beats Sonnet 4.5 and competition on our internal benchmarks, using fewer tokens to solve the same problems."

Here is what that looks like day to day:

Metric                         Improvement
Output tokens vs Sonnet 4.5    76% reduction
Tool calls per task            50% fewer
Long-running tasks             Up to 65% reduction
With Tool Search enabled       85% reduction

Fewer tokens means faster answers, lower cost, and more room before you hit the context limit.
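
To make the cost claim concrete, here is a rough worked example. The Opus 4.5 prices come from this article; the Sonnet 4.5 prices ($3/$15 per million tokens) and the task size are assumptions chosen for illustration, so treat the result as a sketch rather than a forecast of your bill.

```typescript
// Prices in USD per million tokens.
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

const OPUS_4_5: Pricing = { inputPerMTok: 5, outputPerMTok: 25 }; // from the article
const SONNET_4_5: Pricing = { inputPerMTok: 3, outputPerMTok: 15 }; // assumed list price

function costUsd(p: Pricing, inputTokens: number, outputTokens: number): number {
  return (inputTokens * p.inputPerMTok + outputTokens * p.outputPerMTok) / 1_000_000;
}

// Hypothetical workload: 200K input tokens and 1M output tokens on Sonnet 4.5.
const inputTokens = 200_000;
const sonnetOutputTokens = 1_000_000;
// Apply the 76% output-token reduction from the table above.
const opusOutputTokens = Math.round(sonnetOutputTokens * (1 - 0.76));

const sonnetCost = costUsd(SONNET_4_5, inputTokens, sonnetOutputTokens);
const opusCost = costUsd(OPUS_4_5, inputTokens, opusOutputTokens);

console.log(sonnetCost.toFixed(2), opusCost.toFixed(2)); // prints "15.60 7.00"
```

Under these assumptions the Opus run costs less than half as much, even though its per-token price is higher. The saving depends entirely on how close your workload gets to the 76% figure.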

Built for Sub-Agent Delegation

Opus 4.5 writes better prompts for sub-agents than earlier Claude models. Anthropic trained it for delegation on purpose.

This pays off when you run parallel agents for testing, code generation, or task distribution. The lead agent hands work out more cleanly:

# Example: Running parallel browser tests
claude "Run 4 parallel test agents against staging -
test login flow, checkout, search, and user settings"

The model handles the coordination. Each sub-agent gets clear, specific instructions. Results come back to you without the chaos of earlier models.
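
Claude Code does this coordination internally, so you never write it yourself. Purely to show the shape of the pattern (fan out one agent per task, then collect every result), here is a minimal sketch. AgentRunner, delegate, and stubRunner are hypothetical names invented for this example and are not part of Claude Code or the Anthropic SDK.

```typescript
// Illustration only: AgentRunner stands in for whatever launches one sub-agent.
type AgentResult = { task: string; ok: boolean; summary: string };
type AgentRunner = (task: string, instructions: string) => Promise<AgentResult>;

// Fan out one agent per task, then gather all outcomes in the original order.
async function delegate(tasks: string[], runAgent: AgentRunner): Promise<AgentResult[]> {
  return Promise.all(
    tasks.map(async (task) => {
      const instructions = `Test the ${task} on staging. Report only failures and how to reproduce them.`;
      try {
        return await runAgent(task, instructions);
      } catch (err) {
        // A crashed agent becomes a failed result instead of aborting the batch.
        return { task, ok: false, summary: String(err) };
      }
    }),
  );
}

// Stub runner so the sketch runs without any real agents.
const stubRunner: AgentRunner = async (task) => {
  if (task === "checkout") throw new Error("agent crashed");
  return { task, ok: true, summary: "no failures found" };
};

delegate(["login flow", "checkout", "search", "user settings"], stubRunner).then((results) => {
  for (const r of results) console.log(r.task, r.ok ? "passed" : "failed", r.summary);
});
```

Catching errors per task is the design choice that matters here: one failing agent should not discard the results of the other three.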

The Effort Parameter

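Opus 4.5 adds an API control for trading speed against thoroughness, set per call without switching models. The snippet below gets a similar effect by capping the extended thinking budget; the effort setting itself is a separate request option, so check the current API reference for its exact form: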

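import Anthropic from "@anthropic-ai/sdk";

// Reads ANTHROPIC_API_KEY from the environment.
const anthropic = new Anthropic();
const prompt = "Refactor the auth module to use async/await."; // example prompt

const response = await anthropic.messages.create({
  model: "claude-opus-4-5-20251101",
  // max_tokens must be larger than budget_tokens, since thinking counts toward it.
  max_tokens: 16000,
  thinking: {
    type: "enabled",
    budget_tokens: 10000, // Low: 1024, Medium: 5000, High: 10000+
  },
  messages: [{ role: "user", content: prompt }],
});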

Low effort for quick questions. High effort for big refactors. You decide the thinking budget per call.
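
One way to keep the per-call choice tidy is a small helper that maps a named level to request fields. This is a sketch, not an SDK feature: the level names, the thinkingParams helper, and the answerTokens allowance are assumptions, and the budgets simply reuse the numbers from the comment in the snippet above.

```typescript
type Effort = "low" | "medium" | "high";

// Budgets follow the article's suggestions; the API does not define these levels.
const THINKING_BUDGET: Record<Effort, number> = {
  low: 1024, // the minimum budget the API accepts
  medium: 5000,
  high: 10000,
};

interface ThinkingParams {
  max_tokens: number;
  thinking: { type: "enabled"; budget_tokens: number };
}

// answerTokens is a hypothetical allowance for the visible reply.
function thinkingParams(effort: Effort, answerTokens = 4096): ThinkingParams {
  const budget = THINKING_BUDGET[effort];
  return {
    // budget_tokens must stay below max_tokens, so add room for the answer on top.
    max_tokens: budget + answerTokens,
    thinking: { type: "enabled", budget_tokens: budget },
  };
}

console.log(thinkingParams("low").max_tokens, thinkingParams("high").max_tokens); // prints "5120 14096"
```

The returned object can be spread into a messages.create call next to model and messages, which keeps the budget and the token ceiling from drifting out of sync.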

Auto-Compaction for Long Sessions

Hit 95% of your 200K context window? Claude compacts (summarizes) earlier messages automatically while keeping your full chat history. Alex Albert calls it "effectively infinite context."

Manual control is there when you want it:

/compact

Best practice: compact at logical milestones rather than waiting for the automatic trigger. You keep more detail in the parts that matter.
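
For a sense of scale, the arithmetic behind the trigger is below. The window size and the 95% figure come from this article; the 60% milestone threshold and the shouldCompactAtMilestone helper are hypothetical, a rule of thumb for illustration rather than a Claude Code setting.

```typescript
const CONTEXT_WINDOW = 200_000; // tokens, from the article
const AUTO_COMPACT_AT = 0.95; // fraction of the window, from the article

// Tokens in use when the automatic trigger fires.
const autoCompactTokens = Math.round(CONTEXT_WINDOW * AUTO_COMPACT_AT);

// Hypothetical rule of thumb: after finishing a milestone, compact if the
// session already uses more than milestoneThreshold of the window.
function shouldCompactAtMilestone(usedTokens: number, milestoneThreshold = 0.6): boolean {
  return usedTokens / CONTEXT_WINDOW >= milestoneThreshold;
}

console.log(autoCompactTokens); // prints 190000
console.log(shouldCompactAtMilestone(130_000)); // prints true
console.log(shouldCompactAtMilestone(50_000)); // prints false
```

In other words, the automatic trigger leaves about 10K tokens of headroom, which is why compacting at a milestone gives the summary more to work with.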

When Things Go Wrong

Error: "model not found". Update your Claude Code install:

npm update -g @anthropic-ai/claude-code

Error: "rate limit exceeded". Opus 4.5 has separate limits from Sonnet. Check your plan tier or add a short delay between requests.

Error: "context too long". Run /compact by hand or split the task into smaller chunks. See memory optimization for deeper patterns.
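
If you call the API from your own scripts, the "add a short delay between requests" advice for rate limits can be sketched as a retry loop with exponential backoff. Everything named here (withBackoff, HttpLikeError, the delay values) is an assumption for illustration; the only fixed point is that HTTP 429 is the standard status code for rate limiting.

```typescript
// Hypothetical error shape: many HTTP clients expose the status code like this.
interface HttpLikeError {
  status?: number;
}

type Sleep = (ms: number) => Promise<void>;
const realSleep: Sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry a request when it fails with HTTP 429, doubling the delay each time.
async function withBackoff<T>(
  request: () => Promise<T>,
  maxAttempts = 4,
  baseDelayMs = 500,
  sleep: Sleep = realSleep,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await request();
    } catch (err) {
      const rateLimited = (err as HttpLikeError | null)?.status === 429;
      // Give up on other errors, or once the attempt budget is spent.
      if (!rateLimited || attempt >= maxAttempts) throw err;
      await sleep(baseDelayMs * 2 ** (attempt - 1)); // 500 ms, 1000 ms, 2000 ms, ...
    }
  }
}
```

Usage would look like withBackoff(() => anthropic.messages.create(params)). The sleep parameter is injectable so the helper can be tested without waiting on real timers.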

What This Means for Your Workflow

Opus 4.5 is not just a version bump. It is a different way to work:

  • Delegate more. Hand off complex coordination you would not trust to earlier models.
  • Run longer sessions. Token efficiency means more work before compaction kicks in.
  • Pay less. A 67% price drop from the previous Opus at the same or better quality.

The model scores 80.9% on SWE-bench Verified (a new high at release) and leads on 7 of 8 programming languages on SWE-bench Multilingual. In practice, that means fewer retries before your code works.

Related Pages

  • Model selection for when to use Opus versus Sonnet
  • Sub-agent design patterns for getting the most out of delegation
  • Efficiency patterns for production workflows

Update: Claude Opus 4.6 is now available with 1M token context and native agent teams. See the complete model timeline for all Claude models.
