By speedy_devv and koen_salo

Claude Opus 4.7 vs 4.6

Anthropic kept the $5/$25 rate card on Opus 4.7, but a new tokenizer pushes the same English prompt up to 1.46x more billed tokens, raising real bills 12 to 27 percent after caching.

Published May 15, 2026 · 9 min read

The Opus 4.7 sticker price reads identical to Opus 4.6. Five dollars per million input. Twenty-five per million output. What changed sits one layer below the price page: the tokenizer counts the same English text differently, so the same job bills more on 4.7.

Simon Willison ran one system prompt through Anthropic's own count_tokens API on both models. Opus 4.6 returned 5,039 tokens. Opus 4.7 returned 7,335. That is a 1.46x ratio on plain English text, and it sits 11 points above Anthropic's stated 1.0 to 1.35x upper bound (per Simon Willison's token counter writeup).
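The ratio is easy to reproduce from Willison's published counts; a minimal check, using only the two numbers reported above:

```python
# Token counts Simon Willison reported from Anthropic's count_tokens API
# for the same English system prompt on each model.
opus_4_6_tokens = 5_039
opus_4_7_tokens = 7_335

ratio = opus_4_7_tokens / opus_4_6_tokens
print(f"{ratio:.2f}x")  # 1.46x, above the stated 1.35x upper bound
```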

OpenRouter then ran the same comparison across one million real production requests. Their finding: 12 to 27 percent more billed tokens for typical 2K to 25K prompts after prompt caching, and 32 to 45 percent more without caching (per OpenRouter's tokenizer analysis).

One day before this post went live, Claude Code v2.1.142 flipped the fast mode default to 4.7. So your next session starts paying the new rate by default unless you override it.

The Side-by-Side

Pull up the two model spec sheets next to each other and the rate card looks frozen. The work below the price line is where 4.7 pays for itself, and where it costs you more:

| Dimension | Opus 4.6 | Opus 4.7 | Source |
|---|---|---|---|
| API model ID | claude-opus-4-6 | claude-opus-4-7 | Anthropic docs |
| Input price (per 1M) | $5 | $5 | Anthropic pricing |
| Output price (per 1M) | $25 | $25 | Anthropic pricing |
| Context window | 1M tokens | 1M tokens | Anthropic docs |
| Max output | 128K tokens | 128K tokens | Anthropic docs |
| Tokenizer (text) | 1.0x baseline | 1.0 to 1.46x | Willison, OpenRouter |
| Effective bill change (2K to 25K, cached) | baseline | +21 to +27 percent | OpenRouter |
| Effective bill change (under 2K) | baseline | -1.6 percent | OpenRouter |
| SWE-bench Verified | 80.8% | 87.6% (+6.8) | Vellum |
| SWE-bench Pro | 53.4% | 64.3% (+10.9) | Vellum |
| Terminal-Bench 2.0 | 65.4% | 69.4% (+4.0) | Vellum |
| GPQA Diamond | 91.3% | 94.2% (+2.9) | Vellum |
| OSWorld-Verified | 72.7% | 78.0% (+5.3) | Vellum |
| CharXiv (no tools) | 69.1% | 82.1% (+13.0) | Vellum |
| BrowseComp (agentic search) | 83.7% | 79.3% (-4.4) | Vellum |
| Output speed | similar | 68.2 tokens/sec | Artificial Analysis |
| TTFT (standard) | similar | 16.26 sec | Artificial Analysis |
| Fast mode | 2.5x speed at 6x price | 2.5x speed at 6x price | Anthropic announcement |
| Vision max edge | smaller | 2,576 px (~3.75 MP) | Anthropic announcement |
| Effort levels | high, max | adds xhigh | Anthropic announcement |
| Claude Code fast default (2026-05-14) | n/a | 4.7 | Claude Code CHANGELOG |

Three rows do the heavy lifting here. SWE-bench Pro climbed nearly eleven points, which is the largest jump on the table for hard agent coding work. CharXiv (no tools) went up thirteen points, so chart and figure parsing got a real upgrade. BrowseComp regressed 4.4 points, so anything that looks like agentic web research got a little worse, not better.

What a Real Call Costs Now

Take a representative Claude Code build call. The agent reads ~25K tokens of context (your repo plus a spec) and writes ~4K tokens of code:

On Opus 4.6:

input:  25,000 tokens × $5/1M  = $0.125
output:  4,000 tokens × $25/1M = $0.100
total                          = $0.225

On Opus 4.7, the same English source gets recounted by the new tokenizer. OpenRouter measured native inflation at 1.34x for 25K to 50K prompts, and 13 to 30 percent longer completions at long context. Apply both:

input:  25,000 × 1.34 = 33,500 tokens × $5/1M  = $0.1675
output:  4,000 × 1.13 =  4,520 tokens × $25/1M = $0.1130
total                                          = $0.2805

A 24.7 percent effective increase for the exact same job. No rate-card change. No menu warning. The tokenizer ate your margin.
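The arithmetic above generalizes to a small helper. A sketch, where the 1.34x input and 1.13x output factors are OpenRouter's measured mid-range inflation, not official multipliers:

```python
PRICE_IN = 5 / 1_000_000    # $5 per 1M input tokens
PRICE_OUT = 25 / 1_000_000  # $25 per 1M output tokens

def call_cost(input_tokens, output_tokens,
              input_inflation=1.0, output_inflation=1.0):
    """Dollar cost of one call. Inflation factors model the tokenizer
    recount on Opus 4.7 (1.0 = Opus 4.6 baseline)."""
    billed_in = input_tokens * input_inflation
    billed_out = output_tokens * output_inflation
    return billed_in * PRICE_IN + billed_out * PRICE_OUT

cost_46 = call_cost(25_000, 4_000)              # 0.225
cost_47 = call_cost(25_000, 4_000, 1.34, 1.13)  # 0.2805
print(f"+{cost_47 / cost_46 - 1:.1%}")          # +24.7%
print(f"200 calls/day: ${cost_46 * 200:.2f} -> ${cost_47 * 200:.2f}")
```

The last line reproduces the daily figures below: $45.00 on 4.6 versus $56.10 on 4.7 for 200 calls.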

Run that math on a working day. A solo founder pushing 200 agent calls daily moves from $45 to $56. Same code shipped. Same prompts. About $330 a month in extra spend you did not opt into.

Two cases where the math flips. For long-context calls at 128K and up, prompt caching absorbs about 93 percent of the inflation, so the effective gap shrinks to roughly 15 percent. For short calls under 2K tokens, completions get 62 percent shorter on 4.7, so the bill is actually 1.6 percent cheaper. Mid-range prompts are where the cost lands hardest.

The Default Flip Nobody Warned You About

Claude Code v2.1.142 shipped on 2026-05-14, one day before this post. The changelog entry buries the change two bullets deep: fast mode now runs on Opus 4.7 by default. Every /fast you typed yesterday on 4.6 fires on 4.7 today.

That is a quiet rate change, not a rate-card one. Six times the per-token price of standard mode times the new tokenizer ratio. Fast mode on long English prompts now bills closer to seven and a half times the cost of standard 4.6, depending on prompt size.
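A rough back-of-envelope for that multiple, assuming a mid-range ~1.25x tokenizer inflation on a long English prompt (an assumed value inside the measured 1.0 to 1.46x band, not a published figure):

```python
FAST_MODE_PRICE_MULTIPLE = 6  # fast mode vs standard, per Anthropic's rate card
TOKENIZER_INFLATION = 1.25    # assumed mid-range inflation for a long English prompt

# Effective cost of fast-mode 4.7 relative to standard-mode 4.6
print(FAST_MODE_PRICE_MULTIPLE * TOKENIZER_INFLATION)  # 7.5
```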

You can opt back. Anthropic shipped a new env var alongside the default flip:

# Pin Claude Code fast mode to Opus 4.6 (not the new 4.7 default).
# Active as of Claude Code v2.1.142 (2026-05-14).
export CLAUDE_CODE_OPUS_4_6_FAST_MODE_OVERRIDE=1

# Verify:
claude --version   # expect 2.1.142 or later
claude /fast       # confirm fast mode is on
# Fast mode now runs on Opus 4.6 regardless of any other env var.

A few things to know about the flag. It takes precedence over CLAUDE_CODE_ENABLE_OPUS_4_7_FAST_MODE if both are set. Drop it in ~/.zshrc for global stickiness, or your project .env for per-repo stickiness. And confirm it still ships in your version with a quick search:

claude --help | grep OPUS_4_6

Anthropic typically keeps these flags around for six to twelve months after a default flip.

When 4.7 Actually Pays for Itself

The benchmark gains are real and they cluster in specific places. SWE-bench Pro at +10.9 points means hard, multi-file agent coding work lands more often on the first try. Terminal-Bench 2.0 at +4.0 helps long agent runs that touch many shell commands. CharXiv at +13.0 means chart and figure understanding got a real lift, useful for vision pipelines that read research PDFs or dashboards.

Stay on 4.7 when the call profile matches at least one of these:

  • Hard SWE work where SWE-bench Pro improvement (+10.9) compounds across a long task.
  • High-resolution images, charts, or PDFs where CharXiv (+13.0) shows up in your output.
  • Prompts at 128K and up where caching captures ~93 percent of the tokenizer inflation.
  • Plan-then-execute agents where one stronger plan saves five weaker build cycles.

When to Pin Back to 4.6

Three workloads regress or break even on 4.7 once you account for cost. Override fast mode back to 4.6 when:

  • Prompts sit between 2K and 25K and you do not use prompt caching. This is the worst tokenizer cost zone.
  • The job involves agentic web research. BrowseComp dropped from 83.7 to 79.3 percent, a 4.4 point regression on 4.7.
  • You run a Claude Max plan and your session caps now burn down 1.3 to 3 times faster than they did last month.

Evaluator and judgment work does not need the SWE-bench Pro lift. Linters and doc writers do not gain from CharXiv. Those agents still produce identical output on 4.6 at a real discount.

Per-Agent Model Selection

Most teams pick one model per project. The cost shift on 4.7 makes per-agent selection the better default. Different roles in an agent fleet get different value from the same intelligence step.

A planner reads a spec, decides the file layout, picks the build order. SWE-bench Pro +10.9 lands directly here. Keep planners on 4.7.

A backend agent or a designer agent writes code. The same SWE-bench Pro lift applies. Keep them on 4.7.

An evaluator agent reads code and finds gaps. Judgment does not need the new ceiling. Pin to 4.6 with the override flag and pocket the difference.

A linter, a formatter, a doc writer. Mechanical work. Pin to 4.6.

A test agent runs the test plan and reports back. Mechanical too. Pin to 4.6.

Five agents pinned to 4.6, three on 4.7, on a normal feature build. The build still gets the planner and builder lifts where they matter, and the cheap loops stay cheap.

Build This Now ships exactly this layout out of the box. The framework wires 18 specialist agents (32 with on-demand) into a .claude/ directory, each with a defined role and a configured model. Override the planner and builders to 4.7. Pin the evaluators, linters, doc writers, and testers to 4.6. The model selection lives next to the agent definition, not the project root. One framework. Many cost profiles.
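In Claude Code terms, per-agent pinning can live in the subagent definition itself. A sketch, assuming a Claude Code version that honors a `model` field in `.claude/agents/*.md` frontmatter (the file name, description, and prompt here are hypothetical; the model IDs come from the spec table above):

```markdown
---
name: evaluator
description: Reviews generated code against the spec and reports gaps.
model: claude-opus-4-6
---
Review the diff against the spec. List concrete gaps; do not rewrite code.
```

A planner file would look the same with `model: claude-opus-4-7`, so the expensive count is bought only where the benchmark lift actually lands.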

The Verdict

Same rate card, different bill. Opus 4.7 prints better numbers on the benchmarks that matter for hard agent coding, and it costs more for the same English prompt almost everywhere except the very short tail. Pin 4.6 for evaluators, linters, and doc writers. Keep 4.7 for planners and SWE-heavy builders. Set the override flag once, forget about it, and let each agent role buy what it actually needs.
