Build This Now
Build This Now
クロード・コードとは何か?Claude Code のインストールClaude Code ネイティブインストーラーClaude Code で最初のプロジェクトを作る
DESIGN.md: AIのUI一貫性問題を解決するClaude Buddy/powerupClaude Codeソースマップ流出事件Claude CodeのForkサブエージェント完全ガイドKimi K2.6: 何が変わったのかDid Anthropic Call for an AI Pause? What It Actually SaidクロードコードのオートメモリークロードコードのオートメモリークロードコードのオートメモリークロードコードのオートメモリーClaude Code Costs After June 15: What Actually ChangedCompound Engineering: The AI Loop Where Every Task Makes the Next EasierGitHub Spec Kit: Spec-Driven Development That Kills Vibe CodingSWE-bench Is Lying: How DeepSWE Caught AI Agents CheatingVibe Coding's 90-Day Reckoning: The Technical Debt Nobody Warns You About
speedy_devvkoen_salo
Blog/Handbook/Core/Claude Code Costs After June 15: What Actually Changed

Claude Code Costs After June 15: What Actually Changed

The June 15, 2026 Claude Code billing split was paused before it shipped. Here is what really changed, what it costs now, and the cost-cutting levers that are still live.

設定をやめて、構築を始めよう。

AIオーケストレーション付きSaaSビルダーテンプレート。

Published Jun 16, 2026Updated Jun 16, 202610 min readHandbook hubCore index

The Claude Code billing split planned for June 15, 2026 was paused before it shipped. Anthropic told customers "Nothing changes for now," and programmatic usage — the Claude Agent SDK, claude -p, GitHub Actions, and third-party apps — still draws from your normal subscription limits, not a separate per-user credit pool. The only thing that actually landed on June 15 was the retirement of two old model IDs (claude-opus-4-20250514 and claude-sonnet-4-20250514), which now return errors. So if you use Claude Code on a Pro or Max plan, your cost did not change. The real cost levers — model choice, prompt caching, /compact, and batching — are unchanged and fully in effect.

This post covers what changed and what didn't, the two-pools model people expected, a Max-plan-vs-API break-even table, and a numbered checklist for cutting token costs.

Table of Contents

  1. What Actually Changed on June 15
  2. The Two Pools: Subscription vs Agent SDK Credit
  3. What Claude Code Costs Right Now
  4. Max Plan vs API: The Break-Even
  5. The Cost-Cutting Checklist
  6. The One Thing You Must Fix: Retired Models
  7. Frequently Asked Questions
  8. Wrapping Up

設定をやめて、構築を始めよう。

AIオーケストレーション付きSaaSビルダーテンプレート。


What Actually Changed on June 15

Three changes were bundled in the conversation around June 15, 2026. Only two landed, and one of those happened eight months earlier. Separating them is the whole point of this post.

ChangeStatus as of June 16, 2026Source
Billing split — programmatic usage moves to a separate per-user monthly creditPAUSED. "Nothing changes for now." Agent SDK, claude -p, GitHub Actions, and third-party apps still draw from subscription limits.Anthropic Help Center
claude-opus-4-20250514 retiredDONE. Retirement date June 15, 2026 (deprecated April 14). Calls now error. Replacement: claude-opus-4-8.model deprecations
claude-sonnet-4-20250514 retiredDONE. Retirement date June 15, 2026. Calls now error. Replacement: claude-sonnet-4-6.model deprecations
SDK package rename (claude-code-sdk → claude-agent-sdk)ALREADY DONE — September 29, 2025, not June 15. Unrelated to the billing date.Agent SDK migration guide

The billing split was the change everyone braced for. Anthropic told the press by email that it was "working to better align the plan with actual usage patterns" and that, for now, "Nothing changes." The pause arrived right before the implementation date, against the backdrop of a price war with OpenAI.

So the bundling that made June 15 feel like a hard deadline was misleading. The model retirements are real and require action. The SDK rename predates this entirely. The billing split is not in effect. For the full mechanics of what the split would have done — the migration steps, the opt-in flow — see the canonical billing-change post, which we have annotated with the pause.

The Two Pools: Subscription vs Agent SDK Credit

Here is the model that was proposed, and what is actually true today. Read this table as "what people thought June 15 would bring" versus reality.

Interactive subscription (live now)Agent SDK credit pool (PROPOSED — paused)
What countsTerminal and IDE sessions, plus — for now — Agent SDK, claude -p, GitHub Actions, third-party appsWas going to be: Agent SDK, claude -p, GitHub Actions, third-party apps only
How billedFlat subscription ($20 Pro / $100 Max 5x / $200 Max 20x per month), usage capped by plan limitsWas going to be: per-user monthly credit, drawn down by API-equivalent token spend
Pooling / sharingN/A — your plan, your limitsWas going to be: per-user, no pooling, no sharing across seats
RolloverN/AWas going to be: none — use it or lose it each month
Opt-inN/AWas going to be: one-time opt-in
In effect today?YesNo

The proposed per-user credit amounts were documented: $20 for Pro, $100 for Max 5x, $200 for Max 20x, plus per-seat figures for Team and Enterprise. None of these are charges you pay today. Treat them only as "what may return," because Anthropic said the split is paused "for now," not cancelled.

The practical takeaway: if you run an overnight Agent SDK job, a claude -p script in CI, or a GitHub Action that calls Claude, all of that still counts against your existing subscription limits exactly as it did before June 15. Nothing got carved out into a separate bucket.

What Claude Code Costs Right Now

Two ways to pay for Claude Code, unchanged by June 15:

1. Subscription (terminal and IDE). A flat monthly fee. Pro is $20/month, Max is $100 or $200/month. Usage is bounded by plan limits, not metered per token. This is the cheapest path for steady interactive work because you are not billed by the token at all.

2. API (pay-as-you-go). Per-token pricing on the model you call. The current rates, verified against the pricing page:

ModelInput ($/MTok)Output ($/MTok)Context
Claude Opus 4.8$5$251M
Claude Sonnet 4.6$3$151M
Claude Haiku 4.5$1$5200K

Both Opus 4.8 and Sonnet 4.6 serve a 1M-token context window at standard pricing with no long-context premium. Opus 4.8 caps output at 128K tokens; Sonnet 4.6 at 64K. For how the million-token window behaves in real sessions, see Claude Code's 1M context in practice.

On top of the base rates, two discounts move the needle hard, both from the pricing page:

  • Prompt caching. Cache reads cost 0.1x the base input price. A 5-minute cache write costs 1.25x; a 1-hour cache write costs 2.0x. Break-even is one read for the 5-minute cache, two reads for the 1-hour cache.
  • Batch API. A flat 50% off both input and output tokens for non-urgent work.

Max Plan vs API: The Break-Even

When is a flat Max subscription cheaper than paying per token on the API? It comes down to how much output you generate per month. The math below uses the verified rates above and assumes a typical agentic mix that is output-heavy (output tokens dominate cost on Claude models because output is 5x the input rate).

Monthly usage profileAPI cost estimate (Sonnet 4.6)API cost estimate (Opus 4.8)Cheaper option
Light: ~3M output tokens~$45 + input~$75 + inputAPI on Sonnet; Max 5x on Opus
Moderate: ~7M output tokens~$105 + input~$175 + inputMax 5x ($100) on both
Heavy: ~15M output tokens~$225 + input~$375 + inputMax 20x ($200) wins decisively
Bursty / unattended agentsVariable, hard to capVariable, hard to capAPI + Batch (50% off) for non-urgent runs

How to read this:

  • If your output is under ~6-7M tokens/month on Sonnet, the API can come out cheaper than even Max 5x — you only pay for what you use. The crossover point on Opus is lower because Opus output is $25/MTok.
  • If you run Claude Code interactively most of the day, a Max subscription almost always wins. You hit a flat ceiling instead of an open meter, and heavy interactive use blows past the API break-even fast.
  • If you run unattended agents on a schedule, the API plus the Batch API's 50% discount is usually the right tool for the non-urgent portion, because you can bound spend per run instead of consuming subscription limits unpredictably.

The numbers are estimates — your input-to-output ratio, caching hit rate, and model mix all shift the line. The point is the shape: flat subscription wins for steady interactive work; metered API wins for light or burst-controllable work. For a deeper treatment of the trade-off, see Claude Code Max plan vs API.

The Cost-Cutting Checklist

These levers are all live today. Work through them in order — the early ones have the biggest payoff for the least effort.

  1. Pick the cheapest model that does the job. Sonnet 4.6 is $3/$15 versus Opus 4.8 at $5/$25 — 40% cheaper on input, 40% cheaper on output. Reserve Opus for the hard reasoning steps and let Sonnet handle scanning, extraction, and routine edits. In Claude Code, run cheaper models for fan-out work and keep Opus for synthesis.

  2. Turn on prompt caching and keep your prefix stable. Cache reads are 0.1x base input. The catch is the prefix-match rule: any byte change anywhere in the cached prefix invalidates everything after it. Freeze your system prompt — no Date.now(), no per-request IDs, no unsorted JSON early in the prompt. Put volatile content last.

  3. Avoid the 5-minute cache cliff. The default cache TTL expires after 5 minutes of inactivity. Any call that fires more than 300 seconds after the previous one misses the cache entirely and pays a full re-write. For bursty traffic with gaps, use the 1-hour TTL (ttl: '1h') at 2.0x the write cost — it pays off after two reads and keeps long, gappy workflows cheap.

  4. Run /compact and keep context lean. Long sessions accumulate stale tool output and completed reasoning. Compaction summarizes earlier context so you stop re-sending tokens you no longer need. A lean context window is a cheaper context window on every subsequent turn.

  5. Batch non-urgent work. The Batch API is 50% off both input and output. Anything that does not need a real-time answer — overnight test generation, a documentation sweep, a backlog of classification — belongs in a batch.

  6. Set adaptive thinking and tune effort, not a token budget. On Opus 4.8 and Sonnet 4.6, use adaptive thinking plus output_config.effort (low through max). Lower effort means fewer, more-consolidated tool calls and less preamble. Note that budget_tokens, temperature, top_p, and top_k are removed on Opus 4.8 and return a 400 — do not reach for them.

  7. Instrument your token spend now. Even though the credit split is paused, it may return. Track cache_read_input_tokens, cache_creation_input_tokens, and output tokens per session so you already know your real usage profile if the per-user pool ever lands. This is also how you verify caching is working — if cache_read_input_tokens is zero across repeated requests, a silent invalidator is in your prefix.

For more on running Claude Code without a live operator — where unattended cost control matters most — see Claude Code headless mode. For a deeper, single-topic treatment of token reduction, see how to cut Claude Code token costs.

The One Thing You Must Fix: Retired Models

The only June 15 change that requires action is the model retirement. claude-opus-4-20250514 and claude-sonnet-4-20250514 were both deprecated April 14, 2026 and retired June 15. Requests to either now fail on Anthropic-operated platforms (Claude API, Claude Platform on AWS, Microsoft Foundry). Partner platforms — Amazon Bedrock and Google Vertex AI — set their own schedules, so the dated Sonnet 4 ID may still resolve there for a while.

The drop-in replacements, per Anthropic's own deprecation page, are the bare aliases:

- model: "claude-opus-4-20250514"
+ model: "claude-opus-4-8"

- model: "claude-sonnet-4-20250514"
+ model: "claude-sonnet-4-6"

Use the alias claude-sonnet-4-6, not a dated variant. Two things to know when you swap:

  • Adaptive thinking only. thinking: {type: "enabled", budget_tokens: N} returns a 400 on the new models. Use thinking: {type: "adaptive"} and control depth with output_config.effort instead.
  • No sampling parameters. temperature, top_p, and top_k are removed on Opus 4.8 and return a 400. Steer with prompting.

If you were also relying on the inherited Claude Code system prompt in the Python Agent SDK, you now set it explicitly via systemPrompt={'type':'preset','preset':'claude_code'} in ClaudeAgentOptions — a consequence of the September 2025 SDK rename, not the June billing date.

Frequently Asked Questions

Did Claude Code's billing change on June 15, 2026?

No. The planned split that would have moved Agent SDK, claude -p, GitHub Actions, and third-party apps to a separate monthly credit pool was paused just before June 15. Anthropic's Help Center now states those surfaces still draw from your normal subscription limits, and the company told press "Nothing changes for now."

So what actually changed on June 15, 2026?

Two old model IDs retired: claude-opus-4-20250514 and claude-sonnet-4-20250514. API calls to either now return errors. Replace them with claude-opus-4-8 and claude-sonnet-4-6. That is the only thing developers had to act on.

Do I still need to rename claude-code-sdk to claude-agent-sdk?

If you have not already, yes — but that rename happened back on September 29, 2025 with Agent SDK v0.1.0, not on June 15, 2026. The package is claude-agent-sdk (Python) and @anthropic-ai/claude-agent-sdk (TypeScript), and ClaudeCodeOptions became ClaudeAgentOptions. It is unrelated to the paused billing change.

What does Claude Code actually cost now?

If you use the terminal or IDE on a Pro ($20/month) or Max ($100 or $200/month) subscription, the same flat subscription as before — the credit pool never went live. If you call the API directly, you pay per token: Sonnet 4.6 at $3/$15 per million in/out, Opus 4.8 at $5/$25.

How do I cut Claude Code and Agent SDK token costs?

Pick the cheapest model that does the job (Sonnet 4.6 is 40% cheaper than Opus 4.8 on both input and output), turn on prompt caching (cache reads are 0.1x base input), run /compact to keep context lean, batch non-urgent work for 50% off, and avoid the 5-minute cache TTL cliff by using the 1-hour TTL for bursty traffic.

Will the Claude Code billing split come back?

Possibly. Anthropic only said it is paused "for now" while it works to better align the plan with actual usage patterns, against the backdrop of an OpenAI price war. The proposed credit amounts ($20 Pro / $100 Max 5x / $200 Max 20x, per-user, no rollover) are documented, so it is worth instrumenting your token spend now in case the split returns.

Wrapping Up

The headline that mattered for June 15 was the one that didn't happen. The billing split is paused, programmatic usage still runs on your subscription limits, and the SDK rename was old news from September. The only thing on your to-do list is swapping the two retired model IDs to claude-opus-4-8 and claude-sonnet-4-6, and remembering that the new models drop budget_tokens and sampling parameters.

Everything that actually controls your bill — model choice, caching, compaction, batching — is unchanged and live. Tune those, instrument your token spend, and you are covered whether or not the credit pool ever ships.


Posted by @speedy_devv

Continue in Core

  • Claude Codeにおける100万トークンコンテキストウィンドウ
    AnthropicはClaude CodeのOpus 4.6とSonnet 4.6に対して100万トークンのコンテキストウィンドウを有効化した。ベータヘッダー不要、追加料金なし、定額料金、そして圧縮の削減。
  • AGENTS.md vs CLAUDE.md 解説
    2つのコンテキストファイル、1つのコードベース。AGENTS.mdとCLAUDE.mdの違い、それぞれが何をするか、重複なしに両方を使う方法を解説します。
  • Why a Hidden Line of Text Can Hijack Your AI Browser
    AI browsers read the whole web page — including text hidden from you. That's the door behind prompt injection, OWASP's #1 AI security risk in 2026. Here's how the attack works, in plain English.
  • AI Research for Builders: The Latest Breakthroughs, Explained Monthly
    A monthly digest of the latest AI research — agents, reasoning, efficiency, and models — with every claim traced to its source and translated into what it means if you build with AI.
  • 10 AI Research Breakthroughs That Matter for Builders (June 2026)
    The latest AI research, explained: AI disproved an 80-year-old math conjecture, agents got cheaper and more reliable, and inference costs dropped up to 100x. What each finding means if you build with AI.
  • Did Anthropic Call for an AI Pause? What It Actually Said
    Anthropic did not call to halt the AI boom. Here is what its June 2026 'recursive self-improvement' post actually said, why the 80%-of-its-own-code stat spooked it, and what it means if you build with Claude Code.

More from Handbook

  • エージェント型コマース:AI エージェントが支払えるアプリの作り方
    2026年のエージェント型コマースをわかりやすく解説するガイド。x402、ACP、Machine Payments Protocol が何をするのか、そして AI エージェントが購入できる有料 API を週末で出荷するための手順を紹介します。
  • Claude Code ベストプラクティス
    Claude Codeで成果を出すエンジニアを分ける5つの習慣: PRD、モジュラーなCLAUDE.mdのルール、カスタムスラッシュコマンド、/clearリセット、そしてシステム進化の思考法。
  • Claude Code オートモード
    2つ目の Sonnet モデルが、Claude Code のすべてのツール呼び出しを実行前に審査します。オートモードがブロックするもの・許可するもの、そして settings.json に追加される許可ルールについて解説します。
  • Channels、Routines、Teleport、Dispatch
    Anthropic が2026年3月と4月に出荷した4つの Claude Code 機能。これらは CLI を、スマホ・ウェブ・デスクトップをまたぐイベント駆動の調整レイヤーに変えます。

設定をやめて、構築を始めよう。

AIオーケストレーション付きSaaSビルダーテンプレート。

クロードコードのオートメモリー

オートメモリーは、Claude Codeがプロジェクトノートを実行し続けることを可能にします。ファイルの場所、書き込まれる内容、/memoryの切り替え方法、CLAUDE.mdを選ぶタイミング。

Compound Engineering: The AI Loop Where Every Task Makes the Next Easier

Compound engineering is an AI coding loop (plan, build, review, compound) where every fix becomes a permanent lesson. Here is the method and how to set it up in Claude Code.

On this page

Table of Contents
What Actually Changed on June 15
The Two Pools: Subscription vs Agent SDK Credit
What Claude Code Costs Right Now
Max Plan vs API: The Break-Even
The Cost-Cutting Checklist
The One Thing You Must Fix: Retired Models
Frequently Asked Questions
Did Claude Code's billing change on June 15, 2026?
So what actually changed on June 15, 2026?
Do I still need to rename claude-code-sdk to claude-agent-sdk?
What does Claude Code actually cost now?
How do I cut Claude Code and Agent SDK token costs?
Will the Claude Code billing split come back?
Wrapping Up

設定をやめて、構築を始めよう。

AIオーケストレーション付きSaaSビルダーテンプレート。