Build This Now
speedy_devvkoen_salo

Claude Fable 5 Pricing & Cost Control

Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens, exactly double Opus 4.8. Here is the cost math, the fallback pricing quirk, and the five levers that keep the bill down: effort, task budgets, caching, batch, and routing.

Published Jun 10, 2026 · 12 min read · Model Picker hub

Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens. That is exactly double Opus 4.8 ($5/$25), and about a third of the price of Mythos Preview (~$30/$150), the restricted model it descends from.

A representative 100K-in/20K-out task costs $2.00 on Fable 5 versus $1.00 on Opus 4.8. The sticker is 2x, but your actual bill depends on five levers you control: reasoning effort, task budgets, prompt caching, the batch API, and which traffic you route to Fable at all.

Fable 5 is the first publicly available Mythos-class model, a tier above Opus. The price reflects the tier, and it arrives at a moment when enterprises are increasingly critical of AI costs. TechCrunch noted the $10/$50 rate "alone might serve as a deterrent for widespread use." This post is the math and the playbook for keeping it under control.

The Rate Card

Every line of Fable 5's pricing is precisely double Opus 4.8.

Token type               Claude Fable 5    Claude Opus 4.8
Input                    $10 / 1M          $5 / 1M
Output                   $50 / 1M          $25 / 1M
Batch API input          $5 / 1M           $2.50 / 1M
Batch API output         $25 / 1M          $12.50 / 1M
5-min cache write        $12.50 / 1M       $6.25 / 1M
1-hour cache write       $20 / 1M          $10 / 1M
Cache hits & refreshes   $1 / 1M           $0.50 / 1M

One framing worth keeping in mind: Fable 5's standard $10/$50 is the same per-token rate as Opus 4.8's fast mode. You are paying Opus-fast-mode prices for a model that sits a full tier higher. Whether that is a deal depends entirely on the task, which is what the rest of this comes down to.

What a Task Actually Costs

Start with the base case so the sticker is concrete. Take a 100K-in/20K-out call.

On Fable 5:

input:  100,000 tokens × $10/1M = $1.00
output:  20,000 tokens × $50/1M = $1.00
total                           = $2.00

On Opus 4.8:

input:  100,000 tokens × $5/1M  = $0.50
output:  20,000 tokens × $25/1M = $0.50
total                           = $1.00

Exactly 2x at identical token usage. A smaller 50K-in/10K-out coding call is $1.00 on Fable versus $0.50 on Opus. The ratio never changes on the rate card. What changes is everything around it.
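If you want to run the same arithmetic on your own token counts, here is a minimal sketch. The rates are copied from the rate card above; the dictionary keys `fable-5` and `opus-4.8` are labels made up for this example, not API model IDs.

```python
# Per-million-token rates in USD, copied from the rate card in this post.
# The keys are illustrative labels, not real API model identifiers.
RATES = {
    "fable-5": {"input": 10.00, "output": 50.00},
    "opus-4.8": {"input": 5.00, "output": 25.00},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the uncached, non-batch cost of a single call in USD."""
    rate = RATES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


for model in RATES:
    print(f"{model}: ${call_cost(model, 100_000, 20_000):.2f}")
# fable-5: $2.00
# opus-4.8: $1.00
```

Swap in your own average input and output counts per call to see where your workload lands.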

The case that hurts is long context. A near-1M-token prompt at $10 per million input is roughly a $9 input bill before Fable writes a single useful token:

input:  900,000 tokens × $10/1M =  $9.00
output:   5,000 tokens × $50/1M =  $0.25
total                           =  $9.25 per call

Run that uncached across a workflow and the bill compounds fast, which brings us to the first lever.

Lever 1: Caching, the 10x Discount on Repeated Context

Cache hits on Fable 5 cost $1 per million tokens, versus $10 per million for fresh input. That is a 10x reduction on any context you reuse.

Take the $9.25 long-context call above and assume the 900K of context is a cache hit:

cached input:  900,000 tokens × $1/1M  =  $0.90
output:          5,000 tokens × $50/1M =  $0.25
total                                  =  $1.15 per call

From $9.25 to $1.15. If your agent reads the same large repo, spec, or document set across many calls, caching is the single biggest cost lever you have. The cache write costs a premium once ($12.50/1M for the 5-minute tier, $20/1M for the 1-hour tier), then every hit is cheap.
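The one-time write premium changes the picture for short workflows, so it is worth modeling the whole run rather than a single hit. The sketch below assumes the first call pays the 5-minute cache-write rate on the shared context instead of the fresh-input rate, and that every later call lands inside the cache lifetime and bills as a hit. Both are simplifying assumptions of this example, not a statement of how your traffic will behave.

```python
M = 1_000_000
FRESH_INPUT = 10.00     # $/1M, uncached input
CACHE_WRITE_5M = 12.50  # $/1M, 5-minute cache write
CACHE_HIT = 1.00        # $/1M, cache hits and refreshes
OUTPUT = 50.00          # $/1M, output


def workflow_cost(context_tokens: int, output_tokens: int, calls: int, cached: bool) -> float:
    """Total USD cost of `calls` requests that all share the same context."""
    output_cost = calls * output_tokens * OUTPUT / M
    if not cached:
        return calls * context_tokens * FRESH_INPUT / M + output_cost
    # Assumption: call 1 writes the cache, calls 2..n all hit it.
    write_cost = context_tokens * CACHE_WRITE_5M / M
    hit_cost = (calls - 1) * context_tokens * CACHE_HIT / M
    return write_cost + hit_cost + output_cost


for calls in (1, 2, 10):
    plain = workflow_cost(900_000, 5_000, calls, cached=False)
    cached = workflow_cost(900_000, 5_000, calls, cached=True)
    print(f"{calls:>2} calls: uncached ${plain:.2f}, cached ${cached:.2f}")
#  1 calls: uncached $9.25, cached $11.50
#  2 calls: uncached $18.50, cached $12.65
# 10 calls: uncached $92.50, cached $21.85
```

Under these assumptions caching loses on a single call and wins from the second call onward, which is why it pays off on repeated context and not on one-shot prompts.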

Lever 2: Effort, the Soft Dial

Effort is, in Anthropic's words, "the primary control for the trade-off between intelligence, latency, and cost on Claude Fable 5." It sets how many thinking tokens the model spends, and thinking tokens bill as output at $50 per million.

The levels are low, medium, high (the default), and xhigh. The guidance is to use high for most tasks, xhigh for the most capability-sensitive work, and medium or low for routine jobs. The key insight for cost: lower effort settings on Fable 5 "still perform well and often exceed xhigh performance on prior models."

Read that twice. Fable 5 at medium effort often beats Opus 4.8 at its top effort. Anthropic's FrontierCode result backs it up, where Fable leads frontier models even at medium effort. So the cost lever is frequently not "switch to a cheaper model." It is "turn Fable's effort down."

Picture a hard task that burns roughly 40K thinking tokens at xhigh and 12K at medium (illustrative; Anthropic does not publish exact per-effort counts). At $50 per million, that is about $2.00 of thinking versus $0.60 on the same job. Anthropic's own advice is direct: "Reduce effort if a task completes but takes longer than necessary." Note that adaptive thinking is always on, so you can shrink the depth but you cannot turn thinking off.
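The same comparison as a few lines of code, using the illustrative 40K and 12K thinking-token counts from the paragraph above. Those counts are this post's example figures, not published numbers, so treat the output as a shape, not a measurement.

```python
OUTPUT_RATE = 50.00  # $/1M output tokens; thinking tokens bill as output

# Illustrative thinking-token counts from the example above, not published figures.
THINKING_TOKENS = {"xhigh": 40_000, "medium": 12_000}


def thinking_cost(tokens: int) -> float:
    """USD cost of a given number of thinking tokens at the output rate."""
    return tokens * OUTPUT_RATE / 1_000_000


xhigh = thinking_cost(THINKING_TOKENS["xhigh"])
medium = thinking_cost(THINKING_TOKENS["medium"])
saving = 1 - medium / xhigh

print(f"xhigh: ${xhigh:.2f}, medium: ${medium:.2f}, saving: {saving:.0%}")
# xhigh: $2.00, medium: $0.60, saving: 70%
```

Replace the two counts with thinking-token usage you actually observe at each effort level to get a number you can act on.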

Lever 3: Task Budgets, the Hard Cap

Effort is a dial. Task budgets are a wall.

The task-budgets beta (header task-budgets-2026-03-13, minimum 20,000 tokens) lets you cap the total tokens an agentic loop can consume. Where effort nudges spend down on average, a task budget guarantees a single autonomous run cannot blow past a ceiling you set.

This matters more on Fable 5 than on any prior model, because its turns run long by design. Individual hard requests can run for minutes at higher effort, and autonomous runs can extend for hours. One launch-day user reported Fable 5 "eating my Max 20x plan at ~2% per minute." A task budget is how you make sure a runaway loop stops at a number you chose instead of a number the model chose.
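To make the idea of a ceiling concrete, here is a client-side analogue: a small tracker that tallies token usage per turn and refuses to start another turn once the budget is spent. This is not the task-budgets beta itself, and `TokenBudget` and `run_loop` are hypothetical names invented for this sketch. It replays made-up per-turn usage numbers rather than calling any API.

```python
class TokenBudget:
    """Tracks total tokens used by an agentic run against a fixed ceiling."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0

    def charge(self, input_tokens: int, output_tokens: int) -> None:
        """Record the usage reported for one completed turn."""
        self.used += input_tokens + output_tokens

    @property
    def remaining(self) -> int:
        return max(self.limit - self.used, 0)


def run_loop(turn_usages, budget: TokenBudget) -> int:
    """Replay (input, output) usage per turn; stop once the budget is spent.

    Returns the number of turns that ran.
    """
    completed = 0
    for input_tokens, output_tokens in turn_usages:
        if budget.remaining <= 0:
            break  # ceiling reached: do not start another turn
        budget.charge(input_tokens, output_tokens)
        completed += 1
    return completed


budget = TokenBudget(limit=50_000)
turns = run_loop([(10_000, 5_000)] * 10, budget)
print(turns, budget.used)
# 4 60000
```

Note the overshoot: a client-side guard only learns a turn's usage after the turn finishes, so this run stops at 60,000 tokens against a 50,000 ceiling. That gap of up to one turn is the practical difference between counting tokens yourself and having the cap enforced for you.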

Lever 4: Batch API, Half Price for Offline Work

Anything that does not need to happen in real time should go through the batch API. It halves both rates: $5/$25 instead of $10/$50. For overnight evals, bulk document processing, and offline pipelines, that is a flat 50% off input and output alike.
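A quick way to size the saving for a given offline job, using the standard and batch rates from the rate card. The job shape (1,000 requests at 100K in and 20K out) is a made-up example.

```python
M = 1_000_000
STANDARD = {"input": 10.00, "output": 50.00}  # $/1M, real-time rates
BATCH = {"input": 5.00, "output": 25.00}      # $/1M, batch API rates


def job_cost(requests: int, input_tokens: int, output_tokens: int, rates: dict) -> float:
    """USD cost of a job made of identical requests at the given rates."""
    per_request = (input_tokens * rates["input"] + output_tokens * rates["output"]) / M
    return requests * per_request


# Hypothetical overnight eval: 1,000 requests, 100K in / 20K out each.
realtime = job_cost(1_000, 100_000, 20_000, STANDARD)
batched = job_cost(1_000, 100_000, 20_000, BATCH)
print(f"real time: ${realtime:,.2f}, batch: ${batched:,.2f}")
# real time: $2,000.00, batch: $1,000.00
```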

Lever 5: Routing, Only the Hard Tail Goes to Fable

The biggest lever is also the simplest. Most traffic does not need a Mythos-class model. Default routine work to Opus 4.8 or Sonnet 4.6 and send only the hard, long-horizon, failure-prone tail to Fable 5.

At enterprise scale the stakes are real. On pure routine output, billing analysts have modeled 5 billion output tokens a year at roughly $250,000 on Fable 5 versus $125,000 on Opus 4.8. For classification, summarization, and RAG retrieval, that delta buys nothing, because Opus already clears the quality bar. Promote a task to Fable only when a cheaper model demonstrably fails, loses the plan mid-task, or burns more total tokens through retries.
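Here is a rough sketch of both halves of that advice: a promotion rule built from the three criteria above, and a blended-cost calculation for the 5-billion-token scenario. The function names and the boolean flags are hypothetical; how you detect each condition is up to your own evals.

```python
OUTPUT_RATE = {"fable-5": 50.00, "opus-4.8": 25.00}  # $/1M output tokens


def pick_model(cheaper_model_failed: bool, lost_plan_mid_task: bool,
               retries_cost_more: bool) -> str:
    """Promote to Fable only when one of the three criteria holds."""
    if cheaper_model_failed or lost_plan_mid_task or retries_cost_more:
        return "fable-5"
    return "opus-4.8"


def annual_output_cost(total_output_tokens: float, fable_share: float) -> float:
    """USD cost of output tokens when `fable_share` of them go to Fable."""
    fable_tokens = total_output_tokens * fable_share
    opus_tokens = total_output_tokens - fable_tokens
    return (fable_tokens * OUTPUT_RATE["fable-5"]
            + opus_tokens * OUTPUT_RATE["opus-4.8"]) / 1_000_000


for share in (1.0, 0.1, 0.0):
    print(f"{share:.0%} on Fable: ${annual_output_cost(5_000_000_000, share):,.0f}")
# 100% on Fable: $250,000
# 10% on Fable: $137,500
# 0% on Fable: $125,000
```

Routing a 10% hard tail to Fable costs $12,500 more than an all-Opus setup in this scenario, against $125,000 more for sending everything.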

The Fallback Pricing Quirk

This is the part of Fable 5's pricing with no equivalent anywhere else, and it is worth understanding exactly.

Fable 5 runs safety classifiers for cybersecurity, biology and chemistry, and distillation. When one trips, the request is handled by Opus 4.8 instead, and you are billed at Opus rates. Anthropic says this fires in under 5% of sessions. Two billing cases follow, straight from the AWS launch documentation:

A whole request routed to Opus 4.8. If the classifier trips at the start, the entire response comes from Opus 4.8 and bills entirely at Opus prices ($5/$25). You are not charged Fable rates at all.

A request blocked mid-conversation. If the block happens partway through, the initial tokens (processed by Fable before the block) bill at Fable rates ($10/$50), and the subsequent tokens (the Opus response) bill at Opus rates ($5/$25). A single request, split across two rate cards.

Practically, this means bio, chem, or security-adjacent workloads get a quiet, partial discount whenever the classifier reroutes them. For those domains the fallback rate is higher than the 5% average, because the classifiers are deliberately broad. It is unpredictable, which is its own reason to route that traffic to Opus by choice rather than discover it on the invoice.
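The two billing cases can be sketched as one function that prices each segment of a request at its own rate card. The token splits in the example are invented for illustration; in particular, how input tokens are divided between the Fable and Opus segments of a mid-conversation block is an assumption of this sketch, so check your own invoice line items.

```python
M = 1_000_000
FABLE = {"input": 10.00, "output": 50.00}  # $/1M
OPUS = {"input": 5.00, "output": 25.00}    # $/1M


def segment_cost(input_tokens: int, output_tokens: int, rates: dict) -> float:
    """USD cost of one segment of a request at a single rate card."""
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / M


def request_cost(fable_in: int, fable_out: int, opus_in: int, opus_out: int) -> float:
    """USD cost of a request whose tokens are split across both rate cards."""
    return segment_cost(fable_in, fable_out, FABLE) + segment_cost(opus_in, opus_out, OPUS)


# Case 1: classifier trips at the start, so nothing bills at Fable rates.
whole_fallback = request_cost(0, 0, 100_000, 20_000)

# Case 2 (hypothetical split): Fable handles 60K in / 5K out before the block,
# then Opus handles the remaining 40K in / 15K out.
mid_block = request_cost(60_000, 5_000, 40_000, 15_000)

print(f"whole fallback: ${whole_fallback:.3f}, mid-conversation: ${mid_block:.3f}")
# whole fallback: $1.000, mid-conversation: $1.425
```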

One implementation note: a refusal can arrive as a successful HTTP 200 with stop_reason: "refusal". Production code must check the stop reason instead of assuming every 200 is a billed Fable answer. API customers also have to configure server-side or client-side fallback to Opus 4.8 explicitly; it is not automatic the way it is in the Claude apps.
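A minimal sketch of the client-side version of that check. It treats the response as a plain dictionary with a `stop_reason` field and takes the fallback as a callable you supply; `answer_or_fallback` and `retry_on_fallback` are hypothetical names, and the actual retry against Opus 4.8 is left to your own client code.

```python
from typing import Any, Callable, Mapping

Response = Mapping[str, Any]


def answer_or_fallback(response: Response,
                       retry_on_fallback: Callable[[], Response]) -> Response:
    """Return the response, or the fallback's response if it was a refusal.

    An HTTP 200 is not enough: the stop reason has to be inspected too.
    """
    if response.get("stop_reason") == "refusal":
        return retry_on_fallback()
    return response


# Usage with stand-in responses (no network calls are made here).
refused = {"stop_reason": "refusal", "content": []}
fallback_reply = {"stop_reason": "end_turn", "content": [{"type": "text", "text": "ok"}]}

result = answer_or_fallback(refused, lambda: fallback_reply)
print(result["stop_reason"])
# end_turn
```

Logging which branch was taken per request also tells you how often your workload is falling back, which is the number you need to reconcile the invoice.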

When the 2x Sticker Lies

The rate card says double. On the right task, your effective cost lands below the cheaper model.

A frontier physics lab reported Fable 5 was its strongest model "while using a third of the reasoning tokens," reaching in 36 hours nearly where GPT-5.5 landed after four days. The math favors Fable: measured against a baseline priced like Opus 4.8, one-third the tokens at twice the per-token price is two-thirds the effective cost. On that class of long, deliberate work, Fable 5 is cheaper, not pricier.
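The arithmetic generalizes to a single multiplication, which also gives you the break-even point. This sketch assumes the comparison model bills at half Fable's per-token rate, as Opus 4.8 does; the function name is made up for the example.

```python
def effective_cost_ratio(token_ratio: float, price_ratio: float = 2.0) -> float:
    """Fable's cost relative to the baseline for the same task.

    token_ratio: Fable tokens divided by baseline tokens for the task.
    price_ratio: Fable per-token price divided by baseline price (2.0 vs Opus 4.8).
    """
    return token_ratio * price_ratio


print(f"{effective_cost_ratio(1 / 3):.2f}")  # 0.67: two-thirds the cost
print(f"{effective_cost_ratio(0.5):.2f}")    # 1.00: break-even
print(f"{effective_cost_ratio(1.0):.2f}")    # 2.00: same tokens, double the bill
```

At a 2x price ratio, Fable has to use fewer than half the tokens of the baseline before it comes out cheaper on a task.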

The same pattern shows up elsewhere. A spreadsheet suite found Fable 5 beats Opus 4.8 at every effort level with fewer turns, finishing 25 to 30% faster. Base44 said apps that "took a hundred prompts a year ago, it now one-shots." Rakuten: "the extra thinking pays for itself." And Stripe ran a migration on a 50-million-line Ruby codebase in one day, work that had been estimated at over two months of team effort; against the salary cost it replaced, the token bill is trivial.

Anthropic's Dianne Penn framed it for CNBC: customers want higher accuracy and benefit per dollar, early customers "noted an improvement in spend per task," and "you just get a higher ROI by having more intelligent models." The number to optimize is cost per completed task, not cost per token.
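One simple way to put a number on cost per completed task is to divide the cost of an attempt by the success rate. That assumes you retry independent attempts until one succeeds, so the expected number of attempts is 1 divided by the success rate. The success rates below are hypothetical placeholders; measure your own.

```python
def cost_per_completed_task(cost_per_attempt: float, success_rate: float) -> float:
    """Expected USD spend per successful task, retrying until success.

    Assumes attempts are independent and each costs the same.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Hypothetical success rates on a hard task, with the $2.00 / $1.00 attempt
# costs from the 100K-in/20K-out example earlier in the post.
fable = cost_per_completed_task(2.00, 0.90)
opus = cost_per_completed_task(1.00, 0.40)
print(f"Fable 5: ${fable:.2f} per completed task, Opus 4.8: ${opus:.2f}")
# Fable 5: $2.22 per completed task, Opus 4.8: $2.50
```

With these made-up rates the pricier model wins; move the cheaper model's success rate above 45% and it flips back, which is exactly why the measurement has to come from your own tasks.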

Lever 6: The Free Window Before June 22

There is a time-boxed lever that closes fast. Fable 5's subscription rollout is staged:

  • From June 9 through June 22, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost.
  • On June 23, Fable 5 leaves those plans. Using it after that requires usage credits. Anthropic says it may extend the window if capacity allows.
  • Eventually, Anthropic aims to restore Fable 5 as a standard part of subscription plans, with no committed date.

On the API and consumption-based Enterprise plans, Fable 5 is fully metered at $10/$50 from day one. But if you are on a subscription, the window through June 22 is a free evaluation period. Use it to run your real tasks on Fable 5, measure spend per completed task against Opus 4.8, and decide whether it earns a place on usage credits after the 23rd. After that, casual use becomes credit-metered, so the time to benchmark is now.

The Cost-Control Playbook

Put the levers together and the strategy is short:

  1. Route by task. Default routine traffic to Opus 4.8 or Sonnet 4.6; reserve Fable 5 for the hard, long-horizon tail.
  2. Turn effort down before switching models. Medium effort on Fable often beats Opus at its top effort, at a fraction of the thinking-token spend.
  3. Cache aggressively. Cache hits are $1/1M versus $10/1M fresh, a 10x lever on repeated context.
  4. Cap loops with task budgets. The beta header caps an agentic run (minimum 20,000 tokens) so it cannot run away.
  5. Batch offline work. Half price at $5/$25 for anything that does not need real time.
  6. Benchmark in the free window. Evaluate on real tasks by June 22, while it is free on subscription plans.

The Verdict

Fable 5's pricing is simple to state and easy to misread. The sticker is exactly 2x Opus 4.8, but the sticker is the wrong number. Effort, task budgets, caching, batch, routing, and Fable's own token efficiency all bend the real bill, sometimes below the cheaper model on hard tasks and well above it on routine ones.

Spend your attention on the levers, not the rate card. Route the easy work to cheaper models, send only the hard tail to Fable, tune effort and cap budgets, and use the free window to learn your own spend-per-task before the meter starts on June 23.

Frequently Asked Questions

How much does Claude Fable 5 cost?

Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens, exactly double Opus 4.8 ($5/$25). A 100K-in/20K-out task runs $2.00 on Fable 5 versus $1.00 on Opus 4.8. The batch API halves the rate to $5/$25, and cache hits drop input to $1 per million.

Why is Claude Fable 5 twice the price of Opus 4.8?

Fable 5 is the first publicly available Mythos-class model, a capability tier above the Opus class, and the price reflects that tier. It is still about a third of the price of Mythos Preview (~$30/$150), the restricted model it descends from. Anthropic argues the higher ROI per completed task can offset the per-token premium on hard work.

How does Claude Fable 5 fallback pricing work?

When Fable 5's safety classifiers route a request to Opus 4.8, you pay Opus rates, not Fable rates. If a request is blocked mid-conversation, the initial tokens bill at Fable rates and the subsequent tokens bill at Opus rates. This fallback fires in under 5% of sessions on typical workloads.

How do I control costs on Claude Fable 5?

Use five levers: lower the reasoning effort (medium often beats Opus at top effort), cap agentic loops with task budgets (minimum 20,000 tokens), cache repeated context (cache hits are $1/1M versus $10/1M), use the batch API for offline work (half price), and route only hard tasks to Fable while keeping routine work on Opus 4.8 or Sonnet 4.6.

Is Claude Fable 5 free right now?

On Pro, Max, Team, and seat-based Enterprise subscription plans, Fable 5 is included at no extra cost from June 9 through June 22, 2026. On June 23 it leaves those plans and requires usage credits. On the API and consumption-based Enterprise plans, it is metered at $10/$50 from day one.

Can token efficiency make Claude Fable 5 cheaper than Opus 4.8?

On the right task, yes. A physics lab reported Fable 5 using a third of the reasoning tokens of a rival model, which at 2x the per-token price works out to two-thirds the effective cost. Fewer turns and higher first-try success rates reduce spend per completed task, even though the rate card is double.

Sources

  • Claude Fable 5 and Claude Mythos 5
  • Claude Fable 5 on AWS (AWS News Blog)
  • Anthropic's Claude Fable 5 is a version of Mythos the public can access today (TechCrunch)
  • Anthropic releases Mythos-like AI model to the public (CNBC)
  • Prompting Claude Fable 5 (API docs)
  • Claude Fable 5 and Mythos 5 pricing and benchmarks (Finout)
  • Claude Fable 5 vs Opus 4.8 (TrueFoundry)
