Claude Fable 5 Pricing & Cost Control

Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens. That is exactly double Opus 4.8 ($5/$25), and roughly a third of the price of Mythos Preview (~$30/$150), the restricted model it descends from.

A representative 100K-in/20K-out task costs $2.00 on Fable 5 versus $1.00 on Opus 4.8. The sticker is 2x, but your actual bill depends on five levers you control: reasoning effort, task budgets, prompt caching, the batch API, and which traffic you route to Fable at all.

Fable 5 is the first publicly available Mythos-class model, a tier above Opus. The price reflects the tier, and it arrives at a moment when enterprises are increasingly critical of AI costs. TechCrunch noted the $10/$50 rate "alone might serve as a deterrent for widespread use." This post is the math and the playbook for keeping it under control.

The Rate Card

Every line of Fable 5's pricing is precisely double Opus 4.8.

Token type	Claude Fable 5	Claude Opus 4.8
Input	$10 / 1M	$5 / 1M
Output	$50 / 1M	$25 / 1M
Batch API input	$5 / 1M	$2.50 / 1M
Batch API output	$25 / 1M	$12.50 / 1M
5-min cache write	$12.50 / 1M	$6.25 / 1M
1-hour cache write	$20 / 1M	$10 / 1M
Cache hits & refreshes	$1 / 1M	$0.50 / 1M

One framing worth keeping in mind: Fable 5's standard $10/$50 is the same per-token rate as Opus 4.8's fast mode. You are paying Opus-fast-mode prices for a model that sits a full tier higher. Whether that is a deal depends entirely on the task, which is the question the rest of this post works through.

What a Task Actually Costs

Start with the base case so the sticker is concrete. Take a 100K-in/20K-out call.

On Fable 5:

input:  100,000 tokens × $10/1M = $1.00
output:  20,000 tokens × $50/1M = $1.00
total                           = $2.00

On Opus 4.8:

input:  100,000 tokens × $5/1M  = $0.50
output:  20,000 tokens × $25/1M = $0.50
total                           = $1.00

Exactly 2x at identical token usage. A smaller 50K-in/10K-out coding call is $1.00 on Fable versus $0.50 on Opus. The ratio never changes on the rate card. What changes is everything around it.
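The arithmetic above reduces to one small function. The sketch below is illustrative only: `RATES` and `call_cost` are names invented for this example, the dictionary keys are plain labels rather than API model IDs, and the dollar figures are the rate-card values quoted in this post.

```python
# Per-million-token rates in USD, copied from the rate card in this post.
# The keys are labels for this example, not official API model identifiers.
RATES = {
    "fable-5": {"input": 10.00, "output": 50.00},
    "opus-4.8": {"input": 5.00, "output": 25.00},
}


def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the uncached, non-batch cost of a single call in USD."""
    rate = RATES[model]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


if __name__ == "__main__":
    for model in RATES:
        print(f"{model}: ${call_cost(model, 100_000, 20_000):.2f}")
```

Plugging in your own token counts per call type is the quickest way to see which workloads dominate the bill before any of the levers are applied.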

The case that hurts is long context. A near-1M-token prompt at $10 per million input is roughly a $9 input bill before Fable writes a single useful token:

input:  900,000 tokens × $10/1M =  $9.00
output:   5,000 tokens × $50/1M =  $0.25
total                           =  $9.25 per call

Run that uncached across a workflow and the bill compounds fast. That is where the first lever comes in.

Lever 1: Caching, the 10x Discount on Repeated Context

Cache hits on Fable 5 cost $1 per million tokens, versus $10 per million for fresh input. That is a 10x reduction on any context you reuse.

Take the $9.25 long-context call above and assume the 900K of context is a cache hit:

cached input:  900,000 tokens × $1/1M  =  $0.90
output:          5,000 tokens × $50/1M =  $0.25
total                                  =  $1.15 per call

From $9.25 to $1.15. If your agent reads the same large repo, spec, or document set across many calls, caching is the single biggest cost lever you have. The cache write costs a premium once ($12.50/1M for the 5-minute tier, $20/1M for the 1-hour tier), then every hit is cheap.
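The write premium means caching is not free on the first call, so it helps to see where it breaks even. The sketch below is a simplified model under stated assumptions: the whole context is cacheable, the first call pays the 5-minute write rate, and every later call arrives while the cache is still live. The function names are invented for this example.

```python
# Fable 5 rates in USD per million tokens, from the rate card in this post.
INPUT = 10.00
OUTPUT = 50.00
CACHE_WRITE_5MIN = 12.50
CACHE_HIT = 1.00


def per_million(tokens: int, rate: float) -> float:
    """Convert a token count and a per-million rate into USD."""
    return tokens * rate / 1_000_000


def uncached_total(calls: int, context: int, output: int) -> float:
    """Every call pays the fresh-input rate on the full context."""
    return calls * (per_million(context, INPUT) + per_million(output, OUTPUT))


def cached_total(calls: int, context: int, output: int) -> float:
    """First call writes the cache; later calls are ASSUMED to hit it."""
    if calls <= 0:
        return 0.0
    first = per_million(context, CACHE_WRITE_5MIN) + per_million(output, OUTPUT)
    later = per_million(context, CACHE_HIT) + per_million(output, OUTPUT)
    return first + (calls - 1) * later
```

With the 900K-context, 5K-output call from above, a single call costs more with caching ($11.50 versus $9.25) because of the write premium, but two calls already come out ahead ($12.65 versus $18.50). Under these assumptions, caching pays for itself on the second reuse.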

Lever 2: Effort, the Soft Dial

Effort is, in Anthropic's words, "the primary control for the trade-off between intelligence, latency, and cost on Claude Fable 5." It sets how many thinking tokens the model spends, and thinking tokens bill as output at $50 per million.

The levels are low, medium, high (the default), xhigh, and max. The guidance is to use high for most tasks, xhigh for the most capability-sensitive work, max when correctness outweighs cost, and medium or low for routine jobs. The key insight for cost: lower effort settings on Fable 5 "still perform well and often exceed xhigh performance on prior models."

Read that twice. Fable 5 at medium effort can match or beat a prior model such as Opus 4.8 running at xhigh. Anthropic's FrontierCode result backs it up: Fable leads frontier models even at medium effort. So the cost lever is frequently not "switch to a cheaper model." It is "turn Fable's effort down."

Picture a hard task that burns roughly 40K thinking tokens at xhigh and 12K at medium (illustrative; Anthropic does not publish exact per-effort counts). At $50 per million, that is about $2.00 of thinking versus $0.60 on the same job. Anthropic's own advice is direct: "Reduce effort if a task completes but takes longer than necessary." Note that adaptive thinking is always on, so you can shrink the depth but you cannot turn thinking off.
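The same comparison as a few lines of arithmetic. The token counts are the illustrative ones from the paragraph above, not published figures, and the names are invented for this sketch.

```python
OUTPUT_RATE = 50.00  # USD per million output tokens on Fable 5, per this post

# Illustrative thinking-token counts only; Anthropic does not publish these.
THINKING_TOKENS = {"xhigh": 40_000, "medium": 12_000}


def thinking_cost(tokens: int) -> float:
    """Thinking tokens bill as output, so price them at the output rate."""
    return tokens * OUTPUT_RATE / 1_000_000


costs = {level: thinking_cost(n) for level, n in THINKING_TOKENS.items()}
# Fraction of thinking spend saved by dropping from xhigh to medium.
savings = 1 - costs["medium"] / costs["xhigh"]

if __name__ == "__main__":
    print(costs, f"savings: {savings:.0%}")
```

On these assumed counts the drop from xhigh to medium removes 70% of the thinking spend on the task. Your own ratio will differ, so it is worth measuring on a sample of real requests.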

Lever 3: Task Budgets, the Hard Cap

Effort is a dial. Task budgets are a wall.

The task-budgets beta (header task-budgets-2026-03-13, minimum 20,000 tokens) lets you cap the total tokens an agentic loop can consume. Where effort nudges spend down on average, a task budget guarantees a single autonomous run cannot blow past a ceiling you set.

This matters more on Fable 5 than on any prior model, because its turns run long by design. Individual hard requests can run for minutes at higher effort, and autonomous runs can extend for hours. One launch-day user reported Fable 5 "eating my Max 20x plan at ~2% per minute." A task budget is how you make sure a runaway loop stops at a number you chose instead of a number the model chose.
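For comparison, here is what a client-side guard looks like. This is not the task-budgets beta itself, whose request shape is not described in this post; it is a plain loop counter you could run in your own agent harness. `step` is a hypothetical callable standing in for one model turn.

```python
from typing import Callable, Tuple


def run_with_budget(
    step: Callable[[], Tuple[int, bool]], max_tokens: int
) -> Tuple[int, bool]:
    """Call `step` until it reports completion or the token ceiling is reached.

    `step` is a HYPOTHETICAL callable that performs one model turn and returns
    (tokens_used_this_turn, task_finished).
    Returns (total_tokens_spent, task_finished).
    """
    spent = 0
    while spent < max_tokens:
        used, done = step()
        spent += used
        if done:
            return spent, True
    # Ceiling reached before the task finished.
    return spent, False
```

A guard like this can only check between turns, so it can overshoot: with a 20,000-token ceiling and turns of 8,000 tokens each, the loop stops at 24,000. That gap is the practical argument for a server-side cap over a purely client-side one.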

Lever 4: Batch API, Half Price for Offline Work

Anything that does not need to happen in real time should go through the batch API. It cuts the rate in half: $5/$25 instead of $10/$50. For overnight evals, bulk document processing, and offline pipelines, that is a flat 50% off both input and output tokens.

Lever 5: Routing, Only the Hard Tail Goes to Fable

The broadest lever is also the simplest. Most traffic does not need a Mythos-class model. Default routine work to Opus 4.8 or Sonnet 4.6 and send only the hard, long-horizon, failure-prone tail to Fable 5.

At enterprise scale the stakes are real. On pure routine output, billing analysts have modeled 5 billion output tokens a year at roughly $250,000 on Fable 5 versus $125,000 on Opus 4.8. For classification, summarization, and RAG retrieval, that delta buys nothing, because Opus already clears the quality bar. Promote a task to Fable only when a cheaper model demonstrably fails, loses the plan mid-task, or burns more total tokens through retries.
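A blended-rate calculation makes the routing trade-off concrete. In the sketch below, `fable_share` is a hypothetical routing fraction you would choose yourself; the rates are the output prices from this post, and only output tokens are modeled, matching the example above.

```python
# USD per million output tokens, from the rate card in this post.
OUTPUT_RATE = {"fable-5": 50.00, "opus-4.8": 25.00}


def annual_output_cost(tokens: int, fable_share: float) -> float:
    """Output cost when `fable_share` of tokens go to Fable 5, the rest to Opus 4.8."""
    if not 0.0 <= fable_share <= 1.0:
        raise ValueError("fable_share must be between 0 and 1")
    millions = tokens / 1_000_000
    blended_rate = (
        fable_share * OUTPUT_RATE["fable-5"]
        + (1 - fable_share) * OUTPUT_RATE["opus-4.8"]
    )
    return millions * blended_rate
```

At 5 billion output tokens, sending everything to Fable 5 gives the $250,000 figure and sending nothing gives $125,000. Routing a hypothetical 10% hard tail to Fable lands at $137,500, which is a 10% premium over all-Opus rather than a 100% one.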

The Fallback Pricing Quirk

This is the part of Fable 5's pricing with no equivalent anywhere else, and it is worth understanding exactly.

Fable 5 runs safety classifiers for cybersecurity, biology and chemistry, and distillation. When one trips, the request is handled by Opus 4.8 instead, and you are billed at Opus rates. Anthropic says this fires in under 5% of sessions. Two billing cases follow, straight from the AWS launch documentation:

A whole request routed to Opus 4.8. If the classifier trips at the start, the entire response comes from Opus 4.8 and bills entirely at Opus prices ($5/$25). You are not charged Fable rates at all.

A request blocked mid-conversation. If the block happens partway through, the initial tokens (processed by Fable before the block) bill at Fable rates ($10/$50), and the subsequent tokens (the Opus response) bill at Opus rates ($5/$25). A single request, split across two rate cards.
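The split case is easier to reason about with numbers. The sketch below prices each segment on its own rate card and sums them. How tokens are actually divided between the two segments is not specified in this post, so the per-segment counts are inputs you would read off your own usage report, and the function names are invented for this example.

```python
# (input, output) rates in USD per million tokens, from this post.
RATES = {"fable-5": (10.00, 50.00), "opus-4.8": (5.00, 25.00)}


def segment_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Price one segment of a request on a single model's rate card."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def split_request_cost(
    fable_in: int, fable_out: int, opus_in: int, opus_out: int
) -> float:
    """Total for a request billed partly at Fable rates and partly at Opus rates."""
    return segment_cost("fable-5", fable_in, fable_out) + segment_cost(
        "opus-4.8", opus_in, opus_out
    )
```

With a hypothetical split of 30,000 input and 1,000 output tokens on Fable, then 10,000 input and 4,000 output tokens on Opus, the request bills $0.50. The same 40,000 input and 5,000 output tokens entirely on Fable would bill $0.65.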

Practically, this means bio, chem, or security-adjacent workloads get a quiet, partial discount whenever the classifier reroutes them. For those domains the fallback rate is likely higher than the overall under-5% figure, because the classifiers are deliberately broad. Which requests get rerouted is unpredictable, which is its own reason to route that traffic to Opus by choice rather than discover the split on the invoice.

One implementation note: refusals can arrive as a successful HTTP 200 with stop_reason: "refusal". Production code must check the stop reason instead of assuming every 200 is a billed Fable answer. API customers also have to configure server-side or client-side fallback to Opus 4.8 explicitly; it is not automatic the way it is in the Claude apps.
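A client-side fallback can be as small as one conditional. The sketch below is a minimal illustration, not Anthropic's implementation: `send` is a hypothetical wrapper around your HTTP call that takes a model name and returns the parsed response body as a dict, and the model names are passed in by the caller rather than assumed here. The only field it relies on is the stop_reason value described above.

```python
from typing import Any, Callable, Dict, Tuple

Response = Dict[str, Any]


def answer_with_fallback(
    send: Callable[[str], Response], primary: str, fallback: str
) -> Tuple[str, Response]:
    """Try `primary`; if the body reports a refusal, retry once on `fallback`.

    `send` is a HYPOTHETICAL callable: model name in, parsed response dict out.
    Returns (model_that_answered, response) so spend can be attributed correctly.
    """
    response = send(primary)
    # A refusal can arrive with HTTP 200, so inspect the body, not the status.
    if response.get("stop_reason") == "refusal":
        return fallback, send(fallback)
    return primary, response
```

Returning the model name alongside the response keeps cost attribution honest in your own logs. If the fallback also refuses, this sketch hands that refusal back unchanged, so the caller still needs to check the stop reason on the result.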

When the 2x Sticker Lies

The rate card says double. On the right task, your effective cost lands below the cheaper model.

A frontier physics lab reported Fable 5 was its strongest model "while using a third of the reasoning tokens," reaching in 36 hours nearly where GPT-5.5 landed after four days. Against a baseline priced at Opus 4.8 rates, the math favors Fable: one-third the tokens at twice the per-token price is two-thirds the effective cost. On that class of long, deliberate work, Fable 5 is cheaper, not pricier.
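The effective-cost claim is a product of two ratios, which also gives a break-even point. The sketch below assumes the baseline model is priced at half of Fable's per-token rate, as Opus 4.8 is; the lab's own comparison model may be priced differently. The function names are invented for this example.

```python
def effective_cost_ratio(token_ratio: float, price_ratio: float) -> float:
    """Cost of the pricier model relative to the baseline.

    token_ratio: tokens used by the pricier model / tokens used by the baseline.
    price_ratio: per-token price of the pricier model / price of the baseline.
    """
    return token_ratio * price_ratio


def break_even_token_ratio(price_ratio: float) -> float:
    """Token ratio below which the pricier model is cheaper per task."""
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    return 1.0 / price_ratio


if __name__ == "__main__":
    print(round(effective_cost_ratio(1 / 3, 2.0), 3))
    print(break_even_token_ratio(2.0))
```

At a 2x price ratio the break-even is 0.5: Fable has to finish the job in fewer than half the tokens of the cheaper model before it wins on cost. A third of the tokens clears that bar; 60% of the tokens would not.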

The same pattern shows up elsewhere. A spreadsheet suite found Fable 5 beats Opus 4.8 at every effort level with fewer turns, finishing 25 to 30% faster. Base44 said apps that "took a hundred prompts a year ago, it now one-shots." Rakuten: "the extra thinking pays for itself." And Stripe ran a migration on a 50-million-line Ruby codebase in one day, a job that had been estimated at over two months of team effort. At that scale the token bill is trivial against the salary it replaced.

Anthropic's Dianne Penn framed it for CNBC: customers want higher accuracy and benefit per dollar, early customers "noted an improvement in spend per task," and "you just get a higher ROI by having more intelligent models." The number to optimize is cost per completed task, not cost per token.

The Subscription Window (Now Closed)

Fable 5's subscription rollout was staged, and the free evaluation window has passed:

From June 9 through June 22, 2026, Fable 5 was included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost.
On June 23 it left those plans. Using it on a subscription since then requires usage credits. Anthropic said it might extend the window if capacity allowed, and aims to eventually restore Fable 5 as a standard part of subscription plans, with no committed date, so check your current plan terms.
On the API and consumption-based Enterprise plans, Fable 5 has been metered at $10/$50 since day one, unchanged.

The benchmark still matters even though the free window is gone: run your real tasks on Fable 5, measure spend per completed task against Opus 4.8, and decide whether it earns a place on your usage credits before you route budget to it.
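Spend per completed task is simple to compute once each run is logged with its cost and outcome. The sketch below is one possible way to do it; `Run` and `spend_per_completed_task` are names invented for this example. Failed runs still count toward spend, which is the point of the metric.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    cost_usd: float   # total billed cost of the run, including retries
    completed: bool   # whether the run produced an acceptable result


def spend_per_completed_task(runs: List[Run]) -> float:
    """Total spend across all runs divided by the number that succeeded."""
    completed = sum(1 for run in runs if run.completed)
    if completed == 0:
        return float("inf")  # money spent, nothing finished
    return sum(run.cost_usd for run in runs) / completed
```

As a purely invented example: ten runs at $2.00 with nine successes come to about $2.22 per completed task, while ten runs at $1.00 with four successes come to $2.50. The cheaper rate card loses once the failure rate is high enough, and only your own workloads can tell you where that line sits.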

The Cost-Control Playbook

Put the levers together and the strategy is short:

Route by task. Default routine traffic to Opus 4.8 or Sonnet 4.6; reserve Fable 5 for the hard, long-horizon tail.
Turn effort down before switching models. Medium effort on Fable often beats prior models such as Opus 4.8 at xhigh, at a fraction of the thinking-token spend.
Cache aggressively. Cache hits are $1/1M versus $10/1M fresh, a 10x lever on repeated context.
Cap loops with task budgets. The beta header caps an agentic run (minimum 20,000 tokens) so it cannot run away.
Batch offline work. Half price at $5/$25 for anything that does not need real time.
Benchmark before committing credits. Measure spend per completed task against Opus 4.8 on your real workloads before you route budget to Fable.

The Verdict

Fable 5's pricing is simple to state and easy to misread. The sticker is exactly 2x Opus 4.8, but the sticker is the wrong number. Effort, task budgets, caching, batch, routing, and Fable's own token efficiency all bend the real bill, sometimes below the cheaper model on hard tasks and well above it on routine ones.

Spend your attention on the levers, not the rate card. Route the easy work to cheaper models, send only the hard tail to Fable, tune effort and cap budgets, and measure your own spend-per-task against Opus 4.8 before you commit usage credits.

Frequently Asked Questions

How much does Claude Fable 5 cost?

Claude Fable 5 costs $10 per million input tokens and $50 per million output tokens, exactly double Opus 4.8 ($5/$25). A 100K-in/20K-out task runs $2.00 on Fable 5 versus $1.00 on Opus 4.8. The batch API halves the rate to $5/$25, and cache hits drop input to $1 per million.

Why is Claude Fable 5 twice the price of Opus 4.8?

Fable 5 is the first publicly available Mythos-class model, a capability tier above the Opus class, and the price reflects that tier. It is still roughly a third of the price of Mythos Preview (~$30/$150), the restricted model it descends from. Anthropic argues the higher ROI per completed task can offset the per-token premium on hard work.

How does Claude Fable 5 fallback pricing work?

When Fable 5's safety classifiers route a request to Opus 4.8, you pay Opus rates, not Fable rates. If a request is blocked mid-conversation, the initial tokens bill at Fable rates and the subsequent tokens bill at Opus rates. This fallback fires in under 5% of sessions on typical workloads.

How do I control costs on Claude Fable 5?

Use five levers: lower the reasoning effort (medium often beats prior models such as Opus 4.8 at xhigh), cap agentic loops with task budgets (minimum 20,000 tokens), cache repeated context (cache hits are $1/1M versus $10/1M), use the batch API for offline work (half price), and route only hard tasks to Fable while keeping routine work on Opus 4.8 or Sonnet 4.6.

Is Claude Fable 5 free right now?

The free subscription window (June 9 through June 22, 2026, on Pro, Max, Team, and seat-based Enterprise plans) has closed. Since June 23, using Fable 5 on a subscription requires usage credits; Anthropic signaled it may extend or later restore standard access, so check your current plan terms. On the API and consumption-based Enterprise plans, it is metered at $10/$50 from day one.

Can token efficiency make Claude Fable 5 cheaper than Opus 4.8?

On the right task, yes. A physics lab reported Fable 5 using a third of the reasoning tokens of a rival model, which at 2x the per-token price works out to two-thirds the effective cost. Fewer turns and higher first-try success rates reduce spend per completed task, even though the rate card is double.