Does the June 15 Claude Code billing change affect running Fable 5 in agents?

It would have, but the change was paused on the day it was due to take effect. Under the plan, starting June 15, 2026, Anthropic would have split Claude subscription usage into two pools: interactive use (web, desktop, the Claude Code terminal) staying on your standard plan, and Agent SDK and headless agent runs (claude -p, GitHub Actions, third-party agent apps) moving to a separate credit pool metered at API rates. For now, those runs still draw from your normal subscription limits. Fable 5 is token-hungry, so if you fan it out across autonomous agents, it is still worth knowing which pool your runs would land in.

When does free Fable 5 access end?

Fable 5 is included at no extra cost on Pro, Max, Team, and seat-based Enterprise plans from launch through June 22, 2026. On June 23 it moves to usage credits, metered at $10 per million input tokens and $50 per million output tokens, until Anthropic restores standard access as capacity allows.

Claude Fable 5 Use Cases

Update: this billing change was paused. Anthropic paused the June 15, 2026 Claude Code billing split on the day it was due to take effect. The Agent SDK, claude -p, and GitHub Actions still draw from your normal subscription limits, and the separate per-user credit pool is not live. Anything below that describes the split describes what it would have done. For the current picture, see Claude Code costs after June 15.

In its first days of early access, Claude Fable 5 ran a codebase-wide migration across Stripe's 50-million-line Ruby codebase in a single day, work a whole team would have spent over two months doing by hand. It also rebuilt a web app's source code from screenshots alone, broke 90% on Hex's analytics benchmark, and shipped what an independent developer described as several days' worth of library features in a matter of hours.

This is not a feature list. It is a proof list. Below is what real teams and one very public independent tester actually did with claude-fable-5 in its first 48 hours, with the names and the numbers attached.

A note on sourcing before you read. Most of these accounts come from early-access customers Anthropic quoted in its launch announcement, so they are first-party and vendor-curated. We flag which is which. The strongest independent signal comes from developer Simon Willison, who had no early access and ran his own tests on launch day.

The Proof List at a Glance

Company / test	Use case	Result
Stripe	Codebase-wide migration, 50M-line Ruby codebase	1 day vs over 2 months for a whole team
Cognition (Devin)	FrontierBench coding eval	Highest of any frontier model, even at medium effort
Cursor	Long-horizon coding (CursorBench)	State of the art; unlocked previously out-of-reach problems
GitHub	Complex long-horizon coding	Autonomy and reliability beyond previous benchmarks
Base44	One-shotting full apps	Apps that took 100 prompts a year ago now one-shot
Genspark	UI design and game coding	Beat every other model tested
Hebbia	Finance Benchmark (senior reasoning)	Highest score of any model
IMC	Trading-analysis evals	Aced them nearly across the board
Hex	Core analytics benchmark	First model to break 90%, a 10-point jump over Opus
Physics lab	Frontier physics research	One third of the reasoning tokens; 36 hours got near GPT-5.5's four days
Legal team	Contract redlines (blind review)	Matched or beat their current model every time
Spreadsheet suite	Everyday spreadsheet tasks	Beats Opus 4.8 at every effort, 25-30% faster
Rakuten	Highly autonomous operations	Validates its own work; "the extra thinking pays for itself"
Anthropic (vision)	Rebuild web app from screenshots	Reconstructed source from screenshots alone
Simon Willison	MicroPython to full CPython in WASM	Working installable wheel in a day

Coding, Migrations, and Long-Horizon Engineering

This is the category where Fable 5's lead is widest, and Anthropic is explicit about why: the longer and more complex the task, the bigger Fable's advantage over its other models.

The flagship example is Stripe. According to Anthropic's announcement, Stripe reported that Fable 5 "compressed months of engineering into days." In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand. That is the kind of work that normally gets scoped into quarters, not afternoons.

The agent and editor companies tell a consistent story. Cursor reported that Fable 5 is "the state of the art model on CursorBench" and that "it's opened up a class of long-horizon problems that were out of reach for earlier models." Cognition, the team behind Devin, said it is the highest-scoring model on their FrontierBench coding eval, that it "excels at long-horizon reasoning and generalizes to unfamiliar tools out of the box," and that it scores highest among frontier models even at medium effort. GitHub said that in early testing it took on complex, long-horizon coding tasks "with a level of autonomy and reliability that exceeded previous benchmarks."

For builders without a large legacy codebase, the vibe-coding numbers matter more. Base44 reported that "apps that took a hundred prompts a year ago, it now one-shots," and told TechCrunch that Fable is better at one-shotting full apps with excellent tool-calling. Genspark told TechCrunch that Fable beat every other model in its evaluations and was significantly better at UI design and game coding.

The one fully independent account comes from Simon Willison, who had no early access. In about five and a half hours on launch day, he used Fable inside Claude Code to add a human-in-the-loop pause-and-approve feature to his Datasette Agent project. When he told it that changes to his underlying LLM library were also in scope, it implemented four upstream features to support the work cleanly, then shipped them as a release. His verdict: "I spent several hours on it today, but it feels like several days' worth of work," and he praised the quality of the API design, tests, code, and documentation.

What this means for you: the unlock is not "writes code faster," it is "stays coherent across a job too big to babysit." If you have a migration, a refactor, or a feature that spans many files and would normally eat a sprint, this is the model you point at it. For small daily edits, Sonnet is still the cheaper, faster call.

Knowledge Work: Finance, Analytics, and Research

Fable 5 is not just a coding model. Some of the sharpest early results came from analysts.

Hex, the analytics platform, said Fable 5 was "the first to break 90% on our core analytics benchmark of complex, long-running analytical tasks," a 10-point jump over Opus, adding that "on the hardest questions, it shows strong judgment and attention to nuance." TechCrunch independently re-reported that result, which makes it one of the better-corroborated claims in the launch.

In finance, Hebbia reported that Fable 5 has the highest score of any model on its Finance Benchmark for senior-level reasoning, with substantial gains in document-based reasoning and chart and table interpretation. The trading firm IMC said Fable "aced their trading-analysis evaluations nearly across the board," including factual lookup, conceptual reasoning, root-cause analysis, and expected-value analysis.

The research results are the most striking. A physics research lab told Anthropic that Fable 5 is "the strongest model we've tested on frontier physics research while using a third of the reasoning tokens," and that "in 36 hours it got nearly to where GPT-5.5 landed after four days." Less compute, less time, comparable destination.

Even the unglamorous spreadsheet work improved. One customer reported that Fable beats Opus 4.8 on their everyday spreadsheet suite at every effort level, finishing runs 25 to 30% faster with fewer turns.

What this means for you: if your work is reading dense source material and getting the details right (finance memos, analytics pipelines, research synthesis), the gains here are about judgment under ambiguity, not raw speed. The token-efficiency angle is real too: runs that finish with fewer tokens at lower effort levels can offset the higher per-token price.
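To make the token-efficiency point concrete, here is a minimal sketch of the break-even arithmetic. It assumes only what the article states about pricing (Fable 5 costs double Opus 4.8 per token); the function names and the example token ratios are hypothetical and are not taken from any customer report.

```python
def relative_cost(price_multiplier: float, token_ratio: float) -> float:
    """Bill for the pricier model relative to the baseline (1.0 means the same bill).

    price_multiplier: per-token price of the pricier model divided by the baseline's.
    token_ratio: tokens the pricier model uses divided by tokens the baseline uses.
    """
    if price_multiplier <= 0 or token_ratio < 0:
        raise ValueError("price_multiplier must be positive and token_ratio non-negative")
    return price_multiplier * token_ratio


def break_even_token_ratio(price_multiplier: float) -> float:
    """Token ratio at which both models cost the same for a task."""
    if price_multiplier <= 0:
        raise ValueError("price_multiplier must be positive")
    return 1.0 / price_multiplier


# Article figure: Fable 5 is priced at double Opus 4.8 per token.
FABLE_VS_OPUS_PRICE = 2.0

# Hypothetical workloads: one where Fable needs 75% of the tokens, one where it needs 25%.
print(break_even_token_ratio(FABLE_VS_OPUS_PRICE))  # fraction of baseline tokens needed to break even
print(relative_cost(FABLE_VS_OPUS_PRICE, 0.75))     # above 1.0: Fable costs more
print(relative_cost(FABLE_VS_OPUS_PRICE, 0.25))     # below 1.0: Fable costs less
```

At double the per-token price, Fable 5 has to finish a task in half the tokens before the bill evens out, so it is worth measuring token counts on your own workload rather than assuming the savings.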

Vision: Screenshots In, Code Out

Anthropic calls Fable 5 the new state of the art for vision tasks, and the examples are concrete rather than abstract.

The headline one for builders: Fable 5 can rebuild a web app's source code from screenshots alone. It can also extract precise numbers from detailed scientific figures, the kind of chart-reading that usually requires a human to transcribe.

The clearest demonstration of how far the vision gains go is a game. Earlier Claude models struggled to play Pokemon FireRed even when given harnesses full of helper tools, maps, and game-state information. Fable 5 beat the game using a minimal, vision-only harness, working from nothing but raw screenshots. The model is doing the navigation and planning itself, off the pixels, instead of leaning on scaffolding someone built for it.

What this means for you: screenshot-to-code and figure-extraction are now reliable enough to put in a workflow. If you have design mocks, dashboard captures, or scientific PDFs, you can hand them over directly instead of transcribing first. Less scaffolding required is the practical theme: the model meets messy real interfaces with fewer custom tools.

Long-Running Agents, Memory, and Self-Validation

The trait that makes all of the above usable is what happens when no human is watching.

Rakuten put it plainly in a statement reported by TechCrunch: "At the highest effort, Claude Fable 5 reflects on and validates its own work. For us, that's what makes highly autonomous operations possible. The extra thinking pays for itself." That self-check is the difference between an agent you can leave running and one you have to re-verify line by line.

Memory compounds the effect. In Anthropic's own test, the model played the deck-building game Slay the Spire with access to persistent file-based memory. That memory improved Fable's performance three times more than it improved Opus 4.8's, and Fable reached the game's final act three times more often. The model is not just remembering; it is improving its own play from its own notes across a long run.

On the agent-orchestration side, Anthropic's documentation says Fable 5 is significantly more dependable at dispatching and sustaining parallel subagents and at managing communication with long-running ones. One early customer reported that it "delivers more capable engineering in fewer turns" while handling the complex multi-agent Claude Code workflows their employees run daily.

What this means for you: this is the model for work you kick off and walk away from. If you run agents overnight, fan out subagents across a large job, or build autonomous pipelines, the self-validation is the load-bearing feature. It is also why people are reaching for it on jobs Opus 4.8 could not finish unsupervised.

Science, via the Same Underlying Model

The most dramatic results came from Mythos 5, which is the same underlying model as Fable 5 with the safety classifiers lifted. Worth reading with one caveat: public Fable 5 falls back to Opus 4.8 on most biology and chemistry queries, so you cannot necessarily reproduce these on the public model. They show what the model class is capable of, not what an open API call will do today.

With that flagged, the numbers are notable. Anthropic's internal protein-design experts reported accelerating parts of the drug-design process by around ten times. Running with protein-design and bioinformatics tools but no human assistance, the model matched or beat skilled human operators, choosing binding sites, selecting and running tools, and recovering from its own failures. Nine of the 14 protein targets in the study yielded strong drug-design candidates.

In molecular biology, Anthropic's scientists preferred the model's hypotheses about 80% of the time over Opus-class models in blinded comparisons, and one hypothesis, a novel mechanism for an E. coli protein, was independently corroborated by another lab working on the same problem. In genomics, the model ran over a week of largely autonomous work, assembled single-cell data across 138 animal species, and trained a custom model that outperformed a recent Science-published model while being 100 times smaller.

What this means for you: unless you are in a trusted-access research program, treat these as a ceiling demo rather than a daily capability. The signal for builders is the shape of it: a model that can run for a week, recover from its own dead ends, and produce a result worth publishing is the same engine doing your migrations.

The Catch: Cost, Guardrails, and a Closing Window

Fable 5 is the most capable model Anthropic has released to the public, and the trade-offs are honest ones.

It is expensive. Pricing is $10 per million input tokens and $50 per million output tokens, double Opus 4.8 and the same as the much-pricier Mythos Preview was at half its old rate. Simon Willison burned $110 of tokens in a single day of testing. The model is also slow, the flip side of feeling, in his words, "something of a beast." The token-efficiency gains some customers reported can soften the bill, but you should measure on your own workloads before committing.
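For a rough sense of what a run costs at these rates, the arithmetic can be sketched as below. The prices are the ones quoted in this article; the function name and the example token counts are hypothetical, and the sketch ignores anything the article does not mention, such as caching or batch discounts.

```python
# Prices quoted in the article, in US dollars per million tokens.
FABLE_INPUT_PER_MTOK = 10.0
FABLE_OUTPUT_PER_MTOK = 50.0


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_per_mtok: float = FABLE_INPUT_PER_MTOK,
    output_per_mtok: float = FABLE_OUTPUT_PER_MTOK,
) -> float:
    """Return the metered cost in dollars for a given token count."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_per_mtok + output_tokens * output_per_mtok
    return total / 1_000_000


# Hypothetical agent run: 2M input tokens and 500k output tokens.
fable_cost = estimate_cost_usd(2_000_000, 500_000)

# The same token counts at half the rates, which is what "double Opus 4.8" implies.
opus_cost = estimate_cost_usd(2_000_000, 500_000, input_per_mtok=5.0, output_per_mtok=25.0)

print(f"Fable 5: ${fable_cost:.2f}, Opus 4.8 (implied): ${opus_cost:.2f}")
```

Output tokens cost five times as much as input tokens, so long reasoning traces and large generated diffs tend to dominate the bill. Plug in token counts from your own usage logs to get a figure worth trusting.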

There are guardrails. When Fable's classifiers detect a query about cybersecurity, biology and chemistry, or model distillation, the response is handled by Opus 4.8 instead and you are told. Anthropic's early data shows this happens in fewer than 5% of sessions, so for the vast majority of work you get Fable's full capability. But the fallbacks are tuned conservatively and will occasionally catch harmless requests.

There is also a clock. From launch through June 22, 2026, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost. On June 23 it leaves those plans and requires usage credits, with Anthropic aiming to restore it to standard subscriptions once capacity allows. If you want to test it on your own work without a separate bill, that window is the time.

One more thing to watch if you run Fable 5 inside agents. On June 15, 2026, Anthropic splits Claude subscription usage into two pools. Interactive work (web, desktop, the Claude Code terminal) stays on your plan, but Agent SDK and headless runs (claude -p, GitHub Actions, third-party agent apps) move to a separate credit pool metered at API rates. Fable 5 is the most token-hungry public model, so a fanned-out overnight agent run can drain that pool fast. See the June 15 billing change explained for which workflows land in which pool.

Frequently Asked Questions

What has Claude Fable 5 actually been used for?

Real early-access work, mostly large coding jobs and analysis. Stripe ran a codebase-wide migration in a 50-million-line Ruby codebase in one day. Hex broke 90% on its analytics benchmark. Hebbia and IMC topped their finance and trading evals. Anthropic also showed it rebuilding a web app's source from screenshots and playing Pokemon FireRed from raw pixels. Most accounts come from Anthropic's launch announcement, so they are first-party.

Is Claude Fable 5 good at coding?

The early evidence says yes, especially for big, long-running jobs. Cursor called it state of the art on CursorBench, Cognition ranked it highest on their FrontierBench coding eval, and GitHub reported autonomy and reliability beyond previous benchmarks. Independent tester Simon Willison shipped what he described as several days' worth of work in about five and a half hours with it. For small daily edits, a cheaper model like Sonnet is usually the better call.

How much does Claude Fable 5 cost?

It is $10 per million input tokens and $50 per million output tokens, double the price of Opus 4.8. The model ID is claude-fable-5. It is included free on Pro, Max, Team, and seat-based Enterprise plans through June 22, 2026, after which it requires usage credits until capacity allows a return to standard plans.

Why does Claude Fable 5 sometimes answer like a different model?

Fable 5 ships with safety classifiers. When a query touches cybersecurity, biology and chemistry, or attempts to distill the model, the response is handled by Opus 4.8 instead and you are notified. Anthropic says this fallback triggers in fewer than 5% of sessions, so most work runs on Fable 5 at full capability.

Can Claude Fable 5 do the science demos Anthropic showed?

Not directly on the public model in most cases. The protein design, genomics, and molecular biology results were produced by Mythos 5, the same underlying model with safeguards lifted, available only through trusted-access programs. Public Fable 5 falls back to Opus 4.8 on most biology and chemistry queries. Treat those results as a ceiling for the model class, not a daily public capability.

Is Claude Fable 5 worth it over Opus 4.8?

For long-horizon, autonomous, or high-stakes work, the early reports point to a clear step up. Customers consistently described it solving problems that were out of reach for earlier models, and it beats Opus 4.8 on benchmarks like the spreadsheet suite at every effort level. The trade-offs are real: double the price and slower runs. For routine work, Opus 4.8 or Sonnet remains the more economical choice.