Build This Now
speedy_devvkoen_salo

Claude Fable 5 Use Cases

What people actually did with Claude Fable 5 in early access: a Stripe migration in a day, Hex breaking 90% on analytics, web apps rebuilt from screenshots, and a coding agent that ships a week of work in an afternoon. Real implementations with names and numbers.

Stop configuring. Start building.

SaaS builder templates with AI orchestration.

Published Jun 10, 2026 · 12 min read · Model Picker hub

In its first days of early access, Claude Fable 5 ran a codebase-wide migration across Stripe's 50-million-line Ruby codebase in a single day, work a whole team would have spent over two months doing by hand. It also rebuilt a web app's source code from screenshots alone, broke 90% on Hex's analytics benchmark, and shipped what an independent developer described as several days' worth of work in a single afternoon.

This is not a feature list. It is a proof list. Below is what real teams and one very public independent tester actually did with claude-fable-5 in its first 48 hours, with the names and the numbers attached.

A note on sourcing before you read. Most of these accounts come from early-access customers Anthropic quoted in its launch announcement, so they are first-party and vendor-curated. We flag which is which. The strongest independent signal comes from developer Simon Willison, who had no early access and ran his own tests on launch day.

The Proof List at a Glance

| Company / test | Use case | Result |
| --- | --- | --- |
| Stripe | Codebase-wide migration, 50M-line Ruby codebase | 1 day vs over 2 months for a whole team |
| Cognition (Devin) | FrontierBench coding eval | Highest of any frontier model, even at medium effort |
| Cursor | Long-horizon coding (CursorBench) | State of the art; unlocked previously out-of-reach problems |
| GitHub | Complex long-horizon coding | Autonomy and reliability beyond previous benchmarks |
| Base44 | One-shotting full apps | Apps that took 100 prompts a year ago now one-shot |
| Genspark | UI design and game coding | Beat every other model tested |
| Hebbia | Finance Benchmark (senior reasoning) | Highest score of any model |
| IMC | Trading-analysis evals | Aced them nearly across the board |
| Hex | Core analytics benchmark | First model to break 90%, a 10-point jump over Opus |
| Physics lab | Frontier physics research | One third of the reasoning tokens; 36 hours got near GPT-5.5's four days |
| Legal team | Contract redlines (blind review) | Matched or beat their current model every time |
| Spreadsheet suite | Everyday spreadsheet tasks | Beats Opus 4.8 at every effort, 25-30% faster |
| Rakuten | Highly autonomous operations | Validates its own work; "the extra thinking pays for itself" |
| Anthropic (vision) | Rebuild web app from screenshots | Reconstructed source from screenshots alone |
| Simon Willison | MicroPython to full CPython in WASM | Working installable wheel in a day |

Coding, Migrations, and Long-Horizon Engineering

This is the category where Fable 5's lead is widest, and Anthropic is explicit about why: the longer and more complex the task, the bigger Fable's advantage over its other models.

The flagship example is Stripe. According to Anthropic's announcement, Stripe reported that Fable 5 "compressed months of engineering into days." In a 50-million-line Ruby codebase, the model performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand. That is the kind of work that normally gets scoped into quarters, not afternoons.

The agent and editor companies tell a consistent story. Cursor reported that Fable 5 is "the state of the art model on CursorBench" and that "it's opened up a class of long-horizon problems that were out of reach for earlier models." Cognition, the team behind Devin, said it is the highest-scoring model on their FrontierBench coding eval, that it "excels at long-horizon reasoning and generalizes to unfamiliar tools out of the box," and that it scores highest among frontier models even at medium effort. GitHub said that in early testing it took on complex, long-horizon coding tasks "with a level of autonomy and reliability that exceeded previous benchmarks."

For builders without a large legacy codebase, the vibe-coding numbers matter more. Base44 reported that "apps that took a hundred prompts a year ago, it now one-shots," and told TechCrunch that Fable is better at one-shotting full apps with excellent tool-calling. Genspark told TechCrunch that Fable beat every other model in its evaluations and was significantly better at UI design and game coding.

The one fully independent account comes from Simon Willison, who had no early access. In about five and a half hours on launch day, he used Fable inside Claude Code to add a human-in-the-loop pause-and-approve feature to his Datasette Agent project. When he told it that changes to his underlying LLM library were also in scope, it implemented four upstream features to support the work cleanly, then shipped them as a release. His verdict: "I spent several hours on it today, but it feels like several days' worth of work," and he praised the quality of the API design, tests, code, and documentation.

What this means for you: the unlock is not "writes code faster," it is "stays coherent across a job too big to babysit." If you have a migration, a refactor, or a feature that spans many files and would normally eat a sprint, this is the model you point at it. For small daily edits, Sonnet is still the cheaper, faster call.

Knowledge Work: Finance, Analytics, and Research

Fable 5 is not just a coding model. Some of the sharpest early results came from analysts.

Hex, the analytics platform, said Fable 5 was "the first to break 90% on our core analytics benchmark of complex, long-running analytical tasks," a 10-point jump over Opus, adding that "on the hardest questions, it shows strong judgment and attention to nuance." TechCrunch independently re-reported that result, which makes it one of the better-corroborated claims in the launch.

In finance, Hebbia reported that Fable 5 has the highest score of any model on its Finance Benchmark for senior-level reasoning, with substantial gains in document-based reasoning and chart and table interpretation. The trading firm IMC said Fable "aced their trading-analysis evaluations nearly across the board," including factual lookup, conceptual reasoning, root-cause analysis, and expected-value analysis.

The research results are the most striking. A physics research lab told Anthropic that Fable 5 is "the strongest model we've tested on frontier physics research while using a third of the reasoning tokens," and that "in 36 hours it got nearly to where GPT-5.5 landed after four days." Less compute, less time, comparable destination.

Even the unglamorous spreadsheet work improved. One customer reported that Fable beats Opus 4.8 on their everyday spreadsheet suite at every effort level, finishing runs 25 to 30% faster with fewer turns.

What this means for you: if your work is reading dense source material and getting the details right (finance memos, analytics pipelines, research synthesis), the gains here are about judgment under ambiguity, not raw speed. The token-efficiency angle is real too. Runs that finish in fewer tokens or at lower effort levels can offset the higher per-token price.
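To make the offset concrete, here is a minimal sketch of the break-even arithmetic. It assumes the 2x per-token price gap this article reports against Opus 4.8, and it borrows the physics lab's "one third of the reasoning tokens" figure purely as an illustration: the source does not say which model that baseline was, and the figure covers reasoning tokens only. The function names are ours, not part of any SDK.

```python
def relative_cost(price_ratio: float, token_ratio: float) -> float:
    """Cost of the pricier model relative to the cheaper one for the same job.

    price_ratio: per-token price of the new model divided by the old model's.
    token_ratio: tokens the new model uses divided by tokens the old model uses.
    A result below 1.0 means the pricier model ends up cheaper for that job.
    """
    return price_ratio * token_ratio


def break_even_token_ratio(price_ratio: float) -> float:
    """Token ratio at which both models cost the same for the job."""
    return 1.0 / price_ratio


if __name__ == "__main__":
    # Assumption: 2x price, one third of the tokens (illustrative only).
    print(f"relative cost: {relative_cost(2.0, 1 / 3):.2f}")
    print(f"break-even token ratio: {break_even_token_ratio(2.0):.2f}")
```

Under those assumptions the pricier model comes out at roughly two thirds of the cost, and it stops paying for itself once it needs more than half the tokens of the cheaper one. Your own ratio is the number worth measuring.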

Vision: Screenshots In, Code Out

Anthropic calls Fable 5 the new state of the art for vision tasks, and the examples are concrete rather than abstract.

The headline one for builders: Fable 5 can rebuild a web app's source code from screenshots alone. It can also extract precise numbers from detailed scientific figures, the kind of chart-reading that usually requires a human to transcribe.

The clearest demonstration of how far the vision gains go is a game. Earlier Claude models struggled to play Pokemon FireRed even when given harnesses full of helper tools, maps, and game-state information. Fable 5 beat the game using a minimal, vision-only harness, working from nothing but raw screenshots. The model is doing the navigation and planning itself, off the pixels, instead of leaning on scaffolding someone built for it.

What this means for you: screenshot-to-code and figure-extraction are now reliable enough to put in a workflow. If you have design mocks, dashboard captures, or scientific PDFs, you can hand them over directly instead of transcribing first. Less scaffolding required is the practical theme: the model meets messy real interfaces with fewer custom tools.

Long-Running Agents, Memory, and Self-Validation

The trait that makes all of the above usable is what happens when no human is watching.

Rakuten put it plainly in a statement reported by TechCrunch: "At the highest effort, Claude Fable 5 reflects on and validates its own work. For us, that's what makes highly autonomous operations possible. The extra thinking pays for itself." That self-check is the difference between an agent you can leave running and one you have to re-verify line by line.

Memory compounds the effect. In Anthropic's own test, the model played the deck-building game Slay the Spire with access to persistent file-based memory. That memory improved Fable's performance three times more than it improved Opus 4.8's, and Fable reached the game's final act three times more often. The model is not just remembering, it is improving its own play from its own notes across a long run.
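Anthropic has not published the harness it used, so the following is only a sketch of what "persistent file-based memory" can look like in an agent loop: notes are appended to a file after each run and read back at the start of the next. The `FileMemory` class, the JSON-lines format, and the field names are all hypothetical choices for illustration.

```python
import json
import tempfile
from pathlib import Path


class FileMemory:
    """Append-only notes persisted as JSON lines and reloaded on every run."""

    def __init__(self, path: Path) -> None:
        self.path = path

    def load(self) -> list[dict]:
        """Return all saved notes, or an empty list if nothing was saved yet."""
        if not self.path.exists():
            return []
        with self.path.open(encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]

    def remember(self, run: int, lesson: str) -> None:
        """Append one note; earlier notes are never rewritten."""
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"run": run, "lesson": lesson}) + "\n")

    def as_prompt(self) -> str:
        """Render saved notes as text that could be prepended to a prompt."""
        notes = self.load()
        if not notes:
            return "No notes from earlier runs."
        return "\n".join(f"- (run {n['run']}) {n['lesson']}" for n in notes)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        memory = FileMemory(Path(tmp) / "notes.jsonl")
        memory.remember(1, "Elite fights early cost too much health.")
        print(memory.as_prompt())
```

The design choice worth copying is the append-only file: the model's own notes survive a crash or a context reset, which is what lets a long run build on earlier attempts.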

On the agent-orchestration side, Anthropic's documentation says Fable 5 is significantly more dependable at dispatching and sustaining parallel subagents and at managing communication with long-running ones. One early customer reported that it "delivers more capable engineering in fewer turns" while handling the complex multi-agent Claude Code workflows their employees run daily.
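For readers who have not built a fan-out before, this is roughly the shape of dispatching parallel subagents, sketched with Python's standard `asyncio`. It is not Claude Code's implementation: `run_subagent` is a hypothetical stand-in that you would replace with your own client call, and the `max_parallel` cap is an arbitrary illustrative default.

```python
import asyncio


async def run_subagent(task: str) -> str:
    """Hypothetical stand-in for a real subagent call."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return f"done: {task}"


async def fan_out(tasks: list[str], max_parallel: int = 4) -> list[str]:
    """Run one subagent per task, at most max_parallel at a time.

    Results come back in the same order as the input tasks.
    """
    gate = asyncio.Semaphore(max_parallel)

    async def guarded(task: str) -> str:
        async with gate:
            return await run_subagent(task)

    return await asyncio.gather(*(guarded(t) for t in tasks))


if __name__ == "__main__":
    results = asyncio.run(fan_out(["migrate billing", "migrate auth"]))
    print(results)
```

The semaphore is the part that matters in practice: it bounds how many subagents run at once, which keeps both rate limits and the token bill predictable.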

What this means for you: this is the model for work you kick off and walk away from. If you run agents overnight, fan out subagents across a large job, or build autonomous pipelines, the self-validation is the load-bearing feature. It is also why people are reaching for it on jobs Opus 4.8 could not finish unsupervised.

Science, via the Same Underlying Model

The most dramatic results came from Mythos 5, which is the same underlying model as Fable 5 with the safety classifiers lifted. Worth reading with one caveat: public Fable 5 falls back to Opus 4.8 on most biology and chemistry queries, so you cannot necessarily reproduce these on the public model. They show what the model class is capable of, not what an open API call will do today.

With that flagged, the numbers are notable. Anthropic's internal protein-design experts reported accelerating parts of the drug-design process by around ten times. Running with protein-design and bioinformatics tools but no human assistance, the model matched or beat skilled human operators, choosing binding sites, selecting and running tools, and recovering from its own failures. Nine of the 14 protein targets in the study yielded strong drug-design candidates.

In molecular biology, Anthropic's scientists preferred the model's hypotheses about 80% of the time over Opus-class models in blinded comparisons, and one hypothesis, a novel mechanism for an E. coli protein, was independently corroborated by another lab working on the same problem. In genomics, the model ran over a week of largely autonomous work, assembled single-cell data across 138 animal species, and trained a custom model that outperformed a recent Science-published model while being 100 times smaller.

What this means for you: unless you are in a trusted-access research program, treat these as a ceiling demo rather than a daily capability. The signal for builders is the shape of it: a model that can run for a week, recover from its own dead ends, and produce a result worth publishing is the same engine doing your migrations.

The Catch: Cost, Guardrails, and a Closing Window

Fable 5 is the most capable model Anthropic has released to the public, and the trade-offs are honest ones.

It is expensive. Pricing is $10 per million input tokens and $50 per million output tokens, which is double Opus 4.8 and half of what the pricier Mythos Preview used to cost. Simon Willison burned $110 of tokens in a single day of testing. The model is also slow, which is the flip side of what he called "something of a beast." The token-efficiency gains some customers reported can soften the bill, but you should measure on your own workloads before committing.
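A quick way to measure is to plug your own token counts into the published rates. The sketch below uses the Fable 5 prices quoted above; the Opus 4.8 row is an assumption derived from "double Opus 4.8", not a quoted price, and the workload numbers are made up for illustration.

```python
# Prices in USD per million tokens. The Fable 5 figures come from the article;
# the Opus 4.8 figures are an ASSUMPTION derived from "double Opus 4.8".
PRICES = {
    "claude-fable-5": {"input": 10.00, "output": 50.00},
    "opus-4.8 (assumed)": {"input": 5.00, "output": 25.00},
}


def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one run for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 400k output tokens.
    for model in PRICES:
        print(f"{model}: ${run_cost(model, 2_000_000, 400_000):.2f}")
```

Swap in token counts from your own logs. The comparison only holds if both models use the same number of tokens for the job, which is exactly what the efficiency claims say may not be the case.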

There are guardrails. When Fable's classifiers detect a query about cybersecurity, biology and chemistry, or model distillation, the response is handled by Opus 4.8 instead and you are told. Anthropic's early data shows this happens in fewer than 5% of sessions, so for the vast majority of work you get Fable's full capability. But the fallbacks are tuned conservatively and will occasionally catch harmless requests.

There is also a clock. From launch through June 22, 2026, Fable 5 is included on Pro, Max, Team, and seat-based Enterprise plans at no extra cost. On June 23 it leaves those plans and requires usage credits, with Anthropic aiming to restore it to standard subscriptions once capacity allows. If you want to test it on your own work without a separate bill, that window is the time.
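If you are scheduling a trial, the window reduces to a single date comparison. This sketch hard-codes the June 22, 2026 cutoff stated above. The helper names are ours, and the launch date is left out because the article does not give it.

```python
from datetime import date

# Last day Fable 5 is included on subscription plans, per the article.
INCLUDED_THROUGH = date(2026, 6, 22)


def included_in_plan(day: date) -> bool:
    """True if the given day falls on or before the last included day."""
    return day <= INCLUDED_THROUGH


def days_left(today: date) -> int:
    """Days remaining in the window, counting today; 0 once it has closed."""
    return max((INCLUDED_THROUGH - today).days + 1, 0)


if __name__ == "__main__":
    print(included_in_plan(date(2026, 6, 23)))
    print(days_left(date(2026, 6, 10)))
```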

Frequently Asked Questions

What has Claude Fable 5 actually been used for?

Real early-access work, mostly large coding jobs and analysis. Stripe ran a codebase-wide migration in a 50-million-line Ruby codebase in one day. Hex broke 90% on its analytics benchmark. Hebbia and IMC topped their finance and trading evals. Anthropic also showed it rebuilding a web app's source from screenshots and playing Pokemon FireRed from raw pixels. Most accounts come from Anthropic's launch announcement, so they are first-party.

Is Claude Fable 5 good at coding?

The early evidence says yes, especially for big, long-running jobs. Cursor called it state of the art on CursorBench, Cognition ranked it highest on their FrontierBench coding eval, and GitHub reported autonomy and reliability beyond previous benchmarks. Independent tester Simon Willison reported getting what felt like several days' worth of work done in about five and a half hours with it. For small daily edits, a cheaper model like Sonnet is usually the better call.

How much does Claude Fable 5 cost?

It is $10 per million input tokens and $50 per million output tokens, double the price of Opus 4.8. The model ID is claude-fable-5. It is included free on Pro, Max, Team, and seat-based Enterprise plans through June 22, 2026, after which it requires usage credits until capacity allows a return to standard plans.

Why does Claude Fable 5 sometimes answer like a different model?

Fable 5 ships with safety classifiers. When a query touches cybersecurity, biology and chemistry, or attempts to distill the model, the response is handled by Opus 4.8 instead and you are notified. Anthropic says this fallback triggers in fewer than 5% of sessions, so most work runs on Fable 5 at full capability.

Can Claude Fable 5 do the science demos Anthropic showed?

Not directly on the public model in most cases. The protein design, genomics, and molecular biology results were produced by Mythos 5, the same underlying model with safeguards lifted, available only through trusted-access programs. Public Fable 5 falls back to Opus 4.8 on most biology and chemistry queries. Treat those results as a ceiling for the model class, not a daily public capability.

Is Claude Fable 5 worth it over Opus 4.8?

For long-horizon, autonomous, or high-stakes work, the early reports point to a clear step up. Customers consistently described it solving problems that were out of reach for earlier models, and it beats Opus 4.8 on benchmarks like the spreadsheet suite at every effort level. The trade-offs are real: double the price and slower runs. For routine work, Opus 4.8 or Sonnet remains the more economical choice.

Sources

  • Claude Fable 5 and Claude Mythos 5 (Anthropic)
  • Anthropic's Claude Fable 5 is a version of Mythos the public can access today (TechCrunch)
  • Anthropic releases Fable 5, the first public Mythos-class model (NBC News)
  • Anthropic is releasing a public version of its Mythos AI model as Claude Fable 5 (Quartz)
  • Initial impressions of Claude Fable 5 (Simon Willison)
