Claude Code Ultra Review
A fleet of cloud agents fans out across your PR diff, independently verifies every finding, and surfaces only real bugs. What /ultrareview does, when to use it, and what it costs.
Problem: Code review is a numbers game with bad odds. Human reviewers catch obvious mistakes, but large diffs wear attention down fast. A 1,500-line refactor touching auth, encryption, and three database tables? The subtle type mismatch on line 847 stays quiet. It ships.
/review helps. A single-pass scan in your local session catches a lot. But one agent, one pass, no verification step. Findings are good. Confidence is lower.
/ultrareview runs differently. It sends your diff to a cloud sandbox, spins up a fleet of agents, and runs each finding through an independent verification pass before anything surfaces. Only confirmed bugs come back.
What /ultrareview Actually Does
The command ships in Claude Code v2.1.111, launched April 16, 2026, as a research preview. Two invocation modes:
Branch mode reviews the diff between your current branch and the default branch, including anything uncommitted or staged:
```
/ultrareview
```

PR mode takes a GitHub PR number. The remote sandbox clones directly from GitHub instead of bundling your local working tree:

```
/ultrareview 1234
```

Use PR mode when your repo is too large to bundle. Push the branch, open a draft PR, then pass the number.
The Four-Stage Pipeline
Every review runs through the same four stages in the cloud sandbox:
Setup: Anthropic provisions remote infrastructure and spins up a fleet of sub-agents. Default fleet is 5 agents. This takes roughly 90 seconds.
Find: Agents explore different execution paths through the changed code in parallel. Each one hunts independently, so race conditions, logic errors, and cross-module type mismatches get pressure from multiple angles at once.
Verify: A separate set of agents tries to reproduce each candidate finding. A bug that only one agent flagged gets challenged by another. If it can't be confirmed independently, it doesn't surface.
Dedup: Duplicate findings from different agents get merged into single ranked reports.
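The Find, Verify, and Dedup stages can be sketched as a small pipeline. Everything below is illustrative: the finding structure, the verifier interface, and the merge logic are assumptions, not Anthropic's implementation.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    description: str

def run_pipeline(candidates, verifier):
    """Illustrative Find -> Verify -> Dedup flow.

    `candidates` maps agent id -> list of Finding objects from the Find
    stage. `verifier` independently tries to reproduce a finding and
    returns True only on confirmation.
    """
    # Verify: a finding survives only if an independent pass confirms it.
    confirmed = []
    for agent_id, findings in candidates.items():
        for f in findings:
            if verifier(f):
                confirmed.append(f)

    # Dedup: merge identical findings from multiple agents, ranking by
    # how many agents flagged them.
    counts = defaultdict(int)
    for f in confirmed:
        counts[f] += 1
    return sorted(counts, key=lambda f: counts[f], reverse=True)

# Toy run: two agents flag the same bug, one also flags a false positive.
shared = Finding("auth.py", 847, "type mismatch in token check")
noise = Finding("style.py", 3, "unconfirmed hunch")
report = run_pipeline(
    {"agent-1": [shared], "agent-2": [shared, noise]},
    verifier=lambda f: f is not noise,
)
```

The unconfirmed finding never reaches the report, which is the whole point of the Verify stage: one agent's hunch is not enough.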
The confirmation dialog before launch shows the scope: file count, line count, remaining free runs, and estimated cost. The review only starts after you confirm.
Reviews run as background tasks. Your terminal stays free. Track progress with /tasks. Closing the terminal is fine. Stopping a review mid-run archives the cloud session and returns zero partial findings.
How It Compares to /review
| | /review | /ultrareview |
|---|---|---|
| Runs | locally, in-session | remote cloud sandbox |
| Depth | single-pass | multi-agent fleet plus independent verification |
| Duration | seconds to a few minutes | 5 to 10 minutes (up to 20 on large PRs) |
| Cost | counts toward normal usage | free runs, then $5 to $20/review as extra usage |
| Best for | fast feedback while iterating | pre-merge confidence on substantial changes |
The defining difference is the verification stage. /review is one pass. /ultrareview surfaces only findings that survived a second agent trying to reproduce them. That is where the under-1% false positive rate comes from.
Results Worth Knowing
Anthropic ran this on their own PRs before launch. The numbers from internal testing (via claudefa.st):
| Metric | Result |
|---|---|
| Large PRs (1,000+ lines) with findings | 84%, averaging 7.5 issues per review |
| Small PRs (under 50 lines) with findings | 31%, averaging 0.5 issues |
| Findings marked incorrect by engineers | Under 1% |
| PRs with substantive review comments (before vs after) | 16% to 54% |
Two examples from real use. A one-line authentication change at Anthropic would have silently broken login flows. /ultrareview flagged it as critical before merge. In a TrueNAS ZFS encryption refactor, it surfaced a type mismatch that was wiping the encryption key cache on every sync. The kind of bug that lives in production for months before someone traces the intermittent failures back to the right commit.
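The actual TrueNAS code is not public, so here is a generic Python illustration of that bug class: entries stored under one key type and looked up under another never hit, so the key is silently re-derived on every call. Names and structure are invented for the sketch.

```python
key_cache: dict = {}

def cache_key_bytes(dataset_id: str) -> bytes:
    return dataset_id.encode()

def get_encryption_key(dataset_id: str, derive):
    # BUG: lookup uses the str id, but entries are stored under bytes,
    # so the membership test misses on every call and the key is
    # re-derived each time. The cache is effectively useless.
    if dataset_id not in key_cache:                       # str lookup
        key_cache[cache_key_bytes(dataset_id)] = derive(dataset_id)
    return key_cache[cache_key_bytes(dataset_id)]

calls = []
get_encryption_key("pool/enc", lambda d: calls.append(d) or "key")
get_encryption_key("pool/enc", lambda d: calls.append(d) or "key")
# derive ran twice: the str/bytes mismatch means the cache never hits
```

A type checker can miss this when the dict is untyped, and a human reviewer skimming 1,500 lines almost certainly will. Nothing crashes; the system just does redundant work on every sync.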
A practitioner test on an 11,000-line voice calling PR (via mejba.me): 64 candidate bugs in the Find stage, a smaller confirmed set after Verify, 17 minutes total. Race conditions and state management issues across module boundaries. Things a single-pass review misses because no one agent is simultaneously holding the full picture.
Pricing
| Plan | Free runs | After free runs |
|---|---|---|
| Pro | 3 (expire May 5, 2026) | billed as extra usage |
| Max | 3 (expire May 5, 2026) | billed as extra usage |
| Team | none | billed as extra usage |
| Enterprise | none | billed as extra usage |
Each review costs approximately $5 to $20 depending on diff size. Free runs are a one-time allotment. They expire May 5, 2026 whether used or not. No renewal, no carry-over.
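For budgeting, the published $5-to-$20 range can be turned into a rough planner. The linear scaling with diff size below is an assumption; the actual pricing formula is not documented.

```python
def estimate_review_cost(diff_lines: int,
                         lo: float = 5.0, hi: float = 20.0,
                         full_at: int = 10_000) -> float:
    """Rough budgeting helper. ASSUMES cost scales linearly with diff
    size between the published $5 and $20 bounds, capping at `full_at`
    lines; the real formula is not public."""
    frac = min(diff_lines / full_at, 1.0)
    return round(lo + (hi - lo) * frac, 2)

def reviews_per_budget(budget: float, diff_lines: int) -> int:
    """How many reviews of a given size a monthly budget covers."""
    return int(budget // estimate_review_cost(diff_lines))

cost = estimate_review_cost(1500)        # 7.25 under this assumption
runs = reviews_per_budget(100, 1500)     # 13 reviews on a $100 budget
```

Even as a guess, this makes the trade-off concrete: a $100 extra-usage budget covers roughly a dozen mid-sized deep reviews a month.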
Before you run it: confirm extra usage is enabled on your account. Run /extra-usage to check. If extra usage is off, the feature blocks at launch. You cannot enable it from the confirmation dialog.
Platform Requirements
Required:
- Claude Code v2.1.111 or later
- Claude.ai account authentication (run /login first if you're using an API key only)
- GitHub remote (github.com) on the repo for PR mode
Not available on:
- Amazon Bedrock
- Google Cloud Vertex AI
- Microsoft Foundry
- Organizations with Zero Data Retention enabled
These are architectural exclusions. /ultrareview requires Claude.ai account auth and Anthropic's web infrastructure. No workaround exists for teams on managed cloud providers.
When to Use It
/ultrareview is pre-merge review for changes where confidence matters. It is not a full codebase audit.
Good fit:
- Large PRs over ~500 lines touching auth, payments, or infrastructure
- Security-sensitive changes where multi-agent verification matters
- Complex refactors across multiple modules (race conditions, cross-boundary type mismatches)
- Reviewing contributor or external PRs by passing a PR number
Not a good fit:
- Rapid iteration on a feature branch (a 5-to-20-minute review is the wrong tool for fast loops; use /review)
- Full codebase audits (scope is always the diff vs. your default branch)
- Trivial changes under 50 lines
- CI/CD pipelines (requires interactive session and Claude.ai auth)
- Bedrock, Vertex, Foundry, or ZDR environments
The Most Common Mistake
This is the post-launch confusion that produced most of the "it found nothing useful" complaints:
/ultrareview reviews the diff between your branch and the default branch. It does not scan your full existing codebase.
A finished, fully committed codebase with no recent changes has almost no diff. Pointing the command at it produces a near-empty result. That is working as designed. The tool is a pre-merge reviewer, not an auditor.
If you want to review your entire codebase, /ultrareview is the wrong tool.
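Before burning a free run, you can check how much diff actually exists. This sketch summarizes the output of `git diff origin/main...HEAD --numstat` (the branch name is an assumption; substitute your default branch):

```python
def diff_stats(numstat_output: str):
    """Summarize `git diff --numstat` output as (files, lines changed).

    Each row is 'added<TAB>deleted<TAB>path'; binary files report '-'
    for both counts and are tallied as files only.
    """
    files, lines = 0, 0
    for row in numstat_output.strip().splitlines():
        added, deleted, _path = row.split("\t", 2)
        files += 1
        if added != "-":
            lines += int(added) + int(deleted)
    return files, lines

sample = "120\t30\tsrc/auth.py\n5\t2\tREADME.md\n-\t-\tlogo.png\n"
files, lines = diff_stats(sample)
# 3 files, 157 changed lines: enough scope for a deep review to bite
```

If the numbers come back near zero, you are about to pay for a review of nothing, exactly the "it found nothing useful" scenario above.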
Tiered Review Strategy
A practical pattern from the community (r/ClaudeCode):
| PR type | Tool | Why |
|---|---|---|
| Every PR | /review (under 5 minutes) | Always-on smoke check |
| Large or critical PRs (500+ lines, auth/payments/infra) | /ultrareview (10 to 20 minutes) | Pre-merge deep inspection |
| Infrastructure changes (DB migrations, security rewrites) | /ultrareview | Highest confidence when stakes are highest |
Think of /review as the smoke detector. Always running. Fast. /ultrareview is the inspection you call before signing off on a structural change.
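The tiered strategy reduces to a simple routing rule. The 500-line threshold comes from the table above; the path prefixes are illustrative placeholders for whatever counts as sensitive in your repo.

```python
# Hypothetical sensitive-path prefixes; adjust to your repo layout.
SENSITIVE_PREFIXES = ("auth/", "payments/", "migrations/", "infra/")

def pick_review_tool(changed_paths, lines_changed):
    """Route a PR to /review or /ultrareview per the tiered strategy:
    500+ lines or any security/infra path escalates to the deep
    review; everything else gets the fast single-pass check."""
    sensitive = any(p.startswith(SENSITIVE_PREFIXES) for p in changed_paths)
    if lines_changed >= 500 or sensitive:
        return "/ultrareview"
    return "/review"

# Small docs tweak -> fast pass; auth change of any size -> deep review
assert pick_review_tool(["docs/intro.md"], 12) == "/review"
assert pick_review_tool(["auth/session.py"], 40) == "/ultrareview"
```

Note the asymmetry: line count alone can escalate, but a sensitive path escalates regardless of size, matching the one-line auth example earlier.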
Practical Tips
Run /extra-usage before your free runs expire. If billing is not configured, the feature blocks at launch with no work done.
Commit or stash before running. Branch mode bundles your working tree at the moment you confirm. Changes made after launch don't get included.
When findings appear, fix them from the notification in the same session. Letting them sit wastes the context the review just built.
For repos that can't bundle: push the branch, open a draft PR, then run /ultrareview <PR-number>. The sandbox clones from GitHub directly.
Use /tasks to track reviews running in the background. You can close the terminal. Come back to results.
What This Signals
/ultrareview is the second command in a pattern. /ultraplan (v2.1.92) moved heavy planning compute to the cloud. /ultrareview (v2.1.111) does the same for code review. Routines and remote triggers follow the same logic.
Each "ultra" prefix command offloads a heavy compute task from your local session to Anthropic-hosted infrastructure. The capability (5 to 20 parallel agents with independent verification) is only achievable in the cloud. No laptop runs 20 agents in a sandbox simultaneously.
The cost structure reflects this. Every ultra command bills separately as extra usage, outside your plan's included compute. The subscription is access. The cloud compute is metered on top.
Which model powers the agents is not publicly confirmed. Community speculation points at Opus-class agents for logic and bug hunting, with Sonnet-class agents handling style checks.
Bug-free merges take longer to produce. They take less time to debug in production. /ultrareview is where those two facts meet.
Frequently Asked Questions
What is /ultrareview in Claude Code?
/ultrareview is a cloud-based code review command in Claude Code. It launches a fleet of agents in a remote sandbox that fan out across your branch diff, hunt for bugs in parallel, and independently verify each finding before surfacing it. Only confirmed bugs come back. It shipped in Claude Code v2.1.111 on April 16, 2026.
How much does /ultrareview cost?
Each review costs approximately $5 to $20 depending on diff size, billed as extra usage outside your plan. Pro and Max subscribers get 3 free runs expiring May 5, 2026. Team and Enterprise have no free runs. Reviews always bill separately from your plan's included compute.
Is /ultrareview free?
Pro and Max subscribers get 3 free runs (expiring May 5, 2026, whether used or not). After those run out, every review bills as extra usage at $5 to $20 per review. Team and Enterprise plans have no free tier.
What is the difference between /review and /ultrareview?
/review runs a single-pass scan locally in your Claude Code session. /ultrareview sends your diff to a cloud sandbox, runs 5 to 20 agents in parallel, and then routes every candidate finding through an independent verification agent before surfacing it. /review takes seconds. /ultrareview takes 5 to 20 minutes. /review counts toward normal plan usage. /ultrareview bills as extra usage.
How long does /ultrareview take?
Most reviews complete in 5 to 10 minutes. Very large PRs (the mejba.me test on an 11,000-line diff) took 17 minutes. Expect up to 20 minutes on the largest diffs.
How accurate is /ultrareview?
In Anthropic's internal testing, engineers marked fewer than 1% of findings as incorrect. On PRs over 1,000 lines, 84% of reviews return findings averaging 7.5 issues each. The low false positive rate comes from the Verify stage, where separate agents try to independently reproduce each candidate bug before it surfaces.
Why did /ultrareview find nothing useful on my codebase?
/ultrareview only reviews the diff between your current branch and the default branch. It does not scan your full existing codebase. If you ran it on a finished, fully committed codebase with no recent changes, there is almost no diff to review. The tool is a pre-merge reviewer, not a codebase auditor.
Can /ultrareview review my entire codebase?
No. Scope is always the diff between your branch and the default branch. It catches bugs in code you are about to merge, not in code already on main. For a full codebase audit, you need a different approach.
Does /ultrareview work on Amazon Bedrock or Google Vertex AI?
No. /ultrareview requires Claude.ai account authentication and runs on Anthropic's web infrastructure. It is not available on Bedrock, Vertex AI, Microsoft Foundry, or organizations with Zero Data Retention enabled. There is no workaround.
What Claude Code plan do I need for /ultrareview?
Any plan can use /ultrareview as long as extra usage is enabled on the account. Pro and Max subscribers get 3 free runs. Team and Enterprise users pay per review from the start. The feature is not limited to Team or Enterprise (unlike Claude Code Review, the GitHub-integrated product).
How do I enable extra usage for /ultrareview?
Run /extra-usage in Claude Code. If extra usage is not already enabled, that command links you to billing settings. You must enable it before running /ultrareview. The feature blocks at launch if extra usage is off, and you cannot enable it from the confirmation dialog mid-flow.
Can /ultrareview run in CI/CD pipelines?
No. It requires an interactive Claude Code session with Claude.ai account authentication. Automated pipeline runs are not supported.
What happens if I stop /ultrareview mid-run?
The cloud session archives and you get zero partial findings. If a 17-minute review is at the 10-minute mark when you stop it, nothing comes back. Let it finish.
How many agents does /ultrareview use?
The default fleet is 5 agents. Configuration supports up to 20, though whether higher fleet sizes are user-configurable or tier-restricted is not publicly documented.
Can I review someone else's pull request with /ultrareview?
Yes. Pass the GitHub PR number: /ultrareview 1234. The remote sandbox clones the PR directly from GitHub. This works for any GitHub PR you can access, including contributor PRs and open-source repos you maintain.
What does /ultrareview catch that /review misses?
Multi-agent parallel exploration catches bugs that require holding multiple parts of a diff in context at once: race conditions across module boundaries, type mismatches that only matter when two changed files interact, logic errors in control flow spanning multiple functions. A single-pass agent reads the diff sequentially. Five agents explore it from different angles simultaneously, then cross-check each other.
Do I need a GitHub account to use /ultrareview?
For branch mode (bare /ultrareview), no GitHub account is needed. Claude Code bundles your local working tree. For PR mode (/ultrareview 1234), a github.com remote on the repository is required.
Why do Pro and Max subscribers only get 3 free runs?
Anthropic has not given an official explanation. Community speculation (r/claude) points at unusually high backend compute cost, with some theories suggesting the agents run on a frontier model not yet publicly released. The 3-run cap even for Max ($200/month) subscribers is widely noted as unusual. Pricing and availability are explicitly labeled as subject to change since the feature is a research preview.
What version of Claude Code do I need?
Claude Code v2.1.111 or later. Run claude --version to check. The docs list v2.1.86 as the minimum, but the feature was introduced in v2.1.111.
Does /ultrareview work with uncommitted changes?
Yes. Branch mode bundles your full working tree at the moment you confirm, including staged and unstaged changes. Changes made after you confirm the launch are not included in that review.