Build This Now

Claude Code Ultra Review

A fleet of cloud agents fans out across your PR diff, independently verifies every finding, and surfaces only real bugs. What /ultrareview does, when to use it, and what it costs.

Stop configuring. Start building.

SaaS builder templates with AI orchestration.

Published Apr 22, 2026 · 11 min read

Problem: Code review is a numbers game with bad odds. Human reviewers catch obvious mistakes, but large diffs wear attention down fast. A 1,500-line refactor touching auth, encryption, and three database tables? The subtle type mismatch on line 847 stays quiet. It ships.

/review helps. A single-pass scan in your local session catches a lot. But one agent, one pass, no verification step. Findings are good. Confidence is lower.

/ultrareview runs differently. It sends your diff to a cloud sandbox, spins up a fleet of agents, and runs each finding through an independent verification pass before anything surfaces. Only confirmed bugs come back.

What /ultrareview Actually Does

The command ships in Claude Code v2.1.111, launched April 16, 2026, as a research preview. Two invocation modes:

Branch mode reviews the diff between your current branch and the default branch, including anything uncommitted or staged:

/ultrareview

PR mode takes a GitHub PR number. The remote sandbox clones directly from GitHub instead of bundling your local working tree:

/ultrareview 1234

Use PR mode when your repo is too large to bundle. Push the branch, open a draft PR, then pass the number.
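That flow can be sketched with the GitHub CLI. This is a sketch, not official tooling: it assumes `gh` is installed and authenticated, and the `open_draft_pr` helper name, title, and body are illustrative.

```shell
# open_draft_pr: push the current branch and open a draft PR whose
# number can be passed to /ultrareview. Assumes the gh CLI is
# installed and authenticated; title/body are placeholders.
open_draft_pr() {
  local branch
  branch=$(git rev-parse --abbrev-ref HEAD)    # current branch name
  git push -u origin "$branch"
  # --draft opens the PR without requesting review from teammates.
  gh pr create --draft --title "WIP: $branch" --body "For /ultrareview"
  gh pr view --json number --jq .number        # print the PR number
}

# Usage: open_draft_pr, then inside Claude Code: /ultrareview <number>
```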

The Four-Stage Pipeline

Every review runs through the same four stages in the cloud sandbox:

Setup: Anthropic provisions remote infrastructure and spins up a fleet of sub-agents. Default fleet is 5 agents. This takes roughly 90 seconds.

Find: Agents explore different execution paths through the changed code in parallel. Each one hunts independently, so race conditions, logic errors, and cross-module type mismatches get pressure from multiple angles at once.

Verify: A separate set of agents tries to reproduce each candidate finding. A bug that only one agent flagged gets challenged by another. If it can't be confirmed independently, it doesn't surface.

Dedup: Duplicate findings from different agents get merged into a single ranked report.

The confirmation dialog before launch shows the scope: file count, line count, remaining free runs, and estimated cost. The review only starts after you confirm.

Reviews run as background tasks. Your terminal stays free. Track progress with /tasks. Closing the terminal is fine. Stopping a review mid-run archives the cloud session and returns zero partial findings.

How It Compares to /review

|          | /review                      | /ultrareview                                      |
| -------- | ---------------------------- | ------------------------------------------------- |
| Runs     | locally, in-session          | remote cloud sandbox                              |
| Depth    | single-pass                  | multi-agent fleet plus independent verification   |
| Duration | seconds to a few minutes     | 5 to 10 minutes (up to 20 on large PRs)           |
| Cost     | counts toward normal usage   | free runs, then $5 to $20/review as extra usage   |
| Best for | fast feedback while iterating | pre-merge confidence on substantial changes      |

The defining difference is the verification stage. /review is one pass. /ultrareview surfaces only findings that survived a second agent trying to reproduce them. That is where the under-1% false positive rate comes from.

Results Worth Knowing

Anthropic ran this on their own PRs before launch. The numbers from internal testing (via claudefa.st):

| Metric                                                | Result                                |
| ----------------------------------------------------- | ------------------------------------- |
| Large PRs (1,000+ lines) with findings                | 84%, averaging 7.5 issues per review  |
| Small PRs (under 50 lines) with findings              | 31%, averaging 0.5 issues             |
| Findings marked incorrect by engineers                | Under 1%                              |
| PRs with substantive review comments (before vs after) | 16% to 54%                           |

Two examples from real use. A one-line authentication change at Anthropic would have silently broken login flows. /ultrareview flagged it as critical before merge. In a TrueNAS ZFS encryption refactor, it surfaced a type mismatch that was wiping the encryption key cache on every sync. The kind of bug that lives in production for months before someone traces the intermittent failures back to the right commit.

A practitioner test on an 11,000-line voice calling PR (via mejba.me): 64 candidate bugs in the Find stage, a smaller confirmed set after Verify, 17 minutes total. Race conditions and state management issues across module boundaries. Things a single-pass review misses because no one agent is simultaneously holding the full picture.

Pricing

| Plan       | Free runs                 | After free runs       |
| ---------- | ------------------------- | --------------------- |
| Pro        | 3 (expire May 5, 2026)    | billed as extra usage |
| Max        | 3 (expire May 5, 2026)    | billed as extra usage |
| Team       | none                      | billed as extra usage |
| Enterprise | none                      | billed as extra usage |

Each review costs approximately $5 to $20 depending on diff size. Free runs are a one-time allotment. They expire May 5, 2026 whether used or not. No renewal, no carry-over.

Before you run it: confirm extra usage is enabled on your account. Run /extra-usage to check. If extra usage is off, the feature blocks at launch. You cannot enable it from the confirmation dialog.

Platform Requirements

Required:

  • Claude Code v2.1.111 or later
  • Claude.ai account authentication (run /login first if you're using an API key only)
  • GitHub remote (github.com) on the repo for PR mode

Not available on:

  • Amazon Bedrock
  • Google Cloud Vertex AI
  • Microsoft Foundry
  • Organizations with Zero Data Retention enabled

These are architectural exclusions. /ultrareview requires Claude.ai account auth and Anthropic's web infrastructure. No workaround exists for teams on managed cloud providers.
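The two local requirements can be checked up front. A sketch assuming `claude --version` prints a dotted version string; the `version_ok` helper is illustrative.

```shell
# version_ok A B: true when version A >= version B under sort -V ordering.
version_ok() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]
}

# Pull the first dotted version out of `claude --version` output.
installed=$(claude --version 2>/dev/null | grep -oE '[0-9]+(\.[0-9]+)+' | head -n1)
version_ok "$installed" 2.1.111 \
  && echo "Claude Code version OK" \
  || echo "upgrade Claude Code to v2.1.111 or later"

# PR mode additionally needs a github.com remote on the repo.
git remote -v 2>/dev/null | grep -q github.com \
  && echo "github.com remote found (PR mode available)" \
  || echo "no github.com remote: branch mode only"
```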

When to Use It

/ultrareview is pre-merge review for changes where confidence matters. It is not a full codebase audit.

Good fit:

  • Large PRs over ~500 lines touching auth, payments, or infrastructure
  • Security-sensitive changes where multi-agent verification matters
  • Complex refactors across multiple modules (race conditions, cross-boundary type mismatches)
  • Reviewing contributor or external PRs by passing a PR number

Not a good fit:

  • Rapid iteration on a feature branch (a 5-to-20-minute turnaround is the wrong fit for fast loops; use /review)
  • Full codebase audits (scope is always the diff vs. your default branch)
  • Trivial changes under 50 lines
  • CI/CD pipelines (requires interactive session and Claude.ai auth)
  • Bedrock, Vertex, Foundry, or ZDR environments

The Most Common Mistake

This is the post-launch confusion that produced most of the "it found nothing useful" complaints:

/ultrareview reviews the diff between your branch and the default branch. It does not scan your full existing codebase.

A finished, fully committed codebase with no recent changes has almost no diff. Pointing the command at it produces a near-empty result. That is working as designed. The tool is a pre-merge reviewer, not an auditor.

If you want to review your entire codebase, /ultrareview is the wrong tool.
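Before spending a run, you can preview what branch mode will actually see. A minimal sketch; the `review_scope` helper is illustrative, and you pass your repo's default branch ref yourself.

```shell
# review_scope: print the diff /ultrareview's branch mode would bundle:
# committed changes since the merge-base with the default branch, plus
# uncommitted work. Empty output here means an empty review scope.
review_scope() {
  local default="$1"                  # e.g. main or origin/main
  git diff --stat "$default"...HEAD   # committed changes on this branch
  git diff --stat HEAD                # staged + unstaged changes
}

# Usage: review_scope origin/main
```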

Tiered Review Strategy

A practical pattern from the community (r/ClaudeCode):

| PR type                                                  | Tool                             | Why                                      |
| -------------------------------------------------------- | -------------------------------- | ---------------------------------------- |
| Every PR                                                 | /review (under 5 minutes)        | Always-on smoke check                    |
| Large or critical PRs (500+ lines, auth/payments/infra)  | /ultrareview (10 to 20 minutes)  | Pre-merge deep inspection                |
| Infrastructure changes (DB migrations, security rewrites) | /ultrareview                    | Highest confidence when stakes are highest |

Think of /review as the smoke detector. Always running. Fast. /ultrareview is the inspection you call before signing off on a structural change.
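The tiered strategy reduces to a rule of thumb. A sketch with thresholds taken from the figures above; the `pick_review_tool` helper is illustrative.

```shell
# pick_review_tool: suggest a review tier from a changed-line count.
# The 50 and 500 cutoffs mirror the table above; adjust to taste.
pick_review_tool() {
  local lines="$1"
  if [ "$lines" -lt 50 ]; then
    echo "skip: trivial diff, below the useful range"
  elif [ "$lines" -lt 500 ]; then
    echo "/review"
  else
    echo "/ultrareview"
  fi
}

# Rough line count for the current branch vs. the default branch:
#   pick_review_tool "$(git diff origin/main...HEAD | grep -c '^[+-]')"
```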

Practical Tips

Run /extra-usage before your free runs expire. If billing is not configured, the feature blocks at launch with no work done.

Commit or stash before running. Branch mode bundles your working tree at the moment you confirm. Changes made after launch don't get included.
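A quick pre-launch check makes that snapshot deterministic. A sketch; the `dirty_check` name is illustrative.

```shell
# dirty_check: report whether the working tree has uncommitted or
# untracked changes. Branch mode snapshots the tree the moment you
# confirm, so a known-clean (or deliberately staged) state is safest.
dirty_check() {
  if [ -n "$(git status --porcelain)" ]; then
    echo "working tree has pending changes: commit or stash first"
    return 1
  fi
  echo "clean: the snapshot will match your last commit"
}
```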

When findings appear, fix them from the notification in the same session. Letting them sit wastes the context.

For repos that can't bundle: push the branch, open a draft PR, then run /ultrareview <PR-number>. The sandbox clones from GitHub directly.

Use /tasks to track reviews running in the background. You can close the terminal. Come back to results.

What This Signals

/ultrareview is the second command in a pattern. /ultraplan (v2.1.92) moved heavy planning compute to the cloud. /ultrareview (v2.1.111) does the same for code review. Routines and remote triggers follow the same logic.

Each "ultra" prefix command offloads a heavy compute task from your local session to Anthropic-hosted infrastructure. The capability (5 to 20 parallel agents with independent verification) is only achievable in the cloud. No laptop runs 20 agents in a sandbox simultaneously.

The cost structure reflects this. Every ultra command bills separately as extra usage, outside your plan's included compute. The subscription is access. The cloud compute is metered on top.

Which model powers the agents is not publicly confirmed; community speculation points at Opus-class agents for logic and bug hunting, with Sonnet-class handling style violations.

Bug-free merges take longer to produce. They take less time to debug in production. /ultrareview is where those two facts meet.

Frequently Asked Questions

What is /ultrareview in Claude Code?

/ultrareview is a cloud-based code review command in Claude Code. It launches a fleet of agents in a remote sandbox that fan out across your branch diff, hunt for bugs in parallel, and independently verify each finding before surfacing it. Only confirmed bugs come back. It shipped in Claude Code v2.1.111 on April 16, 2026.

How much does /ultrareview cost?

Each review costs approximately $5 to $20 depending on diff size, billed as extra usage outside your plan. Pro and Max subscribers get 3 free runs expiring May 5, 2026. Team and Enterprise have no free runs. Reviews always bill separately from your plan's included compute.

Is /ultrareview free?

Pro and Max subscribers get 3 free runs (expiring May 5, 2026, whether used or not). After those run out, every review bills as extra usage at $5 to $20 per review. Team and Enterprise plans have no free tier.

What is the difference between /review and /ultrareview?

/review runs a single-pass scan locally in your Claude Code session. /ultrareview sends your diff to a cloud sandbox, runs 5 to 20 agents in parallel, and then routes every candidate finding through an independent verification agent before surfacing it. /review takes seconds. /ultrareview takes 5 to 20 minutes. /review counts toward normal plan usage. /ultrareview bills as extra usage.

How long does /ultrareview take?

Most reviews complete in 5 to 10 minutes. Very large PRs (the mejba.me test on an 11,000-line diff) took 17 minutes. Expect up to 20 minutes on the largest diffs.

How accurate is /ultrareview?

In Anthropic's internal testing, engineers marked fewer than 1% of findings as incorrect. On PRs over 1,000 lines, 84% of reviews return findings averaging 7.5 issues each. The low false positive rate comes from the Verify stage, where separate agents try to independently reproduce each candidate bug before it surfaces.

Why did /ultrareview find nothing useful on my codebase?

/ultrareview only reviews the diff between your current branch and the default branch. It does not scan your full existing codebase. If you ran it on a finished, fully committed codebase with no recent changes, there is almost no diff to review. The tool is a pre-merge reviewer, not a codebase auditor.

Can /ultrareview review my entire codebase?

No. Scope is always the diff between your branch and the default branch. It catches bugs in code you are about to merge, not in code already on main. For a full codebase audit, you need a different approach.

Does /ultrareview work on Amazon Bedrock or Google Vertex AI?

No. /ultrareview requires Claude.ai account authentication and runs on Anthropic's web infrastructure. It is not available on Bedrock, Vertex AI, Microsoft Foundry, or organizations with Zero Data Retention enabled. There is no workaround.

What Claude Code plan do I need for /ultrareview?

Any plan can use /ultrareview as long as extra usage is enabled on the account. Pro and Max subscribers get 3 free runs. Team and Enterprise users pay per review from the start. The feature is not limited to Team or Enterprise (unlike Claude Code Review, the GitHub-integrated product).

How do I enable extra usage for /ultrareview?

Run /extra-usage in Claude Code. If extra usage is not already enabled, that command links you to billing settings. You must enable it before running /ultrareview. The feature blocks at launch if extra usage is off, and you cannot enable it from the confirmation dialog mid-flow.

Can /ultrareview run in CI/CD pipelines?

No. It requires an interactive Claude Code session with Claude.ai account authentication. Automated pipeline runs are not supported.

What happens if I stop /ultrareview mid-run?

The cloud session archives and you get zero partial findings. If a 17-minute review is at the 10-minute mark when you stop it, nothing comes back. Let it finish.

How many agents does /ultrareview use?

The default fleet is 5 agents. Configuration supports up to 20, though whether higher fleet sizes are user-configurable or tier-restricted is not publicly documented.

Can I review someone else's pull request with /ultrareview?

Yes. Pass the GitHub PR number: /ultrareview 1234. The remote sandbox clones the PR directly from GitHub. This works for any GitHub PR you can access, including contributor PRs and open-source repos you maintain.

What does /ultrareview catch that /review misses?

Multi-agent parallel exploration catches bugs that require holding multiple parts of a diff in context at once: race conditions across module boundaries, type mismatches that only matter when two changed files interact, logic errors in control flow spanning multiple functions. A single-pass agent reads the diff sequentially. Five agents explore it from different angles simultaneously, then cross-check each other.

Do I need a GitHub account to use /ultrareview?

For branch mode (bare /ultrareview), no GitHub account is needed. Claude Code bundles your local working tree. For PR mode (/ultrareview 1234), a github.com remote on the repository is required.

Why do Pro and Max subscribers only get 3 free runs?

Anthropic has not given an official explanation. Community speculation (r/claude) points at unusually high backend compute cost, with some theories suggesting the agents run on a frontier model not yet publicly released. The 3-run cap even for Max ($200/month) subscribers is widely noted as unusual. Pricing and availability are explicitly labeled as subject to change since the feature is a research preview.

What version of Claude Code do I need?

Claude Code v2.1.111 or later. Run claude --version to check. The docs list v2.1.86 as the minimum, but the feature was introduced in v2.1.111.

Does /ultrareview work with uncommitted changes?

Yes. Branch mode bundles your full working tree at the moment you confirm, including staged and unstaged changes. Changes made after you confirm the launch are not included in that review.

Continue in Workflow

  • Claude Code Best Practices
    Five habits separate engineers who ship with Claude Code: PRDs, modular CLAUDE.md rules, custom slash commands, /clear resets, and a system-evolution mindset.
  • Claude Code Auto Mode
    A second Sonnet model reviews every Claude Code tool call before it fires. What auto mode blocks, what it allows, and the allow rules it drops in your settings.
  • Claude Code Channels
    Plug Claude Code into Telegram, Discord, or iMessage with plugin MCP servers. Setup walkthroughs and the async mobile workflows that make it worth wiring up.
  • Claude Opus 4.7 Best Practices
    Use Claude Opus 4.7 well in Claude Code: first turns, effort settings, adaptive thinking, tool prompting, subagents, session resets, and token control.
  • Claude Code Review
    Parallel Claude agents hunt bugs on every PR, cross-check findings, and post one high-signal comment. What it catches, what it costs, how to enable it.
  • Feedback Loops
    Hand Claude Code one prompt that writes code, runs your test or dev command, reads the output, fixes whatever breaks, and loops until the suite is green.

More from Handbook

  • Agent Fundamentals
    Five ways to build specialist agents in Claude Code: Task sub-agents, .claude/agents YAML, custom slash commands, CLAUDE.md personas, and perspective prompts.
  • Agent Harness Engineering
    The harness is every layer around your AI agent except the model itself. Learn the five control levers, the constraint paradox, and why harness design determines agent performance more than the model does.
  • Agent Patterns
    Orchestrator, fan-out, validation chain, specialist routing, progressive refinement, and watchdog. Six orchestration shapes to wire Claude Code sub-agents with.
  • Agent Teams Best Practices
    Battle-tested patterns for Claude Code Agent Teams. Context-rich spawn prompts, right-sized tasks, file ownership, delegate mode, and v2.1.33-v2.1.45 fixes.
