speedy_devv · koen_salo

The Ralph Wiggum Technique

Stop hooks, completion promises, and verification-first workflows that let Claude Code ship features while you sleep.

Update (Jan 2025): Anthropic shipped native task management with dependencies, blockers, and multi-session coordination through CLAUDE_CODE_TASK_LIST_ID. Many Ralph workarounds are built into the product now. The core principles below still hold. The new system just handles the plumbing natively.

Hand an agent a task list. It grabs one, writes the code, runs the tests, commits. Then it grabs the next. And the next. The whole thing runs while you're asleep.

That's Ralph Wiggum, named after the Simpsons kid. It's the autonomous coding loop that's quietly reshaping how engineers ship software.

What Makes Ralph Different

Most people drive Claude Code like a chat app. Prompt. Wait. Read. Prompt again. Fine for quick jobs. For shipping actual features though, you become the slow part.

Ralph flips that around. Instead of steering every turn, you build a loop that keeps Claude going until the work is done. The trick sits inside Claude Code's stop hooks: they fire the moment the agent tries to wrap up, which means you can catch that attempt and shove the agent back to work.

Here's the core pattern:

  1. Claude works on a task
  2. Claude tries to stop (outputs completion)
  3. A stop hook intercepts and checks: is the work actually done?
  4. If not, feed the prompt back and continue
  5. If yes, let it complete

Step 4 is everything. Your agent doesn't quit the first time it thinks it's finished. It quits once the work has been verified.
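Stripped to its essentials, the outer loop is just re-prompting until the promise shows up in the output. A minimal Python sketch; `run_agent` stands in for one Claude Code turn (in real use, something like a non-interactive CLI call), and the toy agent below is purely illustrative:

```python
def ralph_loop(run_agent, completion_promise="complete", max_iterations=25):
    """Re-prompt until the agent's output contains the completion promise.

    run_agent executes one agent turn and returns its output text.
    """
    for iteration in range(1, max_iterations + 1):
        output = run_agent()
        if completion_promise in output.lower():
            return iteration        # promise seen: a real exit
    return None                     # max iterations hit: surface as failure

# Toy agent for illustration: it "finishes" on its third attempt.
attempts = iter(["tests failing", "still red", "all green, complete"])
print(ralph_loop(lambda: next(attempts)))  # → 3
```

The `max_iterations` cap is the safety valve discussed later: an impossible task exhausts the budget instead of looping forever.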

The Completion Promise

Ralph leans on a "completion promise": a specific word or phrase that means actually done. When Claude is convinced the task is wrapped, it emits that promise (usually just the word "complete").

# In your Ralph loop configuration
completion_promise: "complete"
max_iterations: 25

Every time Claude tries to stop, the hook scans for that promise. Missing? Loop keeps going. Present? Loop ends. Premature exits get blocked, and real exits get through cleanly.

Critical rule: No promise, no stop. That forces the agent to keep going until it genuinely thinks the work is done.

Verification: The Non-Negotiable Core

Boris Cherny, who created Claude Code, has one rule he refuses to break: always give Claude a way to verify its work.

That rule is why Ralph works at all. Skip verification and you end up with a loop that either runs forever or stops far too soon. Add it and the loop actually knows when it's finished.

Three verification approaches pair well with Ralph:

1. Test-Driven Verification

Write the tests first. Claude runs them, watches them fail, writes code, runs them again. The loop keeps looping until everything is green.

Workflow:
1. Run all tests in /tests/feature-x/
2. If tests fail, implement code to make them pass
3. Run tests again
4. Repeat until all tests pass
5. Output "complete" only when test suite is green

This is the most reliable path. Tests don't lie. Pass or fail. Nothing fuzzy.
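Mechanically, "green" just means the test command exits zero, and that exit code is the only signal the loop should trust. A sketch (the command tuple is whatever your project uses; `npm test` appears later in this guide):

```python
import subprocess

def tests_green(command=("npm", "test")) -> bool:
    """True only when the test suite's exit code is 0."""
    return subprocess.run(command, capture_output=True).returncode == 0
```

Keying on exit codes rather than parsing output keeps the check unambiguous: there is no "mostly passing".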

2. Background Agent Verification

Kick off a second agent whose only job is to check the main agent's work. Boris uses this for long runs:

After completing work, use a background agent to:
1. Review all changed files
2. Run the full test suite
3. Check for regressions
4. Report any issues found

You get an independent check. If the background agent spots problems, the main loop goes right back to work.

3. Stop Hook Validation

The stop hook itself can run validation. Check a progress file, run the linter, verify the build. Validation fails? Block the stop and send the agent back in.

// Stop hook pseudocode
if (agent_trying_to_stop) {
  validation_result = run_tests();
  if (validation_result.failed) {
    return { decision: "block", reason: "Tests failing, continue work" };
  }
  return { decision: "allow" };
}
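The same logic in runnable form. This is a sketch, not the exact hook contract: field names like `stop_hook_active` and the `{"decision": "block"}` output shape should be checked against the current Claude Code hooks documentation, and the test-suite wiring is shown only in comments:

```python
import json

def decide(stop_hook_active: bool, tests_passed: bool) -> dict:
    """Decide whether to block a stop attempt.

    A "block" decision sends the agent back to work; an empty dict means
    emit nothing and allow the stop. stop_hook_active guards against
    blocking forever once a previous hook run has already blocked.
    """
    if stop_hook_active:
        return {}
    if not tests_passed:
        return {"decision": "block", "reason": "Tests failing, continue work"}
    return {}

# Real hook wiring would look roughly like this (paths are hypothetical):
#   payload = json.load(sys.stdin)   # Claude Code sends hook input as JSON
#   tests = subprocess.run(["npm", "test"])
#   out = decide(payload.get("stop_hook_active", False), tests.returncode == 0)
#   if out:
#       print(json.dumps(out))

print(json.dumps(decide(False, False)))
```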

The Two-Phase Workflow

First mistake most people make: they plan and implement in the same context window.

Split them apart.

Phase 1: Planning Session

  • Generate specifications through conversation
  • Review and edit by hand
  • Create an implementation plan with explicit file references
  • Keep the spec as a "pin" that prevents invention

Phase 2: Implementation Session

  • Fresh context (clear the previous conversation)
  • Feed only the plan document
  • Run the Ralph loop
  • Let the agent iterate until complete

Why the split? Because context window degradation is real. After enough back-and-forth, Claude starts leaning on stale messages from earlier. A clean start with just the plan means the focus stays tight.

Your plan becomes the anchor. Every iteration of the loop looks back at it. That's what keeps the agent from drifting off into something you didn't ask for.

Practical Implementation: The PRD Approach

Ryan Carson's version looks like this:

  1. Start with a PRD (Product Requirements Document)
  • What are we building?
  • What's in scope?
  • What's explicitly out of scope?
  2. Convert to user stories with acceptance criteria
  • Each story is a small, testable unit
  • Acceptance criteria define "done"
  3. Structure for agent consumption
  • JSON or markdown format
  • Clear checkboxes for progress tracking
  • Links to relevant code locations
  4. Run the loop
  • Agent picks the next uncompleted story
  • Implements it
  • Runs verification (tests)
  • Marks it complete
  • Moves to the next

Here's the payoff: you just walk away. Wake up to finished features, green tests, and commits already in the log.
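A task file in the checkbox style described above might look like this (the feature, story, and file names are hypothetical):

```markdown
## Feature: password reset

- [x] Story 1: request-reset endpoint
  - Acceptance: a registered email returns 200 and a reset token is issued
  - Code: src/auth/reset.ts, tests/auth/request-reset.test.ts
- [ ] Story 2: token validation
  - Acceptance: expired or reused tokens are rejected with 401
  - Code: src/auth/validate.ts, tests/auth/validate-token.test.ts
```

Each iteration picks the first unchecked box, implements against the acceptance criteria, and flips the checkbox only on green tests.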

UI Verification: The Hidden Trap

A gotcha that bites everyone sooner or later: the tests go green, but the UI looks wrong.

Here's why. Ralph can happily confirm the code runs while staying totally blind to visual bugs. The component renders, the tests pass, and the button is still off-screen or the text is cut in half.

Fix it with a screenshot-based verification protocol.

After implementing UI changes:
1. Take screenshots of affected components
2. Rename each with "verified_" prefix after review
3. Do NOT output completion promise yet
4. Let the next iteration confirm all files are verified
5. Only then output "complete"

That forces at least two loop passes for any UI change. Pass one implements and captures screenshots. Pass two confirms every screenshot got reviewed. The visual check can't be skipped.

The key insight: Instruct Claude that renaming the screenshots does NOT earn the completion promise. The next iteration is what signals done. That blocks the premature exits.
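The "every screenshot reviewed" rule is easy to make mechanical. A sketch of the check a later iteration (or the stop hook) could run, assuming all screenshots land in one directory:

```python
from pathlib import Path

def all_screenshots_verified(shots_dir: str) -> bool:
    """True only when the directory has screenshots and every one
    of them carries the reviewed "verified_" prefix."""
    shots = list(Path(shots_dir).glob("*.png"))
    return bool(shots) and all(p.name.startswith("verified_") for p in shots)
```

An empty directory counts as unverified, so a UI change that produced no screenshots at all can't slip through either.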

Economics: Why This Changes Everything

A coding agent running nonstop on Sonnet costs roughly $10.42 per hour (averaged over a 24-hour run).

Less than minimum wage in most places. And you're paying for a machine that can:

  • Clear backlogs overnight
  • Run multiple features in parallel
  • Never get tired or distracted
  • Scale with more compute

So the bottleneck shifts. It stops being "how much am I willing to spend?" and becomes "how much reliable work can I define?"

Teams running reliable loops will pull way ahead of teams that aren't. The gap is already widening.

Common Failures and Fixes

Loop Never Ends

Cause: Impossible task or missing completion criteria.
Fix: Set a max iteration count (e.g., 25). Add explicit completion criteria to your prompt.

Loop Ends Too Early

Cause: Claude outputs the promise before work is done.
Fix: Strengthen your verification. Add tests. Use the screenshot protocol for UI. Make "done" objectively measurable.

Quality Degrades Over Iterations

Cause: Context window filling with failed attempts.
Fix: Implement checkpoint state. Mark completed work in an external file. Let the loop resume cleanly if context fills.
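One way to implement that checkpoint: record finished task ids in a small external file that survives a context reset (the file name here is arbitrary):

```python
import json
from pathlib import Path

CHECKPOINT = Path("ralph_checkpoint.json")  # lives outside the context window

def load_done() -> set:
    """Task ids completed so far; readable by a fresh session."""
    return set(json.loads(CHECKPOINT.read_text())) if CHECKPOINT.exists() else set()

def mark_done(task_id: str) -> None:
    """Record a finished task so a resumed loop skips it."""
    CHECKPOINT.write_text(json.dumps(sorted(load_done() | {task_id})))
```

A fresh session reads the file, skips everything already marked, and picks up where the last one left off.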

Agent Invents Features

Cause: Spec is vague or missing.
Fix: Your spec is the "pin" that prevents invention. Make it specific. Include explicit references to existing code. Tell Claude what NOT to do.

Setting Up Your First Ralph Loop

Keep it small on the first run. Pick a feature you know well, with tests that already exist.

  1. Install the Ralph plugin (or implement the stop hook pattern yourself)

  2. Create your prompt file:

Study the implementation plan in /docs/plan.md
Pick the single most important incomplete task
Implement it following existing patterns
Run tests with: npm test
On pass: mark task complete in plan.md, commit changes
On fail: fix the issue and run tests again
Output "complete" only when all tasks are done and tests pass
  3. Set constraints:
  • Max iterations: 25
  • Completion promise: "complete"
  • Quality gates: tests must pass, linting must pass
  4. Watch the first run. Don't walk away yet. Cancel if behavior looks wrong. Adjust your prompt. Re-run.

  5. Gradually increase autonomy as trust builds.
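If you implement the stop hook yourself, it gets registered in your Claude Code settings file. The shape below follows the hooks configuration format as I understand it; verify the field names against the current documentation, and note the script path is hypothetical:

```json
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": "python3 .claude/hooks/ralph_stop.py"
          }
        ]
      }
    ]
  }
}
```

The referenced script is where your validation lives: run the tests, scan for the completion promise, and block the stop when either check fails.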

The Ralph Philosophy

Ralph is not about cutting humans out of coding. It's about cutting humans out of the tedious loop between attempts.

The design is still yours. The specs are still yours. You define what "done" means. You review the final result.

What Ralph takes is the 2 AM debugging slog. The endless test-fix-test grind. The switching between features. That's the stuff it handles.

Boris keeps coming back to the same line: verification drives everything. Give Claude a way to check its own work, and it'll run reliably for hours. Take that away and you're gambling.

Start with verification. Wrap your loops around it. The autonomous coding future isn't smarter prompts. It's better feedback systems.

Next Steps

  • Try native task management for built-in persistence and multi-session coordination
  • Learn about hooks to implement custom stop behaviors
  • Explore async workflows for running multiple loops
  • Read about thread-based engineering for scaling your autonomous workflows
  • Check feedback loops for verification patterns

People who get good at Ralph aren't just Claude Code users. They're building systems that ship code while they sleep.
