The Ralph Wiggum Technique
Stop hooks, completion promises, and verification-first workflows that let Claude Code ship features while you sleep.
Update (Jan 2025): Anthropic shipped native task management with dependencies, blockers, and multi-session coordination through CLAUDE_CODE_TASK_LIST_ID. Many Ralph workarounds are built into the product now. The core principles below still hold. The new system just handles the plumbing natively.
Hand an agent a task list. It grabs one, writes the code, runs the tests, commits. Then it grabs the next. And the next. The whole thing runs while you're asleep.
That's Ralph Wiggum. No relation to the Simpsons kid. It's the autonomous coding loop that's quietly reshaping how engineers ship software.
What Makes Ralph Different
Most people drive Claude Code like a chat app. Prompt. Wait. Read. Prompt again. Fine for quick jobs. For shipping actual features though, you become the slow part.
Ralph flips that around. Instead of steering every turn, you build a loop that keeps Claude going until the work is done. The trick sits inside Claude Code's stop hooks: they fire the moment the agent tries to wrap up, which means you can catch that attempt and shove the agent back to work.
Here's the core pattern:
- Claude works on a task
- Claude tries to stop (outputs completion)
- A stop hook intercepts and checks: is the work actually done?
- If not, feed the prompt back and continue
- If yes, let it complete
Step 4 is everything. Your agent doesn't quit the first time it thinks it's finished. It quits once the work has been verified.
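The loop above can be sketched as plain control flow. A minimal Python sketch, where `run_agent` and `verify` are stand-ins for real Claude Code invocations (real Ralph setups drive this through stop hooks rather than an outer loop, but the logic is the same):

```python
def ralph_loop(run_agent, verify, max_iterations=25):
    """Drive an agent until verification passes or we hit the cap.

    run_agent: callable that performs one work attempt (a stand-in here).
    verify: callable returning True only when the work is actually done.
    """
    for i in range(1, max_iterations + 1):
        run_agent()                      # one work attempt
        if verify():                     # step 3: is the work actually done?
            return f"done after {i} iteration(s)"
        # step 4: not done -- feed the prompt back and continue
    return "hit max_iterations without verified completion"
```

Note the asymmetry: the agent never decides it is finished; the verifier does.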
The Completion Promise
Ralph leans on a "completion promise": a specific word or phrase that means actually done. When Claude is convinced the task is wrapped, it emits that promise (usually just the word "complete").
// In your Ralph loop configuration
completion_promise: "complete"
max_iterations: 25

Every time Claude tries to stop, the hook scans for that promise. Missing? Loop keeps going. Present? Loop ends. Premature exits get blocked, and real exits get through cleanly.
Critical rule: No promise, no stop. That forces the agent to keep going until it genuinely thinks the work is done.
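The promise check itself is a few lines. A Python sketch of what a stop hook might run against the agent's final message (the whole-word matching is an assumption, but a bare substring check is too loose, since "incomplete" contains "complete"):

```python
import re

COMPLETION_PROMISE = "complete"

def should_allow_stop(last_message: str, promise: str = COMPLETION_PROMISE) -> bool:
    """No promise, no stop: allow the agent to finish only if its final
    message contains the completion promise as a whole word."""
    return re.search(rf"\b{re.escape(promise)}\b", last_message.lower()) is not None
```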
Verification: The Non-Negotiable Core
Boris Cherny, who created Claude Code, has one rule he refuses to break: always give Claude a way to verify its work.
That rule is why Ralph works at all. Skip verification and you end up with a loop that either runs forever or stops far too soon. Add it and the loop actually knows when it's finished.
Three verification approaches pair well with Ralph:
1. Test-Driven Verification
Write the tests first. Claude runs them, watches them fail, writes code, runs them again. The loop keeps looping until everything is green.
Workflow:
1. Run all tests in /tests/feature-x/
2. If tests fail, implement code to make them pass
3. Run tests again
4. Repeat until all tests pass
5. Output "complete" only when test suite is green

This is the most reliable path. Tests don't lie. Pass or fail. Nothing fuzzy.
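A minimal verification gate, sketched in Python. The pytest invocation and `tests/feature-x` path are assumptions; substitute your project's own suite (e.g. `npm test`):

```python
import subprocess
import sys

def tests_green(cmd=(sys.executable, "-m", "pytest", "tests/feature-x", "-q")) -> bool:
    """Run the verification command; True only on a green run.

    The default command is a placeholder -- swap in whatever your
    project uses to decide pass or fail.
    """
    result = subprocess.run(list(cmd), capture_output=True, text=True)
    return result.returncode == 0  # pytest exits 0 only when every test passes
```

The exit code is the whole contract: the loop's "done" signal reduces to a single boolean, which is exactly what makes this approach so hard to game.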
2. Background Agent Verification
Kick off a second agent whose only job is to check the main agent's work. Boris uses this for long runs:
After completing work, use a background agent to:
1. Review all changed files
2. Run the full test suite
3. Check for regressions
4. Report any issues found

You get an independent check. If the background agent spots problems, the main loop goes right back to work.
3. Stop Hook Validation
The stop hook itself can run validation. Check a progress file, run the linter, verify the build. Validation fails? Block the stop and send the agent back in.
// Stop hook pseudocode
if (agent_trying_to_stop) {
validation_result = run_tests();
if (validation_result.failed) {
return { decision: "block", reason: "Tests failing, continue work" };
}
return { decision: "allow" };
}

The Two-Phase Workflow
First mistake most people make: they plan and implement in the same context window.
Split them apart.
Phase 1: Planning Session
- Generate specifications through conversation
- Review and edit by hand
- Create an implementation plan with explicit file references
- Keep the spec as a "pin" that prevents invention
Phase 2: Implementation Session
- Fresh context (clear the previous conversation)
- Feed only the plan document
- Run the Ralph loop
- Let the agent iterate until complete
Why the split? Because context window degradation is real. After enough back-and-forth, Claude starts leaning on stale messages from earlier. A clean start with just the plan means the focus stays tight.
Your plan becomes the anchor. Every iteration of the loop looks back at it. That's what keeps the agent from drifting off into something you didn't ask for.
Practical Implementation: The PRD Approach
Ryan Carson's version looks like this:
- Start with a PRD (Product Requirements Document)
- What are we building?
- What's in scope?
- What's explicitly out of scope?
- Convert to user stories with acceptance criteria
- Each story is a small, testable unit
- Acceptance criteria define "done"
- Structure for agent consumption
- JSON or markdown format
- Clear checkboxes for progress tracking
- Links to relevant code locations
- Run the loop
- Agent picks the next uncompleted story
- Implements it
- Runs verification (tests)
- Marks it complete
- Moves to the next
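A story file structured for agent consumption might look like this (a hypothetical shape, not a prescribed schema; field names are illustrative):

```json
{
  "stories": [
    {
      "id": "US-1",
      "title": "User can reset their password",
      "acceptance_criteria": [
        "Reset email is sent after form submission",
        "Reset token expires after one hour"
      ],
      "files": ["src/auth/reset.ts"],
      "done": false
    }
  ]
}
```

The `done` flag doubles as the progress tracker: the loop picks the first story where it's false.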
Here's the payoff: you just walk away. Wake up to finished features, green tests, and commits already in the log.
UI Verification: The Hidden Trap
A gotcha that bites everyone sooner or later: the tests go green, but the UI looks wrong.
Here's why. Ralph can happily confirm the code runs while staying totally blind to visual bugs. The component renders, the tests pass, and the button is still off-screen or the text is cut in half.
Fix it with a screenshot-based verification protocol.
After implementing UI changes:
1. Take screenshots of affected components
2. Rename each with "verified_" prefix after review
3. Do NOT output completion promise yet
4. Let the next iteration confirm all files are verified
5. Only then output "complete"

That forces at least two loop passes for any UI change. Pass one implements and captures screenshots. Pass two confirms every screenshot got reviewed. The visual check can't be skipped.
The key insight: Instruct Claude that renaming the screenshots does NOT earn the completion promise. The next iteration is what signals done. That blocks the premature exits.
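The protocol's gate can be sketched as a simple check the stop hook runs before honoring the promise. The `verified_` prefix follows the protocol above; the `.png` glob and screenshot directory are assumptions:

```python
from pathlib import Path

def all_screenshots_verified(shot_dir: str) -> bool:
    """True only when every screenshot in shot_dir carries the
    'verified_' prefix -- i.e. pass two of the protocol can finish."""
    shots = list(Path(shot_dir).glob("*.png"))
    # No screenshots at all means the UI pass never ran: block the stop.
    if not shots:
        return False
    return all(p.name.startswith("verified_") for p in shots)
```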
Economics: Why This Changes Everything
A coding agent running nonstop on Sonnet costs roughly $10.42 USD per hour, measured over a 24-hour run (about $250 a day).
Less than minimum wage in most places. And you're paying for a machine that can:
- Clear backlogs overnight
- Run multiple features in parallel
- Never get tired or distracted
- Scale with more compute
So the bottleneck shifts. It stops being "how much am I willing to spend?" and becomes "how much reliable work can I define?"
Teams running reliable loops will pull way ahead of teams that aren't. The gap is already widening.
Common Failures and Fixes
Loop Never Ends
Cause: Impossible task or missing completion criteria.
Fix: Set a max iteration count (e.g., 25). Add explicit completion criteria to your prompt.
Loop Ends Too Early
Cause: Claude outputs the promise before work is done.
Fix: Strengthen your verification. Add tests. Use the screenshot protocol for UI. Make "done" objectively measurable.
Quality Degrades Over Iterations
Cause: Context window filling with failed attempts.
Fix: Implement checkpoint state. Mark completed work in an external file. Let the loop resume cleanly if context fills.
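One way to sketch that checkpoint state in Python. The `ralph_state.json` filename is hypothetical; any external file outside the context window works:

```python
import json
from pathlib import Path

STATE = Path("ralph_state.json")  # hypothetical checkpoint file

def mark_done(task_id: str) -> None:
    """Record a completed task outside the context window."""
    done = set(json.loads(STATE.read_text())) if STATE.exists() else set()
    done.add(task_id)
    STATE.write_text(json.dumps(sorted(done)))

def remaining(tasks: list[str]) -> list[str]:
    """On resume, skip anything already checkpointed."""
    done = set(json.loads(STATE.read_text())) if STATE.exists() else set()
    return [t for t in tasks if t not in done]
```

Because the file survives a context reset, a fresh session can pick up exactly where the last one stalled.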
Agent Invents Features
Cause: Spec is vague or missing.
Fix: Your spec is the "pin" that prevents invention. Make it specific. Include explicit references to existing code. Tell Claude what NOT to do.
Setting Up Your First Ralph Loop
Keep it small on the first run. Pick a feature you know well, with tests that already exist.
1. Install the Ralph plugin (or implement the stop hook pattern yourself)
2. Create your prompt file:
Study the implementation plan in /docs/plan.md
Pick the single most important incomplete task
Implement it following existing patterns
Run tests with: npm test
On pass: mark task complete in plan.md, commit changes
On fail: fix the issue and run tests again
Output "complete" only when all tasks are done and tests pass
3. Set constraints:
- Max iterations: 25
- Completion promise: "complete"
- Quality gates: tests must pass, linting must pass
4. Watch the first run. Don't walk away yet. Cancel if behavior looks wrong. Adjust your prompt. Re-run.
5. Gradually increase autonomy as trust builds.
The Ralph Philosophy
Ralph is not about cutting humans out of coding. It's about cutting humans out of the tedious loop between attempts.
The design is still yours. The specs are still yours. You define what "done" means. You review the final result.
What Ralph takes is the 2 AM debugging slog. The endless test-fix-test grind. The switching between features. That's the stuff it handles.
Boris keeps coming back to the same line: verification drives everything. Give Claude a way to check its own work, and it'll run reliably for hours. Take that away and you're gambling.
Start with verification. Wrap your loops around it. The autonomous coding future isn't smarter prompts. It's better feedback systems.
Next Steps
- Try native task management for built-in persistence and multi-session coordination
- Learn about hooks to implement custom stop behaviors
- Explore async workflows for running multiple loops
- Read about thread-based engineering for scaling your autonomous workflows
- Check feedback loops for verification patterns
People who get good at Ralph aren't just Claude Code users. They're building systems that ship code while they sleep.
Stop configuring. Start building.