How to Fix Claude Code Running Out of Context
Claude Code running out of context is a session design problem. Fix it with /compact, lean CLAUDE.md, skills, and subagents, not a bigger window.
Stop configuring. Start building.
A SaaS builder template with AI orchestration.
Claude Code runs out of context because every file it reads, every command it runs, and every tool result it sees piles into one shared memory buffer for the session, and that buffer fills whether you notice or not. The durable fix is not a bigger window. It is designing the session so that buildup never spirals: run /compact early, keep your CLAUDE.md short, load knowledge through skills, and push file-heavy work into subagents that have their own separate memory.
Why this matters to you
If Claude Code keeps forgetting what you told it ten minutes ago, you lose time re-explaining and you get worse code. The cause is mechanical, not mysterious. Once you understand what eats the window, you can stop the bleeding with a few habits that cost almost nothing.
What actually fills the context window
The context window is the total amount of text Claude can hold in mind at once, measured in tokens (a token is roughly three-quarters of a word). People assume chat messages fill it. They do not, at least not mostly.
What fills it is everything else:
- File reads. Open a 600-line file and all 600 lines sit in the window.
- Tool outputs. A test run, a build log, a long `grep` result: all of it stays.
- Command results. Every `ls`, every diff, every stack trace adds up.
There is no selective memory. Claude Code keeps one flat buffer per session. It cannot quietly drop the file it no longer needs and keep the function you care about. The window just fills, silently, with every tool call. That is why a long session feels sharp at the start and foggy by the end.
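To get a feel for how quickly reads and logs add up, here is a minimal sketch that applies the three-quarters-of-a-word rule of thumb from above. It is a heuristic, not a real tokenizer, and the function names and sample sizes are made up for illustration.

```python
def estimate_tokens(text: str, words_per_token: float = 0.75) -> int:
    """Rough token estimate from a word count (heuristic, not a tokenizer)."""
    words = len(text.split())
    return round(words / words_per_token)


def session_footprint(tool_outputs: list[str]) -> int:
    """Sum the estimates for everything that landed in the session buffer."""
    return sum(estimate_tokens(output) for output in tool_outputs)


# Hypothetical session: one file read plus one test log.
file_read = "word " * 3000   # a file with about 3,000 words
test_log = "line " * 1500    # a test run that printed about 1,500 words
print(session_footprint([file_read, test_log]))  # prints 6000
```

Two tool calls, roughly 6,000 tokens, and neither of them was a chat message.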
The auto-compaction mechanic, stated honestly
Claude Code has a built-in cleanup step called auto-compaction. When the session gets near full, it summarizes older content to make room. Reported behavior from the community and Anthropic docs: it triggers at about 83.5% of the window used, and it reserves a fixed 33,000-token buffer for the summary work.
Two practical notes:
- Do not wait for auto-compaction. Run the `/compact` command yourself at around 60% usage, not 95%. Early compaction keeps a cleaner summary because there is less junk to compress.
- There is a reported edge case (GitHub issue #25620) where a window that is already completely full can block `/compact` from running at all. If you let it max out, you may have to start fresh. Another reason to compact early.
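The thresholds above can be written down as a small decision helper. This is a sketch of the rule of thumb only, not Claude Code's internal logic: the 83.5% figure is the reported trigger, the 60% figure is this article's recommendation, and the function name and messages are hypothetical.

```python
AUTO_COMPACT_RATIO = 0.835    # reported auto-compaction trigger
MANUAL_COMPACT_RATIO = 0.60   # the earlier point recommended in this article


def compaction_advice(used_tokens: int, window_tokens: int) -> str:
    """Map a usage figure to the habit described above."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    ratio = used_tokens / window_tokens
    if ratio >= 1.0:
        return "full: /compact may be blocked, consider a fresh session"
    if ratio >= AUTO_COMPACT_RATIO:
        return "auto-compaction range: you waited too long"
    if ratio >= MANUAL_COMPACT_RATIO:
        return "run /compact now"
    return "ok"


print(compaction_advice(650_000, 1_000_000))  # prints "run /compact now"
```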
Does the 1 million token window fix it?
Claude Code reached general availability of a 1 million token context window on March 13, 2026, with flat per-token pricing and no beta headers required. That is a lot of headroom. It still does not fix the root cause.
Two reasons. First, attention dilution: when you load files Claude does not need, the quality of its answers drops across the whole window, even the parts that matter. More irrelevant text means more noise. Second, modern multi-agent builds can burn through a million tokens fast if you let them run without limits. A bigger bucket fills slower, but an unbounded process still overflows it.
In short: window size buys time. Architecture buys reliability.
CLAUDE.md is prime real estate, so keep it lean
CLAUDE.md is the instructions file Claude Code reads at the start of every session. It is loaded into the window every single time, so every line is a budget item.
Keep it under roughly 200 lines. Past a density threshold, rules start getting ignored. The Chroma 2025 context-rot benchmark (reported) found model accuracy falling from about 95% to about 60% as the amount of loaded context grew past a point. A bloated CLAUDE.md does not just waste tokens. It makes Claude follow your rules worse. Cut it to the rules that actually change behavior.
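A line budget is easy to check mechanically. The sketch below counts lines in a block of text and compares the count with the roughly-200-line guideline. The helper name is hypothetical, and the limit is this article's rule of thumb, not a hard limit enforced by Claude Code.

```python
def check_line_budget(text: str, max_lines: int = 200) -> tuple[int, bool]:
    """Return (line count, whether the count is within the budget)."""
    count = len(text.splitlines())
    return count, count <= max_lines


# Stand-in content. For a real file you could pass
# pathlib.Path("CLAUDE.md").read_text() instead.
bloated = "Always do the thing.\n" * 250
count, within_budget = check_line_budget(bloated)
print(count, within_budget)  # prints 250 False
```

Run something like this in a pre-commit hook or by hand whenever the file grows.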
Skills load knowledge only when needed
A skill is a packaged set of instructions for one domain, for example "how we write database migrations." Skills use progressive disclosure: at startup Claude scans only a short summary of each skill (about 100 tokens), and it loads the full body only when the task matches. This is the clean way to give Claude deep domain knowledge without parking all of it in the window from the first message. Knowledge sits on the shelf until it is the right moment.
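Progressive disclosure is easier to see as a toy model. The sketch below is a simulation, not Claude Code's implementation: the `Skill` and `Session` classes are hypothetical, and the keyword match stands in for the model deciding that a task fits a skill.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Skill:
    name: str
    summary: str                   # short description, always in context
    load_body: Callable[[], str]   # full instructions, fetched only on demand


@dataclass
class Session:
    skills: list[Skill]
    context: list[str] = field(default_factory=list)

    def start(self) -> None:
        # At startup only the summaries enter the window.
        for skill in self.skills:
            self.context.append(f"{skill.name}: {skill.summary}")

    def handle(self, task: str) -> None:
        # Naive stand-in for "the task matches this skill".
        for skill in self.skills:
            if skill.name in task.lower():
                self.context.append(skill.load_body())


session = Session(skills=[
    Skill("migrations", "How we write database migrations.",
          lambda: "FULL MIGRATION GUIDE ..."),
    Skill("billing", "How we handle payments.",
          lambda: "FULL BILLING GUIDE ..."),
])
session.start()
session.handle("Write migrations for the users table")
print(len(session.context))  # prints 3
```

The billing guide never loads, because nothing in the task called for it.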
Subagents have their own separate memory
A subagent is a second instance of Claude that runs in its own isolated context window and reports back only a short summary to the main session. This is the correct fix for "infinite exploration" jobs, like reading a few hundred files to find where something is defined. The subagent does the messy reading in its own window, and your main session receives a clean answer instead of a thousand lines of raw files. Claude Code subagents are how you keep big searches from drowning your main context.
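The isolation idea can be modeled in a few lines. This is a conceptual sketch, not how Claude Code spawns subagents: a plain function plays the subagent, its local list plays the isolated window, and the file names and contents are invented.

```python
def explore_in_subagent(files: dict[str, str], symbol: str) -> str:
    """Stand-in for a subagent: reads everything locally, returns one line."""
    private_context: list[str] = []   # thrown away when the function returns
    hits: list[str] = []
    for path, content in files.items():
        private_context.append(content)   # raw file text stays in here
        if f"def {symbol}(" in content:
            hits.append(path)
    found = ", ".join(hits) if hits else "no file found"
    return f"{symbol} is defined in: {found}"


# Hypothetical repo of 300 small files.
repo = {f"pkg/module_{i}.py": f"def helper_{i}():\n    pass\n" for i in range(300)}

main_context: list[str] = []
main_context.append(explore_in_subagent(repo, "helper_42"))
print(main_context)  # prints ['helper_42 is defined in: pkg/module_42.py']
```

Three hundred files were read, and the main context grew by one line.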
Dynamic Workflows: the design-level ceiling
Dynamic Workflows (released May 28, 2026, announced June 2, 2026) lets a lead agent fan work across up to 1,000 subagents, with about 16 running at once and the rest queued, using building blocks named agent(), parallel(), and pipeline(). Each subagent gets its own clean window. This inverts the problem. Instead of nursing one giant context and hoping it lasts, you design a pipeline where no single agent ever accumulates too much. Context stops being a resource you ration and becomes a decision you make up front.
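The fan-out-with-a-cap pattern is a general concurrency idea, and it can be sketched with Python's standard library. To be clear, this is not the Dynamic Workflows API: it does not use agent(), parallel(), or pipeline(), and every name below is hypothetical. It only illustrates many queued tasks, a fixed number running at once, and short summaries coming back.

```python
import asyncio

MAX_CONCURRENT = 16   # "about 16 running at once", per the figures above
TOTAL_TASKS = 1000    # upper bound quoted above


async def run_subtask(task_id: int, limiter: asyncio.Semaphore, stats: dict) -> str:
    async with limiter:                  # waits here while the cap is reached
        stats["running"] += 1
        stats["peak"] = max(stats["peak"], stats["running"])
        await asyncio.sleep(0)           # placeholder for the real work
        stats["running"] -= 1
        return f"task {task_id}: done"   # only a short summary is returned


async def fan_out(total: int = TOTAL_TASKS, limit: int = MAX_CONCURRENT):
    limiter = asyncio.Semaphore(limit)
    stats = {"running": 0, "peak": 0}
    summaries = await asyncio.gather(
        *(run_subtask(i, limiter, stats) for i in range(total))
    )
    return list(summaries), stats["peak"]


summaries, peak = asyncio.run(fan_out())
print(len(summaries), "summaries, peak concurrency", peak)
```

The lead process only ever holds the list of summaries. Each task's working state lives and dies inside its own coroutine.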
Context fix methods: when to reach for each
| Method | What it fixes | When to use it | Effort | Approx. token savings | Limitation to know |
|---|---|---|---|---|---|
| /compact | Bloated mid-session buffer | At ~60% usage, not 95% | Low | High | Can be blocked if window is 100% full |
| .claudeignore | Reads of files you never need | Repos with large build or vendor folders | Low | Medium | Only stops reads, not other output |
| CLAUDE.md trimming | Per-session fixed overhead | When rules get ignored | Low | Medium | Cutting too much loses useful guidance |
| Skills (progressive disclosure) | Domain knowledge bloat | Recurring specialized tasks | Medium | High | Needs upfront authoring |
| Subagents | File-heavy exploration | Reading hundreds of files | Medium | Very high | Summary may omit a detail you wanted |
| Dynamic Workflows | Whole-build context limits | Large multi-step builds | High | Very high | More moving parts to design and debug |
A simple routine that works
- Trim CLAUDE.md to the rules that change behavior. Stay under 200 lines.
- Add a `.claudeignore` for build output, lockfiles, and vendor folders.
- Move recurring know-how into skills so it loads only on match.
- Send "go read everything" tasks to subagents.
- Run `/compact` at around 60%, before the window is tight.
- For large builds, design a Dynamic Workflow instead of one long session.
If you want this wired up for you, the Build This Now Code Kit ($29 one-time) ships a ready-made Claude Code harness: a lean CLAUDE.md, scoped skills, subagents, and a production SaaS skeleton with auth, Stripe payments, and PostgreSQL row-level security on every table. It is built around these context habits so you start clean.
FAQ
Why does Claude Code keep forgetting things mid-task?
Claude Code holds everything (file reads, command outputs, tool results) in one flat context window. When that window fills, earlier content is either compacted into a summary or lost. It is not selective memory. It is a single buffer that fills with every tool call.
How do I stop Claude Code from running out of context?
Run the /compact command at around 60% usage instead of waiting for 95%, keep CLAUDE.md under 200 lines, use skills to load domain knowledge only when needed, and delegate file-heavy subtasks to subagents so their reads never touch your main window.
Does the 1 million token context window fix Claude Code context problems?
The 1 million token window buys more headroom but does not fix the root cause. Loading irrelevant files dilutes attention quality across the whole window, and large multi-agent builds can still exhaust 1 million tokens if sessions run unbounded. Session architecture matters more than window size.
What is Claude Code Dynamic Workflows and does it help with context limits?
Dynamic Workflows (released May 28, 2026) lets a lead agent fan work across up to 1,000 isolated subagents, each with its own clean context window. It inverts the problem: instead of managing one giant context, you design a pipeline where no single agent accumulates too much.