speedy_devv · koen_salo

Context Management in Claude Code

How to stretch Claude Code across big projects. Covers the 80/20 rule, /compact, CLAUDE.md, chunking, clean recovery, and four open-source tools that cut context 10×.

Problem: On long conversations, Claude Code starts forgetting your project. You re-explain the same decisions, or Claude ships code that clashes with what it wrote twenty minutes earlier.

Quick Win: Keep an eye on the token percentage in the status bar. Past 80%, drop the session and spin up a fresh one with claude before any complex multi-file work:

# Exit current session (Ctrl+C or type 'exit')
# Then start fresh
claude

That one habit alone stops the slow degradation that tanks productivity on large projects. The rest of this guide adds five more moves that stack on top of it.

In a hurry? Jump straight to the open-source stack below. Four tools that automate most of this and cut context roughly 10×.

The Mental Model: Active Working Memory

The context window is not a passive archive. Treat it as active working memory. Once it fills up, Claude has a harder time holding your whole project in view, the same way you couldn't keep a 50-page doc in your head while typing out code.

Context starvation has a shape. Four failure modes show up:

  • Inconsistent code that contradicts work done earlier in the same chat
  • Repeated questions about project layout you already walked through
  • Lost architectural decisions and naming choices
  • Breaking changes that trample patterns you set up minutes ago

Spotting the moment these start is more useful than pretending you can dodge them.

The 80/20 Rule

Leave the last 20% of the window alone on any serious multi-file task. Refactors, new features, and deep debugging all need working memory to track how components relate. Burn through that reserve and the model slips.

High Context Tasks (stop at 80% capacity):

  • Large-scale refactoring across multiple files
  • Feature implementation spanning several components
  • Complex debugging requiring architectural understanding
  • Code reviews with cross-file dependencies

Low Context Tasks (safe to continue past 80%):

  • Single-file edits with clear scope
  • Independent utility function creation
  • Documentation updates
  • Simple, localized bug fixes

The rule experienced users settle on: put the hard work in the first 80% of a session, and save the last stretch for small jobs that don't need the whole architecture in Claude's head.

/compact and When to Use It

The /compact slash command squeezes your chat history into a summary. Compaction runs fast now because Claude keeps a rolling session memory in the background, so /compact just loads that summary into a clean context instead of re-summarizing from zero.

Inside that session memory:

  • Session title and current status
  • Completed work and key results
  • Discussion points and open questions
  • A work log updated with every message

You can find session memory at ~/.claude/projects/[project]/[session]/session_memory. Because the summary is already sitting there, compaction no longer burns two minutes.
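If you want to see which sessions have a summary on disk, you can search for the session memory files directly. A quick sketch; the exact directory layout may differ by version:

```shell
# List any session memory files on this machine. Errors are
# suppressed in case ~/.claude/projects doesn't exist yet.
find ~/.claude/projects -name session_memory -print 2>/dev/null || true
```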

When to compact manually:

  • After completing a major feature (before starting the next)
  • Before switching from research mode to implementation mode
  • When Claude starts repeating questions or contradicting earlier decisions
  • At natural breakpoints between unrelated tasks

When NOT to compact:

  • Mid-debugging, when specific error messages and stack traces matter
  • During complex refactoring where file-level details are critical
  • Right before integration work that depends on context from component building

The goal is simple: give Claude the minimum context the task actually needs. Performance, token spend, and cost all improve at the same time. For a full walkthrough of /compact, /clear, and the rest of the slash commands, check the interactive mode guide.
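/compact also accepts optional instructions that steer what the summary keeps. If you know what the next phase needs, say so; the focus text here is just an illustration:

```
/compact Focus on the auth refactor: decisions made, files touched, open questions
```

Anything you don't call out is fair game for the summarizer to drop, so name the threads you can't afford to lose.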

Want the mechanics of the compaction buffer, how the 33K token reservation works, and how to shift the trigger threshold? The context buffer management guide goes into it.

CLAUDE.md as Persistent Memory

A CLAUDE.md file survives across sessions on its own. Put anything that should never disappear into it:

# CLAUDE.md
 
## Project Architecture
 
- Frontend: Next.js 15 with App Router
- Database: PostgreSQL with Prisma ORM
- Auth: NextAuth with Google provider
 
## Current Sprint Focus
 
- Building user dashboard
- Integrating payment processing
 
## Patterns and Conventions
 
- All API routes use tRPC
- Components follow compound pattern
- Error handling via custom ErrorBoundary

Claude loads this file on every session start, so you get free context that lives through restarts. One warning: keep it lean. Every token CLAUDE.md eats is a token you can't spend on the actual chat. Only document what Claude needs every single session, not the entire project.
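One quick way to check whether CLAUDE.md is staying lean: estimate its token footprint with the rough four-characters-per-token heuristic. This is an approximation, not Claude's real tokenizer:

```shell
# Rough token estimate for a file: ~4 characters per token.
# Heuristic only; actual tokenization will differ somewhat.
estimate_tokens() {
  echo $(( $(wc -c < "$1") / 4 ))
}

# Usage: estimate_tokens CLAUDE.md
```

If the number is creeping into the thousands, it's time to prune to what every session actually needs.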

When a new pattern gets settled mid-session, lock it in on purpose:

claude "Document this pattern in CLAUDE.md so we maintain consistency:
[describe the pattern/decision]"

That turns a one-off session decision into project-level knowledge. The memory optimization guide has advanced ways to keep your persistent context tight.

Strategic Task Chunking

Don't run Claude into the ground. Slice work into context-sized chunks with clean breakpoints in between.

Complete components before integration:

# First session: build the component completely
claude "Build the UserProfile component with all props and styling"
 
# New session: integrate with the larger system
claude "Integrate UserProfile with the dashboard layout"

Finish research phases before implementation:

# Research session
claude "Research authentication patterns for our Next.js app - document options"
 
# Implementation session (fresh context)
claude "Implement OAuth using the NextAuth pattern"

Between sessions, write handoff notes that carry everything the next session needs:

claude "Create comprehensive handoff notes including:
1. Current project state and architecture
2. Coding patterns and conventions we've established
3. Key decisions made and reasoning
4. Specific next steps with implementation details
5. Files that will need attention and why"

Stash those notes in the project. Start fresh, point Claude at them, and you're back in sync. Handoff notes are especially useful when you can see the context ceiling coming and want to keep more detail than auto-compaction's lossy summary would.
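A lightweight way to make the handoff concrete is to keep a template in the repo and have Claude fill it in at the end of a session. The file name and headings here are suggestions, not a Claude Code convention:

```shell
# Scaffold a handoff template the next session can be pointed at.
mkdir -p docs
cat > docs/HANDOFF.md <<'EOF'
# Session Handoff

## Current state and architecture
## Patterns and conventions established
## Key decisions and reasoning
## Next steps with implementation details
## Files needing attention and why
EOF
```

Then a fresh session can start with something like `claude "Read docs/HANDOFF.md and continue from the next steps"`.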

Context Recovery Techniques

Sometimes context is already gone. Compaction hit, the session crashed, or drift got bad. Run this recovery sequence:

  1. Quick recovery: Reference your most recent checkpoint notes or session file
  2. Pattern review: Ask Claude to scan recent files and identify established patterns
  3. Architecture refresh: Provide a brief project overview focusing on current components
  4. Continuation strategy: Start with small, isolated tasks while context rebuilds
  5. Fresh start: Use /clear when context becomes too corrupted to salvage. Sometimes starting clean with good handoff notes is faster than fighting degraded context

For anything past an hour of work, it's cheaper to prevent drift than to recover from it. Every half hour or so, run a quick refresh:

# Quick context refresh
claude "Update your understanding of our project: review recent changes,
confirm current patterns, and note any shifts in approach"

That keeps Claude from wandering off the actual requirements as the session stretches.

Monitoring Context Usage

Spotting the cliff before you hit it is half the battle. Three approaches, from easy to advanced:

1. Watch the status bar. Token percentage lives at the bottom of the terminal. It's your main signal. Get into the habit of checking it before every new task.

2. Run /context for a breakdown. The /context command shows where your tokens are actually going: system prompt, tools, memory files, skills, and conversation history. If memory files alone are chewing 15% of the window before you've typed anything, that's a problem worth fixing.

3. Use StatusLine for automated monitoring. StatusLine pulls live context metrics and can fire warnings at thresholds you set. Implementation details (including how to compute real free space around the compaction buffer) live in the context buffer management guide.

One practical note: hitting compaction before a task finishes is rarely a bigger-window problem. It's a task-size problem. Split the work smaller.

The Open-Source Stack That Does the Work For You

Everything above is a habit you run yourself. Four open-source tools automate the same moves and stack on top of each other. Install them once and every session gets lighter without any change to how you work.

lean-ctx: compress every call

lean-ctx sits between Claude Code and the model and compresses everything on the wire. A shell hook shrinks CLI output (git, docker, test runners) and an MCP server caches file reads so the second read costs about thirteen tokens instead of thirteen thousand.

claude mcp add lean-ctx lean-ctx

Typical savings: a 30K-token file read drops to 195 once cached. git status drops from 8K to 2.4K. Test runner output drops 90%. Zero config after install, zero telemetry, single Rust binary.

caveman: shrink every reply

caveman is a Claude Code skill that rewrites replies in telegraphic "caveman speak": drop articles, drop filler, keep every technical term, code block, and file path verbatim. Same intent, 70% fewer words. Ships with a companion command that rewrites your CLAUDE.md the same way so the input side shrinks too.

claude plugin install caveman@caveman

Benchmark average across ten real coding tasks: 65% fewer output tokens. Paired with a compressed CLAUDE.md, the input side drops roughly 45%.

symdex: read one line, not one file

symdex pre-indexes your repo into a local SQLite database: symbols, byte offsets, call graph, HTTP routes, semantic embeddings. Claude queries the index and jumps to the exact span it needs instead of reading an entire file to find one function.

pip install symdex
symdex index ./project --repo myproject
claude mcp add symdex -- uvx symdex serve

A typical symbol lookup drops from 7,500 tokens to 200. Works with Python, TypeScript, Go, Rust, Java, Kotlin, Swift, Ruby, and ten more languages.

Agent Skills for Context Engineering: teach Claude the patterns

A library of thirteen skills that Claude Code loads on demand: context compression, filesystem offloading, multi-agent orchestration, memory systems, lost-in-the-middle mitigation. Only the skill the current task needs loads into the window. The rest stay dormant.

/plugin marketplace add muratcankoylan/Agent-Skills-for-Context-Engineering
/plugin install context-engineering@context-engineering-marketplace

Less about immediate token savings and more about teaching the agent how to manage its own context over long sessions. This is the one that stops you from hitting the cliff in the first place.

Stack Them in This Order

Each tool attacks a different layer, so they compound. Install symdex first, so Claude stops reading whole files to find one function. Add lean-ctx, so every remaining read and CLI call gets compressed on the way in. Add caveman, so the replies coming back shrink. Finally add the Context Engineering skills, so long sessions offload state to files instead of filling the window.

Combined effect: roughly 10× less context per session at zero change to your workflow.

Constraints as Training

Token limits push you toward habits that make you a better Claude Code user anyway:

  • Explicit file selection: Include only relevant files, not entire codebases
  • Clear task definition: Break objectives into concrete, actionable steps
  • Priority-based organization: Structure prompts with critical details first
  • Compact examples: Provide minimal but representative code samples

Those habits pay off even when windows get bigger. Developers who work with constraints stay sharper at any size. On Max, Team, or Enterprise the 1M context window gives you 5x the usable room at no price premium, but the discipline from 200K is what keeps you effective.

Next Actions

Immediate: Check your token percentage in the status bar. Past 80%? Exit and restart before the next complex task.

This week: Practice chunking one larger feature into component-then-integration phases.

Ongoing: Grow your CLAUDE.md with project context that lives across sessions. For the full framework on how information should flow into Claude, see context engineering.

Get context management right and you handle projects 5x the size of developers who never learn the limits. This is why senior devs run "context resets" between planning and execution. Pick this up alongside four other compounding Claude Code habits.


Stop configuring. Start building.

SaaS builder templates with AI orchestration.

Get Build This Now


