Context Engineering
Context engineering is the discipline of deciding what Claude sees, when it sees it, and what gets left out.
Problem: Claude Code keeps flipping between brilliant and frustrating. One session nails the task. The next wanders off on the same request. You cannot predict which version you get.
Quick Win: Stop loading everything into the prompt up front. Hand information to Claude in stages instead:
# Bad: dump everything upfront
claude "Here's my entire codebase architecture, all conventions,
every pattern we use, plus the task..."
# Good: let Skills load what's needed, when needed
claude "Build the auth module"
# Skills load authentication patterns only when Claude needs them
Prompt engineering is about phrasing a question well. Context engineering is about making sure Claude has the right facts at the right moment.
What Is Context Engineering?
Context engineering is how you design the flow of information into the model. Get it right and Claude Code starts behaving like a coding partner that understands your intent. Get it wrong and you spend the session fighting it.
Here is the shape of the problem. A context window is a bounded workspace measured in tokens. Instructions, retrieved docs, tool output, and conversation history share that space. Hit the ceiling and older content falls off. Organize it badly and Claude loses the thread.
That makes context a scarce resource. How you structure it is what separates a build that ships the feature you had in mind from one that almost gets there.
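To make the "bounded workspace" idea concrete, here is a minimal sketch of a window that evicts its oldest entries once a token budget is exceeded. The `ContextWindow` class and the word-count `estimate_tokens` helper are illustrative assumptions, not how Claude Code counts tokens or compacts history.

```python
from collections import deque


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: one token per whitespace-separated word."""
    return len(text.split())


class ContextWindow:
    """Toy bounded workspace: oldest entries are evicted once the budget is exceeded."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.entries: deque[str] = deque()

    def used(self) -> int:
        return sum(estimate_tokens(entry) for entry in self.entries)

    def add(self, text: str) -> None:
        self.entries.append(text)
        # Drop from the front (oldest first) until the window fits the budget again.
        while self.used() > self.max_tokens and len(self.entries) > 1:
            self.entries.popleft()


window = ContextWindow(max_tokens=8)
window.add("system instructions here")
window.add("tool output with four words")
window.add("new user message")
print(list(window.entries))
```

The earliest entry is the one that disappears, which is why instructions given at the start of a long session can quietly stop applying.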
The Context Window Challenge
When context slips out of control, Claude Code tends to fail in four repeatable ways:
| Failure Mode | What Happens | Prevention |
|---|---|---|
| Context Poisoning | Errors compound as agents reuse contaminated context | Fresh sessions, /clear command |
| Context Distraction | Over-reliance on repeating prior behavior | Strategic chunking |
| Context Confusion | Irrelevant tools or docs misdirect the agent | Skills system |
| Context Clash | Contradictory information creates conflicts | CLAUDE.md as single source of truth |
Learn to spot the four. They are the pattern you are fighting.
The Six Pillars Framework
Context engineering rests on six connected ideas. Here is how each one lands inside Claude Code:
1. Agents
An AI agent is an LLM wired up to tools, memory, and reasoning so it can chase a goal. Agents decide what enters the context, what sticks around, and what gets dropped.
Claude Code moved from single-agent to multi-agent once subagents shipped. The context-engineering implication is direct:
# Single agent: one context window handles everything
claude "Research, plan, build, test, and deploy the payment system"
# Multi-agent: specialized contexts, distributed load
# Central AI delegates to focused subagents
claude "Build the payment system"
# → Research agent gathers requirements
# → Backend agent builds Stripe integration
# → Frontend agent creates checkout UI
# → Each agent has clean, focused context
Multi-agent setups prevent context confusion by giving each subagent a narrower brief. Your central AI becomes the CTO, handing specialized work to the right specialist.
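One way to picture the isolation is a toy model in which every specialist keeps its own context list. The `Subagent` class and `delegate` function below are hypothetical names for illustration only; they do not reflect how Claude Code implements subagents internally.

```python
from dataclasses import dataclass, field


@dataclass
class Subagent:
    """Hypothetical subagent with its own private context list."""
    role: str
    context: list[str] = field(default_factory=list)

    def receive(self, message: str) -> None:
        self.context.append(message)


def delegate(task: str, briefs: dict[str, str]) -> dict[str, Subagent]:
    """Give each specialist the overall task plus only its own brief."""
    agents = {}
    for role, brief in briefs.items():
        agent = Subagent(role)
        agent.receive(f"Task: {task}")
        agent.receive(f"Brief: {brief}")
        agents[role] = agent
    return agents


agents = delegate(
    "Build the payment system",
    {
        "research": "Gather requirements",
        "backend": "Build Stripe integration",
        "frontend": "Create checkout UI",
    },
)
for role, agent in agents.items():
    print(role, agent.context)
```

The backend agent never sees the checkout UI brief, so nothing from the frontend work can distract it.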
2. Query Augmentation
Real user prompts are rough around the edges. Query augmentation tightens them up before the work starts.
If your central Claude Code session is set up as a co-founder or a dev manager, augmentation falls out of that framing for free:
Your input: "fix the auth bug"
Central AI refinement:
→ Analyze recent changes to auth module
→ Identify error patterns in logs
→ Scope to affected files (src/lib/auth.ts)
→ Generate targeted fix with test coverage
Subagent receives: Clear, scoped task with context
Your rough sentence passes through the central AI first. By the time it reaches a subagent, it is a scoped task, not your raw one-liner.
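The refinement above can be sketched as a plain data structure. In a real session the central AI does this with the model; the `ScopedTask` type and the `augment` function here are assumptions that only show the shape of what a subagent might receive.

```python
from dataclasses import dataclass


@dataclass
class ScopedTask:
    goal: str
    files: list[str]
    steps: list[str]

    def render(self) -> str:
        """Format the task as the brief a subagent would read."""
        lines = [f"Goal: {self.goal}", "Files: " + ", ".join(self.files), "Steps:"]
        lines += [f"  {i}. {step}" for i, step in enumerate(self.steps, start=1)]
        return "\n".join(lines)


def augment(raw_request: str, affected_files: list[str]) -> ScopedTask:
    """Hypothetical refinement step: hard-coded here, model-driven in practice."""
    return ScopedTask(
        goal=raw_request.strip().capitalize(),
        files=affected_files,
        steps=[
            "Analyze recent changes to the affected files",
            "Identify error patterns in logs",
            "Generate a targeted fix with test coverage",
        ],
    )


task = augment("fix the auth bug", ["src/lib/auth.ts"])
print(task.render())
```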
3. Retrieval
Retrieval is how outside information gets pulled into the window on demand. The trade-off is chunk size. Small chunks are precise but lose surrounding context. Big chunks bring rich context at the cost of tokens.
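The chunk-size trade-off is easy to see with a small word-based splitter. This is a generic sketch, not a Claude Code feature; `chunk_words` and its `size` and `overlap` parameters are names chosen for illustration, and real systems usually count tokens rather than words.

```python
def chunk_words(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of `size` words, repeating `overlap` words between neighbours."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


doc = "one two three four five six seven eight nine ten"
small = chunk_words(doc, size=3, overlap=1)
large = chunk_words(doc, size=8)
print(len(small), small)
print(len(large), large)
```

Small chunks give more, narrower pieces to match against; large chunks give fewer pieces that each cost more tokens when loaded.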
Claude Code has no native retrieval today. Partial workarounds exist through MCPs and CLI tools, but it is not yet a platform strength. For now, your CLAUDE.md and Skills are the retrieval layer:
# CLAUDE.md - Your retrieval substitute
## Architecture (always loaded)
- Next.js 15, App Router, TypeScript strict
## Patterns (reference when needed)
See /docs/patterns/ for component conventions
4. Prompting Techniques
Here is the part most people miss. Dumping information into the window does not guarantee strong output. What matters is the order, the timing, and the channel.
Research keeps finding the same thing: the start and end of the context window get more attention than the middle. That is why Skills work so well:
Conversation start:
├── CLAUDE.md (beginning of context - high attention)
├── Your initial prompt
├── ... conversation history ...
├── Claude's work
└── Skill loads HERE (end of context - high attention)
    └── Fresh, relevant instructions at peak attention
Until the skill loads, Claude runs lean. Once it fires mid-session, its instructions drop into the bottom of the window, right in the high-attention zone, exactly when the expertise is needed. That is progressive disclosure, and it reclaims tokens a front-loaded CLAUDE.md would otherwise burn.
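Progressive disclosure can be modeled as an ordering rule: project instructions first, history in the middle, skill instructions appended last and only when relevant. The keyword trigger below is a deliberate simplification and an assumption; `build_context` is a hypothetical helper, not Claude Code's actual skill-matching logic.

```python
def build_context(claude_md: str, history: list[str], skills: dict[str, str]) -> list[str]:
    """Assemble a context list: CLAUDE.md first, history next, and any skill
    whose trigger word appears in the latest message appended at the end."""
    context = [claude_md, *history]
    latest = history[-1].lower() if history else ""
    for trigger, instructions in skills.items():
        if trigger in latest:
            context.append(instructions)
    return context


CLAUDE_MD = "Architecture: Next.js 15, App Router, TypeScript strict"
SKILLS = {"auth": "Auth skill: follow the project's authentication patterns"}

lean = build_context(CLAUDE_MD, ["Set up the project"], SKILLS)
loaded = build_context(CLAUDE_MD, ["Set up the project", "Build the auth module"], SKILLS)
print(len(lean), len(loaded))
print(loaded[-1])
```

Before the auth request, the skill costs nothing. After it, the skill text sits at the very end of the context.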
5. Memory
Memory is what turns a stateless model into something that remembers what you did together.
Claude Code's real memory surfaces:
| What | How It Works | Persistence |
|---|---|---|
| CLAUDE.md | Loads at session start, treated as authoritative | Permanent |
| Skills | Load on-demand when triggered | Permanent |
| Session files | .claude/tasks/session-current.md tracks progress | Across sessions |
| Conversation | Current context window | This session |
Pair session tracking with living docs and you get a memory layer tuned to this repo. Claude writes to it as decisions get made, and reads from it when you come back the next day. Over weeks, those notes add up to working knowledge of your codebase.
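A session file can be as simple as an append-only list of decisions. The sketch below assumes a bullet-per-decision format inside `.claude/tasks/session-current.md`; that format and the `record_decision` and `load_decisions` helpers are illustrative assumptions, not a Claude Code convention.

```python
import tempfile
from pathlib import Path


def record_decision(session_file: Path, decision: str) -> None:
    """Append one decision as a markdown bullet, creating folders as needed."""
    session_file.parent.mkdir(parents=True, exist_ok=True)
    with session_file.open("a", encoding="utf-8") as handle:
        handle.write(f"- {decision}\n")


def load_decisions(session_file: Path) -> list[str]:
    """Read back every recorded decision, or an empty list if no file exists yet."""
    if not session_file.exists():
        return []
    lines = session_file.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


# Demo in a temporary directory so nothing in a real repo is touched.
with tempfile.TemporaryDirectory() as repo_root:
    session = Path(repo_root) / ".claude" / "tasks" / "session-current.md"
    record_decision(session, "Use Stripe Checkout for payments")
    record_decision(session, "Keep auth logic in src/lib/auth.ts")
    restored = load_decisions(session)

print(restored)
```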
6. Tools
Tools are how reasoning reaches the real world. Claude Code shipped with the basics: Read, Write, Edit, Bash, and MCP for outside services.
Skills added something different. Claude can run an executable script without loading its implementation into the context. That is the MCP-S CLI idea: Claude follows a protocol, and the internals stay invisible.
Example: a documentation-research skill built on Context7 MCP:
# .claude/skills/documentation-research/SKILL.md
---
name: documentation-research
description: Fetch library docs using Context7 API
---
## When to Use
User needs current documentation for any library
## Workflow
1. Resolve library ID via Context7
2. Fetch relevant documentation
3. Apply to current task
## Tools Available
- mcp__context7__resolve-library-id
- mcp__context7__get-library-docs
Claude reaches the MCP tools through the skill interface. Protocol driven, context efficient, no source reading required.
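The frontmatter is the part of a skill that is cheap to inspect. As a rough sketch, assuming only simple `key: value` pairs between the two `---` lines, a few lines of Python can pull out the name and description without reading the workflow body. `parse_frontmatter` is a hypothetical helper, not part of Claude Code, and a full YAML parser would be needed for anything more complex.

```python
def parse_frontmatter(skill_md: str) -> dict[str, str]:
    """Read simple `key: value` pairs between the first two `---` lines."""
    lines = skill_md.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing delimiter found


SKILL_MD = """---
name: documentation-research
description: Fetch library docs using Context7 API
---
## When to Use
User needs current documentation for any library
"""

meta = parse_frontmatter(SKILL_MD)
print(meta)
```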
Implementing the Framework
Today: audit your CLAUDE.md. Is it laid out for retrieval? Are the patterns you care about somewhere Claude can find them?
This week: build Skills for the workflows you repeat. Each skill is a guard against context confusion, because expertise loads on demand.
Ongoing: watch for the four failure modes. The moment Claude repeats old mistakes or ignores what you said, contamination has set in. Start fresh.
The Bottom Line
Reliable output is not a bigger-model problem. It is an information-flow problem.
The six pillars stack together:
- Agents distribute context across specialists
- Query augmentation refines messy input
- Retrieval (via CLAUDE.md/Skills) surfaces relevant info
- Prompting layers information strategically
- Memory maintains state across sessions
- Tools extend capabilities efficiently
Get these right, and Claude Code becomes a coding partner you can hand any idea to and trust with the build.
Next steps:
- 1M context window guide for the latest on general availability and unified pricing
- Context buffer management for understanding the 33K reservation
- Context management for token optimization
- Memory optimization for persistence strategies
- Skills guide for on-demand expertise loading
- Sub-agent design for multi-agent architectures