Context Engineering

Problem: Claude Code keeps flipping between brilliant and frustrating. One session nails the task. The next wanders off on the same request. You cannot predict which version you get.

Quick Win: Stop loading everything into the prompt up front. Hand information to Claude in stages instead:

# Bad: dump everything upfront
claude "Here's my entire codebase architecture, all conventions,
every pattern we use, plus the task..."
 
# Good: let Skills load what's needed, when needed
claude "Build the auth module"
# Skills load authentication patterns only when Claude needs them

Prompt engineering is about phrasing a question well. Context engineering is about making sure Claude has the right facts at the right moment.

That also marks the boundary with the neighboring pages. Read this page when the design problem is information flow: what to preload, what to defer, and how to keep irrelevant context out. If you need session recovery rules, read Context Management. If you want a task-specific way to preload the right bundle at session start, read Dynamic Starting Context.

What Is Context Engineering?

Context engineering is how you design the flow of information into the model. Get it right and Claude Code starts behaving like a coding partner that understands your intent. Get it wrong and you spend the session fighting it.

Here is the shape of the problem. A context window is a bounded workspace measured in tokens. Instructions, retrieved docs, tool output, and conversation history share that space. Hit the ceiling and older content gets compacted or dropped. Organize it badly and Claude loses the thread.

That makes context a scarce resource. How you structure it is what separates a build that ships the feature you had in mind from a build that almost gets there.
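To make the bounded-workspace idea concrete, here is a minimal Python sketch of a token budget. The four-characters-per-token estimate is a rough rule of thumb, not Claude's tokenizer, and the names `estimate_tokens` and `fit_to_budget`, plus the drop-oldest policy, are illustrative assumptions rather than how Claude Code manages its window.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate: about 4 characters per token (an assumption, not a real tokenizer)."""
    return max(1, len(text) // 4)


def fit_to_budget(pinned: list[str], history: list[str], budget: int) -> list[str]:
    """Keep pinned items, then as many of the newest history items as fit."""
    used = sum(estimate_tokens(item) for item in pinned)
    kept: list[str] = []
    # Walk history from newest to oldest so the oldest turns are dropped first.
    for item in reversed(history):
        cost = estimate_tokens(item)
        if used + cost > budget:
            break
        kept.append(item)
        used += cost
    return pinned + list(reversed(kept))


window = fit_to_budget(
    pinned=["Project rules: TypeScript strict, no default exports."],
    history=["turn 1 " * 50, "turn 2 " * 50, "turn 3 " * 50],
    budget=200,
)
print(len(window))
```

The pinned rules survive and the oldest turn is the one that gets cut. Every item you pin is budget the conversation cannot use.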

The Context Window Challenge

When context slips out of control, Claude Code tends to fail in four repeatable ways:

Failure Mode	What Happens	Prevention
Context Poisoning	Errors compound as agents reuse contaminated context	Fresh sessions, `/clear` command
Context Distraction	Accumulated history pulls the agent toward repeating prior behavior	Strategic chunking
Context Confusion	Irrelevant tools or docs misdirect the agent	Skills system
Context Clash	Contradictory information creates conflicts	CLAUDE.md as single source of truth

Learn to spot the four. They are the pattern you are fighting.
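Of the four, context clash is the easiest to check mechanically. The sketch below assumes you can reduce each document to simple key/value settings. The `find_clashes` helper and the sample settings are hypothetical, not a Claude Code feature.

```python
from collections import defaultdict


def find_clashes(sources: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return settings that have different values in different sources."""
    seen: dict[str, dict[str, str]] = defaultdict(dict)
    for source_name, settings in sources.items():
        for key, value in settings.items():
            seen[key][source_name] = value
    # A clash is any key whose sources disagree on the value.
    return {key: by_source for key, by_source in seen.items()
            if len(set(by_source.values())) > 1}


clashes = find_clashes({
    "CLAUDE.md": {"package_manager": "pnpm", "test_runner": "vitest"},
    "README.md": {"package_manager": "npm", "test_runner": "vitest"},
})
print(clashes)
```

Any key that shows up in the result is a candidate for consolidating into CLAUDE.md as the single source of truth.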

The Six Pillars Framework

Context engineering rests on six connected ideas. Here is how each one lands inside Claude Code:

1. Agents

An AI agent is an LLM wired up to tools, memory, and reasoning so it can chase a goal. Agents decide what enters the context, what sticks around, and what gets dropped.

Claude Code moved from single-agent to multi-agent once subagents shipped. The context-engineering implication is direct:

# Single agent: one context window handles everything
claude "Research, plan, build, test, and deploy the payment system"
 
# Multi-agent: specialized contexts, distributed load
# Central AI delegates to focused subagents
claude "Build the payment system"
# → Research agent gathers requirements
# → Backend agent builds Stripe integration
# → Frontend agent creates checkout UI
# → Each agent has clean, focused context

Multi-agent setups prevent context confusion by giving each subagent a narrower brief. Your central AI becomes the CTO, handing specialized work to the right specialist.
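One way to picture the narrower brief is as a data structure. This is a sketch only: `SubagentBrief`, `delegate`, and the file paths are hypothetical names for illustration, and Claude Code's actual subagent handoff is not modeled here.

```python
from dataclasses import dataclass, field


@dataclass
class SubagentBrief:
    """A narrow brief: one role, one task, only the files that role needs."""
    role: str
    task: str
    files: list[str] = field(default_factory=list)


def delegate(goal: str, plan: dict[str, tuple[str, list[str]]]) -> list[SubagentBrief]:
    """Split one broad goal into focused briefs, one per specialist."""
    return [
        SubagentBrief(role=role, task=f"{task} (part of: {goal})", files=files)
        for role, (task, files) in plan.items()
    ]


briefs = delegate(
    "Build the payment system",
    {
        "research": ("Gather requirements", ["docs/payments.md"]),
        "backend": ("Build Stripe integration", ["src/api/payments.ts"]),
        "frontend": ("Create checkout UI", ["src/components/Checkout.tsx"]),
    },
)
for brief in briefs:
    print(brief.role, brief.files)
```

Each brief carries only its own files, so the frontend specialist never sees backend material it does not need.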

2. Query Augmentation

Real user prompts are rough around the edges. Query augmentation tightens them up before the work starts.

If your central Claude Code session is set up as a co-founder or a dev manager, augmentation falls out of that framing for free:

Your input: "fix the auth bug"

Central AI refinement:
→ Analyze recent changes to auth module
→ Identify error patterns in logs
→ Scope to affected files (src/lib/auth.ts)
→ Generate targeted fix with test coverage

Subagent receives: Clear, scoped task with context

Your rough sentence passes through the central AI first. By the time it reaches a subagent, it is a scoped task, not your raw one-liner.
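The refinement step can be sketched as a small function that turns a one-liner into a scoped task. In practice the central AI does this with judgment; here the file list and steps are passed in by hand, and `augment_query` is a hypothetical helper.

```python
def augment_query(raw: str, files: list[str], checks: list[str]) -> str:
    """Expand a rough request into a scoped task description."""
    lines = [f"Task: {raw.strip()}", "Scope:"]
    lines += [f"- {path}" for path in files]
    lines.append("Steps:")
    lines += [f"{i}. {check}" for i, check in enumerate(checks, start=1)]
    return "\n".join(lines)


scoped = augment_query(
    "fix the auth bug",
    files=["src/lib/auth.ts"],
    checks=[
        "Analyze recent changes to the auth module",
        "Identify error patterns in logs",
        "Generate a targeted fix with test coverage",
    ],
)
print(scoped)
```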

3. Retrieval

Retrieval is how outside information gets pulled into the window on demand. The trade-off is chunk size. Small chunks are precise but lose surrounding context. Big chunks bring rich context at the cost of tokens.
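The chunk-size trade-off is easy to see with a toy splitter. This sketch splits on whitespace and counts words rather than tokens, which is a simplification. `chunk_words` is an illustrative name, not part of any retrieval library.

```python
def chunk_words(text: str, size: int, overlap: int = 0) -> list[str]:
    """Split text into chunks of `size` words, repeating `overlap` words between chunks."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


doc = " ".join(f"w{i}" for i in range(100))
# Many small chunks: precise matches, little surrounding context.
small = chunk_words(doc, size=10)
# Fewer large chunks: more context per hit, more tokens per hit.
large = chunk_words(doc, size=50, overlap=10)
print(len(small), len(large))
```

The overlap repeats a few words across chunk boundaries so a sentence that straddles the cut is not lost, at the cost of storing those words twice.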

Claude Code has no native retrieval today. Partial workarounds exist through MCPs and CLI tools, but it is not yet a platform strength. For now, your CLAUDE.md and Skills are the retrieval layer:

# CLAUDE.md - Your retrieval substitute
 
## Architecture (always loaded)
 
- Next.js 15, App Router, TypeScript strict
 
## Patterns (reference when needed)
 
See /docs/patterns/ for component conventions

4. Prompting Techniques

Here is the part most people miss. Dumping information into the window does not guarantee strong output. What matters is the order, the timing, and the channel.

Research on long-context models keeps finding the same thing, often called the "lost in the middle" effect: the start and end of the context window get more attention than the middle. That is why Skills work so well:

Conversation start:
├── CLAUDE.md (beginning of context - high attention)
├── Your initial prompt
├── ... conversation history ...
├── Claude's work
└── Skill loads HERE (end of context - high attention)
    └── Fresh, relevant instructions at peak attention

Until the skill loads, Claude runs lean. Once it fires mid-session, its instructions drop into the bottom of the window, right in the high-attention zone, exactly when the expertise is needed. That is progressive disclosure, and it reclaims tokens a front-loaded CLAUDE.md would otherwise burn.
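The ordering described above can be sketched in a few lines. Two loud assumptions: the keyword trigger is a stand-in for however Claude actually decides to load a skill, and the skill text itself is invented for the example.

```python
from typing import Optional

# Hypothetical skill registry: trigger word mapped to skill instructions.
SKILLS = {"auth": "Auth skill: hash passwords with argon2, rotate refresh tokens."}


def maybe_load_skill(prompt: str) -> Optional[str]:
    """Return a skill body only when the prompt mentions its trigger word."""
    for trigger, body in SKILLS.items():
        if trigger in prompt.lower():
            return body
    return None


def assemble_context(claude_md: str, history: list[str],
                     skill: Optional[str] = None) -> list[str]:
    """Order context so standing rules come first and a triggered skill comes last."""
    context = [claude_md, *history]
    if skill is not None:
        # Appended at the end of the window, the high-attention zone described above.
        context.append(skill)
    return context


prompt = "Build the auth module"
context = assemble_context("CLAUDE.md: Next.js 15, TypeScript strict", [prompt],
                           maybe_load_skill(prompt))
print(context[-1])
```

A prompt that never mentions auth leaves the skill out entirely, which is the token saving progressive disclosure is after.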

5. Memory

Memory is what turns a stateless model into something that remembers what you did together.

Claude Code's real memory surfaces:

Surface	How It Works	Persistence
CLAUDE.md	Loads at session start, treated as authoritative	Permanent
Skills	Load on-demand when triggered	Permanent
Session files	`.claude/tasks/session-current.md` tracks progress	Across sessions
Conversation	Current context window	This session

Pair session tracking with living docs and you get a memory layer tuned to this repo. Claude writes to it as decisions get made, and reads from it when you come back the next day. Over weeks, the notes accumulate, and each new session starts closer to knowing your codebase.
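The write-then-read loop for a session file can be sketched as follows. The path matches the table above. The bullet-per-decision format and the helper names are assumptions, and the demo runs in a temporary directory so it does not touch a real repo.

```python
from pathlib import Path
import tempfile


def record_decision(session_file: Path, decision: str) -> None:
    """Append one decision to the session file, creating it if needed."""
    session_file.parent.mkdir(parents=True, exist_ok=True)
    with session_file.open("a", encoding="utf-8") as handle:
        handle.write(f"- {decision}\n")


def load_decisions(session_file: Path) -> list[str]:
    """Read back recorded decisions; an absent file means no memory yet."""
    if not session_file.exists():
        return []
    lines = session_file.read_text(encoding="utf-8").splitlines()
    return [line[2:] for line in lines if line.startswith("- ")]


# Demo in a temporary directory standing in for the repo root.
with tempfile.TemporaryDirectory() as repo:
    session = Path(repo) / ".claude" / "tasks" / "session-current.md"
    record_decision(session, "Use Zod for form validation")
    record_decision(session, "Keep admin flows out of scope")
    decisions = load_decisions(session)
print(decisions)
```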

6. Tools

Tools are how reasoning reaches the real world. Claude Code shipped with the basics: Read, Write, Edit, Bash, and MCP for outside services.

Skills added something different. Claude can run an executable script without loading its implementation into the context. That is the MCP-S CLI idea: Claude follows a protocol, and the internals stay invisible.

Example: a documentation-research skill built on Context7 MCP:

# .claude/skills/documentation-research/SKILL.md
 
---
 
name: documentation-research
description: Fetch library docs using Context7 API
 
---
 
## When to Use
 
User needs current documentation for any library
 
## Workflow
 
1. Resolve library ID via Context7
2. Fetch relevant documentation
3. Apply to current task
 
## Tools Available
 
- mcp__context7__resolve-library-id
- mcp__context7__get-library-docs

Claude reaches the MCP tools through the skill interface. Protocol-driven, context-efficient, no source reading required.
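A rough way to see why this is context efficient: the frontmatter is tiny and can stay loaded, while the body only needs to arrive when the skill fires. The sketch below uses a hand-rolled `key: value` parser, not a real YAML library, and `split_skill` is a hypothetical helper rather than Claude Code's loader.

```python
def split_skill(skill_md: str) -> tuple[dict[str, str], str]:
    """Separate the frontmatter (cheap to keep loaded) from the body (loaded on demand)."""
    lines = skill_md.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, skill_md
    meta: dict[str, str] = {}
    for index, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            body = "\n".join(lines[index + 1:]).strip()
            return meta, body
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return {}, skill_md  # no closing delimiter: treat the whole file as body


SKILL_MD = """---
name: documentation-research
description: Fetch library docs using Context7 API
---

## Workflow

1. Resolve library ID via Context7
2. Fetch relevant documentation
3. Apply to current task
"""

meta, body = split_skill(SKILL_MD)
print(meta["name"], len(body))
```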

Real Examples That Make the Difference Obvious

The easiest way to understand context engineering is to compare the same task with bad context and engineered context.

Example 1: Security triage

Bad version:

claude "check if this auth flow is secure"

That prompt is too wide. Claude has no threat model, no system boundary, and no clue which code matters.

Engineered version:

claude "Review the password reset flow for account-takeover risk.

Scope:
- src/auth/reset.ts
- app/api/reset-password/route.ts
- middleware/session.ts

Focus:
- token generation and expiry
- user enumeration
- rate limiting
- replay risk

Output:
1. concrete issues
2. exploit path
3. exact fix
4. regression test plan"

Same model. Different result. The second version gives Claude a system boundary, an attack lens, and an output contract. That is context engineering.

Example 2: Large refactor

Bad version:

claude "migrate our forms to the new validation layer"

Engineered version:

claude "Migrate signup + billing forms from ad-hoc validation to Zod.

Read first:
- docs/forms/validation-plan.md
- components/forms/*

Do not touch:
- admin flows
- onboarding wizard

Definition of done:
- shared schema extracted
- client + server validation aligned
- error copy preserved
- tests updated for changed messages"

The difference is not polished wording. It is controlled scope. Claude now knows what to read, what not to touch, and what counts as finished.

Example 3: Content production pipeline

Bad version:

claude "write a post about Claude Code hooks"

Engineered version:

claude "Write a hooks article for technical readers evaluating Claude Code.

Use:
- existing hooks-guide.mdx
- permission-hook-guide.mdx
- session-lifecycle-hooks.mdx

Must include:
- one production workflow
- one failure mode
- one copy-paste config example
- internal links to the three related guides

Avoid:
- generic 'AI changes everything' framing
- repeating definitions already covered in the linked pages"

Now Claude is not writing into a vacuum. It is writing into a real content system with known neighbors, overlap constraints, and quality requirements.
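All three engineered versions share one shape: a task line followed by named sections. If you write these often, a small builder keeps the shape consistent. This is a sketch; `build_brief` and its section names are illustrative, not a Claude Code API.

```python
def build_brief(task: str, **sections: list[str]) -> str:
    """Render a task plus named sections (scope, focus, output...) as one prompt."""
    parts = [task.strip()]
    for name, items in sections.items():
        if not items:
            continue  # skip empty sections rather than emit a bare heading
        heading = name.replace("_", " ").capitalize() + ":"
        parts.append("\n".join([heading, *(f"- {item}" for item in items)]))
    return "\n\n".join(parts)


brief = build_brief(
    "Review the password reset flow for account-takeover risk.",
    scope=["src/auth/reset.ts", "middleware/session.ts"],
    focus=["token generation and expiry", "rate limiting"],
    do_not_touch=[],
    output=["concrete issues", "exact fix"],
)
print(brief)
```

The resulting string can be passed to `claude` as the prompt. An empty section is dropped, so the brief never carries a heading with nothing under it.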

Where Context Engineering Pays Off Most

The payoff is highest when the task is either high-risk or high-ambiguity:

Task Type	Why Context Engineering Matters
Security reviews	You need clear scope, threat model, and evidence thresholds
Production bug fixes	Too much unrelated history pulls the model toward the wrong root cause
Migrations	Boundary control matters more than raw intelligence
Long-running agent workflows	Context rot compounds over many steps
SEO / content systems	Models need overlap control so pages do not cannibalize each other

When the task is cheap, fuzzy, and reversible, sloppy context is survivable. When the task is expensive or risky, sloppy context is what creates fake confidence.

Implementing the Framework

Today: audit your CLAUDE.md. Is it laid out for retrieval? Are the patterns you care about somewhere Claude can find them?

This week: build Skills for the workflows you repeat. Each skill is a guard against context confusion, because expertise loads on demand.

Ongoing: watch for the four failure modes. The moment Claude repeats old mistakes or ignores what you said, contamination has set in. Start fresh.
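For the CLAUDE.md audit, a quick size check per section is a reasonable starting point. The sketch below counts characters under each `## ` heading. The 500-character limit and the sample file are arbitrary assumptions, so tune them to your own repo.

```python
def section_sizes(markdown: str) -> dict[str, int]:
    """Map each '## ' heading to the character count of the text under it."""
    sizes: dict[str, int] = {}
    current = "(preamble)"
    for line in markdown.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sizes.setdefault(current, 0)
        else:
            sizes[current] = sizes.get(current, 0) + len(line)
    return sizes


def oversized(markdown: str, limit: int) -> list[str]:
    """List sections whose body exceeds `limit` characters."""
    return [name for name, size in section_sizes(markdown).items() if size > limit]


CLAUDE_MD = """# CLAUDE.md
## Architecture
- Next.js 15, App Router, TypeScript strict
## Patterns
""" + "- a very long inline pattern description\n" * 40

print(oversized(CLAUDE_MD, limit=500))
```

Sections that show up here are candidates for moving into a Skill or a referenced doc, so they load on demand instead of on every session.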

The Bottom Line

Reliable output is not a bigger-model problem. It is an information-flow problem.

The six pillars stack together:

Agents distribute context across specialists
Query augmentation refines messy input
Retrieval (via CLAUDE.md/Skills) surfaces relevant info
Prompting layers information strategically
Memory maintains state across sessions
Tools extend capabilities efficiently

Get these right, and Claude Code becomes a coding partner you can hand any idea to and trust with the build.

Next steps:

1M context window guide for the latest on GA availability and unified pricing
Context buffer management for understanding the 33K reservation
Context management for token optimization
Memory optimization for persistence strategies
Skills guide for on-demand expertise loading
Sub-agent design for multi-agent architectures
