Claude Code Pricing and Token Usage
Cut Claude Code costs by 40-70% with the right model per task, ccusage tracking, and a few environment variables most people never set.
Problem: Your Claude Code bill keeps climbing, you keep bumping into usage limits, and you are not sure which plan tier actually fits your workflow. The right model choice and a little tracking can cut costs by 40-70%.
Quick Win: Install ccusage and see exactly where your tokens are going:
npm install -g @ryoppippi/ccusage
ccusage dailyThis shows today's token spend and a cost breakdown right away.
Claude Code Pricing
Claude Code needs at least a Pro subscription ($20/month). The free tier has no terminal access.
Claude Pro ($20/month). 5x the free limits, Sonnet access, roughly 45 messages per 5-hour window. Best for learning and hobby projects.
Claude Max 5x ($100/month). 5x Pro limits (~225 messages per 5hr), generous Opus access. Best for full-time developers.
Claude Max 20x ($200/month). 20x Pro limits (~900 messages per 5hr), full Opus access. Best for heavy daily use and complex engineering.
API pay-per-use. Sonnet: $3/$15 per million input/output tokens. Opus: $15/$75 per million tokens. Best for predictable high-volume work.
Commands That Cut Costs
Model switching with /model
Switch based on task complexity:
/model sonnet # Default for 80% of tasks
/model opus # Complex architecture decisions onlyRule: start every session on Sonnet. Only switch to Opus when you need deep analysis or a big refactor.
Context control
/compact # Compress conversation when context gets long
/clear # Start fresh for unrelated tasksLong chats spend more tokens on every new message. Run /compact when Claude starts to lose the thread, and /clear when you move to a different kind of work.
Plan mode (Shift+Tab)
Press Shift+Tab twice in the terminal to enter plan mode before an expensive operation. Planning first saves money on rework. Claude sketches the approach before it writes code, so you catch issues early.
Track Your Usage
Watch your spend with ccusage reports:
ccusage daily # Daily breakdown (default)
ccusage monthly # Monthly aggregation
ccusage blocks --live # Real-time 5-hour billing windows
ccusage daily --breakdown # Per-model cost breakdownFilter by date range when you are chasing a spike:
ccusage daily --since 20250101 --until 20250131
Cost-Saving Patterns
Specific prompts beat vague ones. Compare:
# Expensive (wastes tokens on clarification)
claude "make this better"
# Efficient (immediate results)
claude "optimize readability in src/auth.js - extract constants, add error handling"Batch related tasks to use context well:
claude "update error handling in auth.js, user.js, and api.js"
Watch for expensive habits:
- Long debugging sessions. Break them into smaller, focused requests.
- Repeated explanations. Save them in CLAUDE.md.
- Full codebase reviews. Target specific files instead.
Environment Variables for Cost Control
Model switching is one lever. A few environment variables give you direct control over token spend.
Cut non-essential calls
# Suppress background model calls that aren't critical to your task
export DISABLE_NON_ESSENTIAL_MODEL_CALLS=1This turns off model calls used for non-critical features like suggestions and tips. Your core workflow is untouched, but background token use drops.
Disable cost warnings
# Suppress cost warning messages in the CLI
export DISABLE_COST_WARNINGS=1Useful once you have a budget set and don't want interruptions. Not recommended until you have a baseline from ccusage.
Prompt caching controls
Claude Code uses prompt caching by default to cut cost and latency. If you need to turn it off for debugging or benchmarking:
# Disable prompt caching globally
export DISABLE_PROMPT_CACHING=1
# Or disable per-model
export DISABLE_PROMPT_CACHING_HAIKU=1
export DISABLE_PROMPT_CACHING_SONNET=1
export DISABLE_PROMPT_CACHING_OPUS=1The global setting overrides the per-model settings. Keep caching on for production, it cuts costs a lot on repeated context.
The opusplan strategy
If you want Opus-level reasoning without Opus-level bills, the opusplan model alias does a hybrid:
claude --model opusplan
With opusplan, Claude runs Opus during plan mode for reasoning and architecture calls, then switches to Sonnet for the code generation and implementation. You get Opus quality where it counts (planning) without paying Opus rates for every line of code.
This is one of the most effective cost moves if you use planning mode regularly.
When Things Go Wrong
Getting close to the limit? Switch models and compact:
/model sonnet
/compactHit a rate limit? Wait for the hourly reset, batch requests instead of firing them rapidly, or move up a plan tier.
Related Pages
- Install ccusage and run
ccusage daily --breakdown - Context management for less token waste
- Model selection for your workflow
- Troubleshooting tips to avoid expensive debugging
Track weekly and adjust from the data. Most developers cut costs 40-70% with these moves.
Stop configuring. Start building.