Claude Code Pricing and Token Usage

Problem: Your Claude Code bill keeps climbing, you keep bumping into usage limits, and you are not sure which plan tier actually fits your workflow. The right model choice and a little tracking can cut costs by 40-70%.

Quick Win: Install ccusage and see exactly where your tokens are going:

npm install -g ccusage
ccusage daily

This shows today's token spend and a cost breakdown right away.

Claude Code Pricing

Claude Code needs at least a Pro subscription ($20/month) or pay-per-use API access. The free tier has no terminal access.

Claude Pro ($20/month). 5x the free limits, Sonnet access, roughly 45 messages per 5-hour window. Best for learning and hobby projects.

Claude Max 5x ($100/month). 5x Pro limits (~225 messages per 5hr), generous Opus access. Best for full-time developers.

Claude Max 20x ($200/month). 20x Pro limits (~900 messages per 5hr), full Opus access. Best for heavy daily use and complex engineering.

API pay-per-use. Sonnet: $3/$15 per million input/output tokens. Opus: $15/$75 per million tokens. Best for predictable high-volume work.
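
To weigh pay-per-use against a flat subscription, it can help to price out a typical month. The following is a minimal sketch, not an official calculator: estimate_cost is a hypothetical helper, the per-million rates are the ones quoted above, the token counts are made-up inputs, and prompt-caching discounts are ignored.

```shell
# Hypothetical helper: estimate API cost in dollars from token counts.
# Usage: estimate_cost INPUT_TOKENS OUTPUT_TOKENS INPUT_RATE OUTPUT_RATE
# Rates are dollars per million tokens (3 and 15 for Sonnet, 15 and 75 for Opus,
# as quoted above).
estimate_cost() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_rate="$3" -v out_rate="$4" \
    'BEGIN { printf "%.2f\n", (in_tok * in_rate + out_tok * out_rate) / 1000000 }'
}

# Illustrative month: 20M input tokens and 2M output tokens (made-up figures).
estimate_cost 20000000 2000000 3 15    # Sonnet rates
estimate_cost 20000000 2000000 15 75   # Opus rates
```

With these made-up volumes, the sketch prints 90.00 for Sonnet and 450.00 for Opus, which are the figures to set against the $100 and $200 Max tiers.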

Commands That Cut Costs

Model switching with /model

Switch based on task complexity:

/model sonnet   # Default for 80% of tasks
/model opus     # Complex architecture decisions only

Rule: start every session on Sonnet. Only switch to Opus when you need deep analysis or a big refactor.

Context control

/compact    # Compress conversation when context gets long
/clear      # Start fresh for unrelated tasks

Long chats spend more tokens on every new message. Run /compact when Claude starts to lose the thread, and /clear when you move to a different kind of work.
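
A rough way to see why long chats get expensive: if the whole history travels with each new message, the input grows on every turn and the running total grows roughly quadratically. The sketch below assumes a fixed number of new tokens per turn and ignores prompt caching. The helper name cumulative_input_tokens and the 2,000-token figure are hypothetical, chosen only for illustration.

```shell
# Hypothetical helper: total input tokens sent over one session, assuming
# every turn adds TOKENS_PER_TURN and resends all earlier turns.
# Usage: cumulative_input_tokens TURNS TOKENS_PER_TURN
# The body runs in a subshell so its variables do not leak into your shell.
cumulative_input_tokens() (
  turns=$1
  per_turn=$2
  total=0
  turn=1
  while [ "$turn" -le "$turns" ]; do
    # Turn N carries N * per_turn tokens of context.
    total=$(( total + turn * per_turn ))
    turn=$(( turn + 1 ))
  done
  echo "$total"
)

cumulative_input_tokens 10 2000   # one 10-turn session
cumulative_input_tokens 5 2000    # one of two 5-turn sessions
```

Under these assumptions a single 10-turn session sends 110,000 input tokens, while two 5-turn sessions separated by /clear send 30,000 each, or 60,000 in total.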

Plan mode (Shift+Tab)

Press Shift+Tab twice in the terminal to enter plan mode before an expensive operation. Planning first saves money on rework. Claude sketches the approach before it writes code, so you catch issues early.

Track Your Usage

Watch your spend with ccusage reports:

ccusage daily              # Daily breakdown (default)
ccusage monthly            # Monthly aggregation
ccusage blocks --live      # Real-time 5-hour billing windows
ccusage daily --breakdown  # Per-model cost breakdown

Filter by date range when you are chasing a spike:

ccusage daily --since 20250101 --until 20250131
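
Typing dates by hand gets tedious when the spike is recent. The sketch below computes a YYYYMMDD date a given number of days back and prints the matching command. days_ago is a hypothetical helper, and only the --since flag shown above is used. It tries GNU date -d first and falls back to the BSD/macOS date -v form, so treat the portability as an assumption to verify on your system.

```shell
# Hypothetical helper: print the date N days ago as YYYYMMDD.
# Tries GNU date syntax first, then the BSD/macOS equivalent.
days_ago() {
  date -d "$1 days ago" +%Y%m%d 2>/dev/null || date -v-"$1"d +%Y%m%d
}

# Print (rather than run) the command so it can be checked first.
echo "ccusage daily --since $(days_ago 7)"
```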

Cost-Saving Patterns

Specific prompts beat vague ones. Compare:

# Expensive (wastes tokens on clarification)
claude "make this better"
 
# Efficient (immediate results)
claude "optimize readability in src/auth.js - extract constants, add error handling"

Batch related tasks to use context well:

claude "update error handling in auth.js, user.js, and api.js"

Watch for expensive habits:

Long debugging sessions. Break them into smaller, focused requests.
Repeated explanations. Save them in CLAUDE.md.
Full codebase reviews. Target specific files instead.

Environment Variables for Cost Control

Model switching is one lever. A few environment variables give you direct control over token spend.

Cut non-essential calls

# Suppress background model calls that aren't critical to your task
export DISABLE_NON_ESSENTIAL_MODEL_CALLS=1

This turns off model calls used for non-critical features like suggestions and tips. Your core workflow is untouched, but background token use drops.

Disable cost warnings

# Suppress cost warning messages in the CLI
export DISABLE_COST_WARNINGS=1

Useful once you have a budget set and don't want interruptions. Not recommended until you have a baseline from ccusage.

Prompt caching controls

Claude Code uses prompt caching by default to cut cost and latency. If you need to turn it off for debugging or benchmarking:

# Disable prompt caching globally
export DISABLE_PROMPT_CACHING=1
 
# Or disable per-model
export DISABLE_PROMPT_CACHING_HAIKU=1
export DISABLE_PROMPT_CACHING_SONNET=1
export DISABLE_PROMPT_CACHING_OPUS=1

The global setting overrides the per-model settings. Keep caching on for production, because it cuts costs substantially on repeated context.
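
Exporting any of the variables in this section affects every later session started from that shell. One alternative, sketched here on the assumption that the claude CLI is on your PATH, is to set a variable for a single invocation only. claude_lean is a hypothetical wrapper name, not part of Claude Code.

```shell
# Hypothetical wrapper: run one claude session with non-essential model calls
# suppressed, without exporting the variable to the rest of the shell.
claude_lean() {
  DISABLE_NON_ESSENTIAL_MODEL_CALLS=1 claude "$@"
}

# Usage: pass the same arguments you would give claude itself, e.g.
# claude_lean --model opusplan
```

Plain claude invocations in the same shell keep their default behavior, which makes it easier to compare the two in ccusage.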

The opusplan strategy

If you want Opus-level reasoning without Opus-level bills, the opusplan model alias does a hybrid:

claude --model opusplan

With opusplan, Claude runs Opus during plan mode for reasoning and architecture calls, then switches to Sonnet for the code generation and implementation. You get Opus quality where it counts (planning) without paying Opus rates for every line of code.

This is one of the most effective cost moves if you use planning mode regularly.

When Things Go Wrong

Getting close to the limit? Switch models and compact:

/model sonnet
/compact

Hit a rate limit? Wait for the 5-hour window to reset, batch requests instead of firing them in rapid succession, or move up a plan tier.

Next Steps

Install ccusage and run ccusage daily --breakdown
Context management for less token waste
Model selection for your workflow
Troubleshooting tips to avoid expensive debugging

Track weekly and adjust from the data. Most developers cut costs 40-70% with these moves.
