Build This Now
Build This Now
What Is Claude Code?Claude Code InstallationClaude Code Native InstallerYour First Claude Code Project
Claude Code Best PracticesClaude Code on a VPSGit IntegrationClaude Code ReviewClaude Code WorktreesClaude Code Remote ControlClaude Code ChannelsClaude Code Scheduled TasksClaude Code PermissionsClaude Code Auto ModeFeedback LoopsTodo WorkflowsClaude Code TasksProject TemplatesClaude Code Pricing and Token Usage
Get Build This Now
speedy_devvkoen_salo
Blog/Handbook/Workflow/Claude Code Pricing and Token Usage

Claude Code Pricing and Token Usage

Cut Claude Code costs by 40-70% with the right model per task, ccusage tracking, and a few environment variables most people never set.

Problem: Your Claude Code bill keeps climbing, you keep bumping into usage limits, and you are not sure which plan tier actually fits your workflow. The right model choice and a little tracking can cut costs by 40-70%.

Quick Win: Install ccusage and see exactly where your tokens are going:

npm install -g @ryoppippi/ccusage
ccusage daily

This shows today's token spend and a cost breakdown right away.

Claude Code Pricing

Claude Code needs at least a Pro subscription ($20/month). The free tier has no terminal access.

Claude Pro ($20/month). 5x the free limits, Sonnet access, roughly 45 messages per 5-hour window. Best for learning and hobby projects.

Claude Max 5x ($100/month). 5x Pro limits (~225 messages per 5hr), generous Opus access. Best for full-time developers.

Claude Max 20x ($200/month). 20x Pro limits (~900 messages per 5hr), full Opus access. Best for heavy daily use and complex engineering.

API pay-per-use. Sonnet: $3/$15 per million input/output tokens. Opus: $15/$75 per million tokens. Best for predictable high-volume work.

Commands That Cut Costs

Model switching with /model

Switch based on task complexity:

/model sonnet   # Default for 80% of tasks
/model opus     # Complex architecture decisions only

Rule: start every session on Sonnet. Only switch to Opus when you need deep analysis or a big refactor.

Context control

/compact    # Compress conversation when context gets long
/clear      # Start fresh for unrelated tasks

Long chats spend more tokens on every new message. Run /compact when Claude starts to lose the thread, and /clear when you move to a different kind of work.

Plan mode (Shift+Tab)

Press Shift+Tab twice in the terminal to enter plan mode before an expensive operation. Planning first saves money on rework. Claude sketches the approach before it writes code, so you catch issues early.

Track Your Usage

Watch your spend with ccusage reports:

ccusage daily              # Daily breakdown (default)
ccusage monthly            # Monthly aggregation
ccusage blocks --live      # Real-time 5-hour billing windows
ccusage daily --breakdown  # Per-model cost breakdown

Filter by date range when you are chasing a spike:

ccusage daily --since 20250101 --until 20250131

Cost-Saving Patterns

Specific prompts beat vague ones. Compare:

# Expensive (wastes tokens on clarification)
claude "make this better"
 
# Efficient (immediate results)
claude "optimize readability in src/auth.js - extract constants, add error handling"

Batch related tasks to use context well:

claude "update error handling in auth.js, user.js, and api.js"

Watch for expensive habits:

  • Long debugging sessions. Break them into smaller, focused requests.
  • Repeated explanations. Save them in CLAUDE.md.
  • Full codebase reviews. Target specific files instead.

Environment Variables for Cost Control

Model switching is one lever. A few environment variables give you direct control over token spend.

Cut non-essential calls

# Suppress background model calls that aren't critical to your task
export DISABLE_NON_ESSENTIAL_MODEL_CALLS=1

This turns off model calls used for non-critical features like suggestions and tips. Your core workflow is untouched, but background token use drops.

Disable cost warnings

# Suppress cost warning messages in the CLI
export DISABLE_COST_WARNINGS=1

Useful once you have a budget set and don't want interruptions. Not recommended until you have a baseline from ccusage.

Prompt caching controls

Claude Code uses prompt caching by default to cut cost and latency. If you need to turn it off for debugging or benchmarking:

# Disable prompt caching globally
export DISABLE_PROMPT_CACHING=1
 
# Or disable per-model
export DISABLE_PROMPT_CACHING_HAIKU=1
export DISABLE_PROMPT_CACHING_SONNET=1
export DISABLE_PROMPT_CACHING_OPUS=1

The global setting overrides the per-model settings. Keep caching on for production, it cuts costs a lot on repeated context.

The opusplan strategy

If you want Opus-level reasoning without Opus-level bills, the opusplan model alias does a hybrid:

claude --model opusplan

With opusplan, Claude runs Opus during plan mode for reasoning and architecture calls, then switches to Sonnet for the code generation and implementation. You get Opus quality where it counts (planning) without paying Opus rates for every line of code.

This is one of the most effective cost moves if you use planning mode regularly.

When Things Go Wrong

Getting close to the limit? Switch models and compact:

/model sonnet
/compact

Hit a rate limit? Wait for the hourly reset, batch requests instead of firing them rapidly, or move up a plan tier.

Related Pages

  • Install ccusage and run ccusage daily --breakdown
  • Context management for less token waste
  • Model selection for your workflow
  • Troubleshooting tips to avoid expensive debugging

Track weekly and adjust from the data. Most developers cut costs 40-70% with these moves.

More in this guide

  • Agent Fundamentals
    Five ways to build specialized agents in Claude Code, from sub-agents to .claude/agents/ definitions to perspective prompts.
  • Agent Patterns
    Orchestrator, fan-out, validation chain, specialist routing, progressive refinement, and watchdog. Six ways to wire sub-agents in Claude Code.
  • Agent Teams Best Practices
    Battle-tested patterns for Claude Code agent teams. Troubleshooting, limitations, plan mode quirks, and fixes shipped from v2.1.33 through v2.1.45.
  • Agent Teams Controls
    Stop your agent team lead from grabbing implementation work. Configure delegate mode, plan approval, hooks, and CLAUDE.md for teams.
  • Agent Teams Prompt Templates
    Ten tested Agent Teams prompts for Claude Code. Code review, debugging, feature builds, architecture calls, and campaign research. Paste and go.

Stop configuring. Start building.

SaaS builder templates with AI orchestration.

Get Build This Now

Project Templates

Claude Code reads starter repos inside your terminal and stores the conventions it learns in CLAUDE.md for every session after.

Deep Thinking Techniques

Thinking trigger phrases like `think harder` and `ultrathink` push Claude Code into extended reasoning without switching models.

On this page

Claude Code Pricing
Commands That Cut Costs
Model switching with /model
Context control
Plan mode (Shift+Tab)
Track Your Usage
Cost-Saving Patterns
Environment Variables for Cost Control
Cut non-essential calls
Disable cost warnings
Prompt caching controls
The opusplan strategy
When Things Go Wrong
Related Pages

Stop configuring. Start building.

SaaS builder templates with AI orchestration.

Get Build This Now