Run Claude Code on a Cheaper Model: DeepSeek and GLM Cost Arbitrage

You can run Claude Code on a cheaper model by setting four environment variables that redirect it to DeepSeek V4's Anthropic-compatible endpoint, cutting your API bill 7 to 17x with near-identical coding quality. The catch: image analysis, MCP tool forwarding, task budgets, and the /ultrareview command silently stop working, and any setup using the old DeepSeek model names breaks for good on July 24, 2026. GLM-5.1 from ZhipuAI is a strong second option at about 8x cheaper output, and almost no guide mentions it.

Hören Sie auf zu konfigurieren. Fangen Sie an zu bauen.

SaaS-Builder-Vorlagen mit KI-Orchestrierung.

The short version

Claude Code talks to a model over Anthropic's API format. DeepSeek and GLM both publish endpoints that speak the same format. So you can swap the brain without changing the tool. You keep the Claude Code interface (the same agentic loop, the same file edits) but the requests go to a model that charges a fraction of the price.

Why this matters to you: if you code with Claude Code every day and your bill is creeping up, a swap to DeepSeek V4-Flash for routine work can take a heavy day from tens of dollars down to single digits. The trade is not raw quality. On standard coding benchmarks the gap is tiny. The trade is a handful of Claude-specific features that quietly go dark.

The four-variable setup

You do not need a proxy, a LiteLLM wrapper, or any local server. DeepSeek serves a native Anthropic Messages API endpoint, so Claude Code connects directly. Set these four environment variables:

ANTHROPIC_BASE_URL = https://api.deepseek.com/anthropic
ANTHROPIC_AUTH_TOKEN = your DeepSeek API key
ANTHROPIC_MODEL = deepseek-v4-pro (your main model)
ANTHROPIC_SMALL_FAST_MODEL = deepseek-v4-flash (the cheap model Claude Code uses for small background tasks)

Restart Claude Code and it now routes to DeepSeek. That is the whole setup.

For GLM, the same four variables apply. Point ANTHROPIC_BASE_URL at ZhipuAI's Anthropic-compatible endpoint and set the model names to the GLM-5.1 identifiers from your provider dashboard.

If you want a cleaner way to manage these per-project, you can keep them in a project file alongside your other Claude Code config like your CLAUDE.md instructions, so each repo can route to a different model.

The July 24, 2026 deadline (read this before you copy any old tutorial)

This is the trap that will bite people. The old DeepSeek model aliases deepseek-chat and deepseek-reasoner are retired on July 24, 2026 at 15:59 UTC. The correct identifiers are now deepseek-v4-pro and deepseek-v4-flash.

Any tutorial written before April 2026 uses the old names. After the deadline, a config with the old names does not throw a clear error. It can fail mid-session, leaving you confused about why your agent stopped responding. If you set this up, use the new names from day one.

Is the quality actually there?

Yes, on standard coding tasks. The benchmark numbers are close enough that you would not feel a difference on most work:

DeepSeek V4-Pro scores 80.6% on SWE-bench Verified (a test of real GitHub bug fixes). Claude Opus 4.6 scores 80.8%. That is a rounding error.
On Terminal-Bench 2.0 (live terminal tasks), DeepSeek V4-Pro actually beats Claude: 67.9% versus 65.4%.
GLM-5.1 reportedly hits 94.6% parity on Z.ai's own internal Claude Code coding eval.

So the cost saving is not you accepting worse code. On the common stuff (fixing bugs, writing functions, running commands) the cheaper models keep up.

The comparison table

Model	Input ($/M tokens)	Output ($/M tokens)	SWE-bench Verified	Native Anthropic endpoint	Known Claude Code breakages	Best for
DeepSeek V4-Flash	~$0.14	low	n/a (small model)	Yes	Same as V4-Pro	Cheap small/fast background tasks
DeepSeek V4-Pro	~$0.435 to $1.74 (reported)	comparable to GLM	80.6%	Yes	Images, MCP, task budgets, /ultrareview	Heavy agentic coding on a budget
GLM-5.1 (ZhipuAI)	~$1.00	~$3.20	n/a (94.6% parity on Z.ai eval)	Yes	Same Anthropic-feature gaps	Balanced cost and quality
Claude Opus 4.6 (baseline)	$5.00	higher	80.8%	Yes (it is Anthropic)	None	Full feature set, images, MCP

Prices vary by source and tier, so treat the dollar figures as reported ranges, not fixed quotes. DeepSeek V4-Flash is the cheapest overall. GLM-5.1 sits in the middle.

What silently breaks

This is the honest part. When you point Claude Code at a non-Anthropic endpoint, four features stop working and none of them throws a loud error. They just produce nothing:

Image analysis. Paste a screenshot and the model cannot read it.
MCP tool forwarding. Connections to MCP servers (the plugins that give Claude Code extra tools) stop being passed through.
Task budgets. The spend-limit controls do not apply.
/ultrareview. The deep review command does not run.

Two more confirmed issues:

Streaming tool calls break on NVIDIA NIM. The agent emits a tool call and a stop signal but never records the tool result, so the loop halts. This is a confirmed open bug with no published workaround.
An auth bug in Claude Code versions 2.1.128 through 2.1.131 triggers rejection on DeepSeek's endpoint because of a metadata.user_id serialization issue. Use a version outside that range.

If your workflow leans on images, MCP servers, or those review commands, the swap will quietly cost you those. For plain code editing and terminal work, you lose nothing.

The cost math

One documented case: a day with 412 tool calls of heavy agentic work cost under $7 through DeepSeek. For comparison, Claude Max 5x is $100 per month, which breaks even at $3.33 per day of equivalent API usage. If your daily usage clears that line, the cheaper model is the obvious move.

A practical split many builders use: route everyday coding to DeepSeek V4-Flash or V4-Pro, and keep a Claude subscription on standby for the work that needs images, MCP, or the review commands. You get the cheap bill for 90% of the work and full features when you actually need them.

Where Build This Now fits

If you want a setup that already ships production apps (not snippets) on top of Claude Code, the Build This Now Code Kit is a $29 one-time build system: a production SaaS skeleton with auth, Stripe payments, and PostgreSQL with row-level security on every table, plus the agents, skills, and hooks wired in. It runs on Claude Code, so the model-routing trick above applies the same way. Route the cheap model for bulk work, keep Claude for the feature-heavy steps.

FAQ

How do I use DeepSeek with Claude Code?

Set ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic, ANTHROPIC_AUTH_TOKEN to your DeepSeek key, ANTHROPIC_MODEL=deepseek-v4-pro, and ANTHROPIC_SMALL_FAST_MODEL=deepseek-v4-flash. No proxy needed. Restart Claude Code and it routes to DeepSeek.

Does DeepSeek work as well as Claude for coding?

On SWE-bench Verified, DeepSeek V4-Pro (80.6%) and Claude Opus 4.6 (80.8%) are effectively tied. DeepSeek V4-Pro actually outperforms Claude on Terminal-Bench 2.0 live terminal tasks (67.9% versus 65.4%). For standard coding, quality is not the trade-off.

What breaks when you use Claude Code with DeepSeek?

Image analysis, MCP tool forwarding, task budgets, and /ultrareview stop working silently. Streaming tool calls also break on NVIDIA NIM with no workaround. Claude Code versions 2.1.128 through 2.1.131 trigger auth errors due to a metadata serialization bug, so avoid those versions.

Is GLM cheaper than DeepSeek for Claude Code?

GLM-5.1 input pricing (~~$1.00/M) sits between DeepSeek V4-Flash ($0.14/M) and V4-Pro (reported $0.435 to $1.74/M). GLM-5.1 output (~~$3.20/M) is comparable to V4-Pro. DeepSeek V4-Flash is the cheapest option overall.

Will old DeepSeek tutorials still work?

No. The model names deepseek-chat and deepseek-reasoner retire on July 24, 2026 at 15:59 UTC. Use deepseek-v4-pro and deepseek-v4-flash. Old configs can fail mid-session after that date without a clear error.

Run Claude Code on a Cheaper Model: DeepSeek and GLM Cost Arbitrage

On this page