Build This Now
speedy_devv, koen_salo

Run Claude Code on a Cheaper Model: DeepSeek and GLM Cost Arbitrage

Point Claude Code at DeepSeek or GLM to cut your bill 7 to 17x. Setup, what breaks, and the July 2026 model-name change explained.

Stop configuring. Start building.

SaaS builder templates with AI orchestration.

Published Jun 22, 2026 | 7 min read | Real Builds hub

You can run Claude Code on a cheaper model by setting four environment variables that redirect it to DeepSeek V4's Anthropic-compatible endpoint, cutting your API bill 7 to 17x with near-identical coding quality. The catch: image analysis, MCP tool forwarding, task budgets, and the /ultrareview command silently stop working, and any setup using the old DeepSeek model names breaks for good on July 24, 2026. GLM-5.1 from ZhipuAI is a strong second option at about 8x cheaper output, and almost no guide mentions it.




The short version

Claude Code talks to a model over Anthropic's API format. DeepSeek and GLM both publish endpoints that speak the same format. So you can swap the brain without changing the tool. You keep the Claude Code interface (the same agentic loop, the same file edits) but the requests go to a model that charges a fraction of the price.

Why this matters to you: if you code with Claude Code every day and your bill is creeping up, a swap to DeepSeek V4-Flash for routine work can take a heavy day from tens of dollars down to single digits. The trade is not raw quality. On standard coding benchmarks the gap is tiny. The trade is a handful of Claude-specific features that quietly go dark.

The four-variable setup

You do not need a proxy, a LiteLLM wrapper, or any local server. DeepSeek serves a native Anthropic Messages API endpoint, so Claude Code connects directly. Set these four environment variables:

  1. ANTHROPIC_BASE_URL = https://api.deepseek.com/anthropic
  2. ANTHROPIC_AUTH_TOKEN = your DeepSeek API key
  3. ANTHROPIC_MODEL = deepseek-v4-pro (your main model)
  4. ANTHROPIC_SMALL_FAST_MODEL = deepseek-v4-flash (the cheap model Claude Code uses for small background tasks)

Restart Claude Code and it now routes to DeepSeek. That is the whole setup.
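If you would rather not export the variables globally, here is a minimal sketch that applies them to a single launch. It assumes the Claude Code CLI is installed and on your PATH as `claude`; the helper names `deepseek_env` and `launch_claude_code` are made up for this example.

```python
import os
import subprocess

DEEPSEEK_BASE_URL = "https://api.deepseek.com/anthropic"


def deepseek_env(api_key, base_env=None):
    """Return a copy of the environment with the four routing variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({
        "ANTHROPIC_BASE_URL": DEEPSEEK_BASE_URL,
        "ANTHROPIC_AUTH_TOKEN": api_key,
        "ANTHROPIC_MODEL": "deepseek-v4-pro",
        "ANTHROPIC_SMALL_FAST_MODEL": "deepseek-v4-flash",
    })
    return env


def launch_claude_code(api_key):
    """Start Claude Code with DeepSeek routing for this one process only."""
    # Assumption: the CLI binary is called `claude` and is on PATH.
    return subprocess.run(["claude"], env=deepseek_env(api_key), check=False)
```

Because the variables live only in the child process, your normal shell still points at Anthropic. That makes it easy to switch back for a session that needs images or MCP.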

For GLM, the same four variables apply. Point ANTHROPIC_BASE_URL at ZhipuAI's Anthropic-compatible endpoint and set the model names to the GLM-5.1 identifiers from your provider dashboard.

If you want a cleaner way to manage these per-project, you can keep them in a project file alongside your other Claude Code config like your CLAUDE.md instructions, so each repo can route to a different model.
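One way to do the per-project version is sketched below. It rests on an assumption you should check against the current Claude Code docs: that Claude Code reads an `env` map from `.claude/settings.local.json` inside the repo. The function name `write_project_routing` is hypothetical. The API key is deliberately left out of the file so it stays in your shell environment.

```python
import json
from pathlib import Path


def write_project_routing(repo_dir, base_url, main_model, small_model):
    """Write routing variables (not the API key) into a per-repo settings file."""
    # Assumption: Claude Code picks up an "env" map from this file.
    settings_path = Path(repo_dir) / ".claude" / "settings.local.json"
    settings_path.parent.mkdir(parents=True, exist_ok=True)

    # Keep whatever settings the file already holds.
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text())

    env = settings.setdefault("env", {})
    env.update({
        "ANTHROPIC_BASE_URL": base_url,
        "ANTHROPIC_MODEL": main_model,
        "ANTHROPIC_SMALL_FAST_MODEL": small_model,
    })
    settings_path.write_text(json.dumps(settings, indent=2) + "\n")
    return settings_path
```

Calling it with the DeepSeek URL in one repo and a GLM endpoint in another gives each repo its own routing without touching your shell profile.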

The July 24, 2026 deadline (read this before you copy any old tutorial)

This is the trap that will bite people. The old DeepSeek model aliases deepseek-chat and deepseek-reasoner will be retired on July 24, 2026 at 15:59 UTC. The identifiers to use now are deepseek-v4-pro and deepseek-v4-flash.

Any tutorial written before April 2026 uses the old names. After the deadline, a config with the old names does not throw a clear error. It can fail mid-session, leaving you confused about why your agent stopped responding. If you set this up, use the new names from day one.
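Since the failure is not loud, a small pre-flight check can catch a stale config before the deadline does. This is a sketch, not an official tool: the function name `find_retired_model_names` is invented here, and it only looks at the two model variables from the setup above.

```python
import os

RETIRED_MODEL_NAMES = {"deepseek-chat", "deepseek-reasoner"}
MODEL_VARIABLES = ("ANTHROPIC_MODEL", "ANTHROPIC_SMALL_FAST_MODEL")


def find_retired_model_names(env):
    """Return (variable, value) pairs that still use a retired alias."""
    return [
        (name, env[name])
        for name in MODEL_VARIABLES
        if env.get(name) in RETIRED_MODEL_NAMES
    ]


if __name__ == "__main__":
    # Check the current shell environment and report any stale names.
    for variable, value in find_retired_model_names(os.environ):
        print(f"{variable} is set to retired name {value!r}")
```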

Is the quality actually there?

Yes, on standard coding tasks. The benchmark numbers are close enough that you would not feel a difference on most work:

  • DeepSeek V4-Pro scores 80.6% on SWE-bench Verified (a test of real GitHub bug fixes). Claude Opus 4.6 scores 80.8%. That is a rounding error.
  • On Terminal-Bench 2.0 (live terminal tasks), DeepSeek V4-Pro actually beats Claude: 67.9% versus 65.4%.
  • GLM-5.1 reportedly hits 94.6% parity on Z.ai's own internal Claude Code coding eval.

So the cost saving is not you accepting worse code. On the common stuff (fixing bugs, writing functions, running commands) the cheaper models keep up.

The comparison table

| Model | Input ($/M tokens) | Output ($/M tokens) | SWE-bench Verified | Native Anthropic endpoint | Known Claude Code breakages | Best for |
|---|---|---|---|---|---|---|
| DeepSeek V4-Flash | ~$0.14 | low | n/a (small model) | Yes | Same as V4-Pro | Cheap small/fast background tasks |
| DeepSeek V4-Pro | ~$0.435 to $1.74 (reported) | comparable to GLM | 80.6% | Yes | Images, MCP, task budgets, /ultrareview | Heavy agentic coding on a budget |
| GLM-5.1 (ZhipuAI) | ~$1.00 | ~$3.20 | n/a (94.6% parity on Z.ai eval) | Yes | Same Anthropic-feature gaps | Balanced cost and quality |
| Claude Opus 4.6 (baseline) | $5.00 | higher | 80.8% | Yes (it is Anthropic) | None | Full feature set, images, MCP |

Prices vary by source and tier, so treat the dollar figures as reported ranges, not fixed quotes. DeepSeek V4-Flash is the cheapest overall. GLM-5.1 sits in the middle.

What silently breaks

This is the honest part. When you point Claude Code at a non-Anthropic endpoint, four features stop working and none of them throws a loud error. They just produce nothing:

  • Image analysis. Paste a screenshot and the model cannot read it.
  • MCP tool forwarding. Connections to MCP servers (the plugins that give Claude Code extra tools) stop being passed through.
  • Task budgets. The spend-limit controls do not apply.
  • /ultrareview. The deep review command does not run.

Two more confirmed issues:

  • Streaming tool calls break when the model is served through NVIDIA NIM. The agent emits a tool call and a stop signal but never records the tool result, so the loop halts. This is a confirmed open bug with no published workaround.
  • An auth bug in Claude Code versions 2.1.128 through 2.1.131 triggers rejection on DeepSeek's endpoint because of a metadata.user_id serialization issue. Use a version outside that range.
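The version range in the last bullet is easy to check in code. A minimal sketch, assuming you already have the installed version as a plain `major.minor.patch` string (how you obtain it is up to you); `parse_version` and `is_affected` are names made up for this example.

```python
AFFECTED_FIRST = (2, 1, 128)
AFFECTED_LAST = (2, 1, 131)


def parse_version(text):
    """Turn '2.1.129' into (2, 1, 129). Assumes three numeric parts."""
    major, minor, patch = text.strip().split(".")
    return int(major), int(minor), int(patch)


def is_affected(version_text):
    """True if the version falls in the range with the DeepSeek auth bug."""
    # Tuples compare element by element, so this is an inclusive range check.
    return AFFECTED_FIRST <= parse_version(version_text) <= AFFECTED_LAST
```

Comparing tuples rather than strings matters here: as text, "2.1.99" would sort after "2.1.128".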

If your workflow leans on images, MCP servers, or those review commands, the swap will quietly cost you those. For plain code editing and terminal work, you lose nothing.

The cost math

One documented case: a day with 412 tool calls of heavy agentic work cost under $7 through DeepSeek. For comparison, Claude Max 5x is $100 per month, which breaks even at about $3.33 per day of equivalent API usage ($100 divided by 30 days). If your daily usage clears that line, the cheaper model is the obvious move.
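The arithmetic behind that break-even line, as a small sketch you can rerun with your own numbers. The 30-day month is an assumption, and both function names are invented for this example.

```python
def daily_break_even(monthly_price, days_in_month=30):
    """Daily API spend at which a flat monthly plan costs the same."""
    return monthly_price / days_in_month


def token_cost(tokens, price_per_million):
    """Dollar cost of `tokens` at a price quoted per million tokens."""
    return tokens / 1_000_000 * price_per_million


print(round(daily_break_even(100), 2))  # 3.33
```

`token_cost` takes any per-million price from the comparison table, so you can estimate a day's spend from your own token counts instead of relying on someone else's documented case.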

A practical split many builders use: route everyday coding to DeepSeek V4-Flash or V4-Pro, and keep a Claude subscription on standby for the work that needs images, MCP, or the review commands. You get the cheap bill for 90% of the work and full features when you actually need them.
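That split can be written down as a simple rule. This is a hypothetical sketch: the feature labels and the `pick_backend` function are made up for illustration and are not part of Claude Code. The feature set mirrors the four silent breakages listed earlier.

```python
# Features the article reports as silently failing on non-Anthropic endpoints.
CLAUDE_ONLY_FEATURES = {"images", "mcp", "task_budgets", "ultrareview"}


def pick_backend(required_features):
    """Return 'claude' if the task needs a Claude-only feature, else 'deepseek'."""
    needed = set(required_features) & CLAUDE_ONLY_FEATURES
    return "claude" if needed else "deepseek"
```

For example, `pick_backend(["terminal"])` returns "deepseek", while `pick_backend(["images", "terminal"])` returns "claude".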

Where Build This Now fits

If you want a setup that already ships production apps (not snippets) on top of Claude Code, the Build This Now Code Kit is a $29 one-time build system: a production SaaS skeleton with auth, Stripe payments, and PostgreSQL with row-level security on every table, plus the agents, skills, and hooks wired in. It runs on Claude Code, so the model-routing trick above applies the same way. Route bulk work to the cheap model and keep Claude for the feature-heavy steps.

FAQ

How do I use DeepSeek with Claude Code?

Set ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic, ANTHROPIC_AUTH_TOKEN to your DeepSeek key, ANTHROPIC_MODEL=deepseek-v4-pro, and ANTHROPIC_SMALL_FAST_MODEL=deepseek-v4-flash. No proxy needed. Restart Claude Code and it routes to DeepSeek.

Does DeepSeek work as well as Claude for coding?

On SWE-bench Verified, DeepSeek V4-Pro (80.6%) and Claude Opus 4.6 (80.8%) are effectively tied. DeepSeek V4-Pro actually outperforms Claude on Terminal-Bench 2.0 live terminal tasks (67.9% versus 65.4%). For standard coding, quality is not the trade-off.

What breaks when you use Claude Code with DeepSeek?

Image analysis, MCP tool forwarding, task budgets, and /ultrareview stop working silently. Streaming tool calls also break on NVIDIA NIM with no workaround. Claude Code versions 2.1.128 through 2.1.131 trigger auth errors due to a metadata serialization bug, so avoid those versions.

Is GLM cheaper than DeepSeek for Claude Code?

GLM-5.1 input pricing ($1.00/M) sits between DeepSeek V4-Flash ($0.14/M) and V4-Pro (reported $0.435 to $1.74/M). GLM-5.1 output ($3.20/M) is comparable to V4-Pro. DeepSeek V4-Flash is the cheapest option overall.

Will old DeepSeek tutorials still work?

No. The model names deepseek-chat and deepseek-reasoner retire on July 24, 2026 at 15:59 UTC. Use deepseek-v4-pro and deepseek-v4-flash. Old configs can fail mid-session after that date without a clear error.
