GitHub Spec Kit: Spec-Driven Development That Kills Vibe Coding
A hands-on guide to GitHub Spec Kit and the specify CLI. Install it, run the spec → plan → tasks → implement loop, and learn how to bring the same discipline to Claude Code.
Stop configuring. Start building.
SaaS templates with AI orchestration.
Problem: You throw "add photo sharing to my app" at an AI agent and it guesses at a thousand unstated requirements, dumps 1,200 lines of code, and you spend the next hour reviewing a thing you never specified. That is vibe coding, and it does not scale past a toy.
Quick Win: Install GitHub's Spec Kit and the specify CLI in one line, then let it scaffold a spec-driven workflow for your agent.
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
In 2026 the whole industry swung against vibe coding. GitHub open-sourced Spec Kit in September 2025, and by mid-2026 it had crossed 90,000 GitHub stars with 30-plus agent integrations, making spec-driven development the fastest-growing approach to AI coding (Visual Studio Magazine, May 12 2026). This is the hands-on version: install the tool, run the loop, read the artifacts, then port the same discipline to Claude Code.
What is spec-driven development?
Spec-driven development (SDD) treats the specification as the source of truth and the code as a downstream artifact generated to match it. You write what you want and why, the agent writes how, and you verify at each step instead of reviewing one giant code dump at the end.
The phrase GitHub uses is blunt: "Specifications don't serve code, code serves specifications." Specs become living, executable artifacts that evolve with the project rather than stale docs nobody reads (The GitHub Blog, Sep 2 2025).
The reason this beats vibe coding is structural. A vague prompt forces the model to guess at potentially thousands of unstated requirements. A spec removes the guessing. You already made the decisions, so the agent stops making them for you.
What is GitHub Spec Kit?
GitHub Spec Kit is an open-source toolkit that bolts a spec-driven workflow onto your existing AI coding agent. It ships two things: the specify CLI that bootstraps a project, and a set of templates plus slash commands that drive the spec, plan, tasks, and implement phases.
Spec Kit works with 30-plus agents across CLI tools and IDEs, including Claude Code, GitHub Copilot, Gemini CLI, Cursor CLI, and Codex CLI (github/spec-kit README). It is not an IDE and not a model. It is a discipline layer that sits on top of whatever agent you already use.
You need uv (the Python package manager from Astral) installed first. Then install the CLI from the repo:
uv tool install specify-cli --from git+https://github.com/github/spec-kit.git
To scaffold a new project, run specify init and point it at your agent. The --ai flag selects the integration:
specify init my-photo-app --ai claude
If you prefer a one-shot run without a permanent install, uvx works too:
uvx --from git+https://github.com/github/spec-kit.git specify init my-photo-app
specify init drops a .specify/ directory of templates and scripts into your project and registers the /speckit.* slash commands with your chosen agent. From here, everything happens inside the agent session.
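Since the install assumes uv is already present, a small preflight check can save a confusing error. The sketch below is not part of Spec Kit: `missing_tools` is a hypothetical helper that relies only on Python's standard `shutil.which` to see whether `uv` and `specify` resolve on PATH.

```python
import shutil
from typing import Callable, Iterable, Optional


def missing_tools(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the names in `tools` that cannot be found on PATH.

    `which` is injectable so the check can be exercised without
    touching the real environment.
    """
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    # "uv" is needed for the install itself; "specify" only appears
    # on PATH after `uv tool install` has run.
    missing = missing_tools(["uv", "specify"])
    if missing:
        print("Not on PATH: " + ", ".join(missing))
    else:
        print("uv and specify are both on PATH")
```

If only `specify` is reported missing, the uvx one-shot form is still available, since it does not require a permanent install.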
How the Spec Kit workflow works
The workflow has four core phases: specify, plan, tasks, implement. The first three each produce a Markdown artifact that feeds the next, and implement turns the task list into code. Each artifact gives your agent structured context instead of ad-hoc prompts.
Before you touch the four core phases, set the ground rules once with the constitution command. This writes immutable high-level principles (testing standards, architecture constraints, what the project will never do) that every later phase has to respect. Run it inside your agent session:
/speckit.constitution Use TypeScript everywhere. Every feature ships with tests. No feature touches the database without a migration. Prefer boring, well-supported libraries over clever ones.
Phase one is specify. You describe the what and the why in plain language, no tech stack. The agent generates a detailed spec focused on user journeys and what success looks like:
/speckit.specify Build an app that lets a user organize photos into albums. Users can create albums, drag photos between albums, and view albums as a grid. No login for the MVP. Photos are stored locally. Success means a user can sort 200 photos into 5 albums in under two minutes.
That command writes a spec file under specs/ containing user stories in "As a user, I want..." form, acceptance criteria, and edge cases. Read it. This is the cheapest place to catch a misunderstanding, before any code exists.
Phase two is plan. Now you bring the tech stack and constraints, and the agent produces the technical design: architecture, data model, API surface, and supporting files:
/speckit.plan Use Next.js 16 and React 19 on the frontend, Tailwind v4 for styling. Store photo metadata in PostgreSQL via Supabase. Keep image files in object storage. Drag-and-drop with dnd-kit. No external auth service for the MVP.
Phase three is tasks. The agent breaks the spec and plan into small, reviewable chunks that each solve one specific piece, so you can validate them in isolation:
/speckit.tasks
This generates a tasks.md of numbered, ordered TODOs, each linked back to a requirement. You now have a build order, not a wish.
Phase four is implement. The agent executes the tasks and you review focused diffs that map to specific tasks instead of a single thousand-line dump:
/speckit.implement
For anything beyond a quick experiment, treat three more commands as quality gates. Run /speckit.clarify after specify to force the agent to surface underspecified areas and ask you about them. Run /speckit.checklist to generate validation checklists. Run /speckit.analyze after tasks to cross-check the spec, plan, and tasks for consistency before you implement. The full production order looks like this:
/speckit.constitution
/speckit.specify
/speckit.clarify
/speckit.checklist
/speckit.plan
/speckit.tasks
/speckit.analyze
/speckit.implement
The lean path for a throwaway prototype is just the four core commands (Spec Kit Quickstart).
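The production order above can be written down as data and checked mechanically. This is a minimal sketch, not something Spec Kit ships: `FULL_ORDER`, `CORE_PHASES`, and `check_sequence` are hypothetical names, and the check assumes a single pass through the loop, so deliberately re-running an earlier command to revise an artifact would be flagged.

```python
# Command order as listed in the article (without the "/speckit." prefix).
FULL_ORDER = [
    "constitution",
    "specify",
    "clarify",
    "checklist",
    "plan",
    "tasks",
    "analyze",
    "implement",
]

# The four core phases that even the lean path keeps.
CORE_PHASES = {"specify", "plan", "tasks", "implement"}


def check_sequence(commands: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the order is consistent."""
    problems = [f"unknown command: {c}" for c in commands if c not in FULL_ORDER]

    known = [c for c in commands if c in FULL_ORDER]
    # Compare each command with the one run right before it.
    for earlier, later in zip(known, known[1:]):
        if FULL_ORDER.index(later) < FULL_ORDER.index(earlier):
            problems.append(f"{later} should run before {earlier}")

    # Report skipped core phases in their canonical order.
    for phase in FULL_ORDER:
        if phase in CORE_PHASES and phase not in known:
            problems.append(f"missing core phase: {phase}")

    return problems
```

Calling `check_sequence(["specify", "plan", "tasks", "implement"])` returns an empty list for the lean path, while swapping plan and tasks returns `["plan should run before tasks"]`.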
What does a Spec Kit spec actually look like?
A spec file is plain Markdown the agent generated from your /speckit.specify prompt, structured so the plan phase can consume it. Here is the shape of the photo-album spec, trimmed to the load-bearing parts:
# Feature: Photo Albums
## User Stories
- As a user, I want to create a named album so I can group related photos.
- As a user, I want to drag a photo from one album to another so I can re-sort.
- As a user, I want to view an album as a grid so I can scan it quickly.
## Acceptance Criteria
- Creating an album with an empty name is rejected with an inline error.
- A photo always belongs to exactly one album; moving it removes it from the old one.
- The grid renders 200 photos without dropping below 50fps on a mid-range laptop.
## Edge Cases
- Album deleted while it contains photos: photos move to an "Unsorted" album, never deleted.
- Duplicate album names allowed, distinguished by creation timestamp.
- Drag interrupted (escape key or drop outside a target): photo returns to origin.
## Out of Scope (MVP)
- Authentication and multi-user accounts.
- Cloud sync.
- Sharing albums via link.
Notice what is missing: no React, no SQL, no file paths. The spec is about behavior. The "Out of Scope" block is the most underrated part, because it is the line that stops the agent from inventing auth you never asked for.
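Because the spec is plain Markdown, the structure shown above is easy to lint before the plan phase. The following is a sketch under assumptions: the required section names are taken from the example spec, not from an official Spec Kit schema, and `missing_sections` is a hypothetical helper.

```python
import re

# Section names taken from the example spec above (an assumption,
# not an official Spec Kit requirement).
REQUIRED_SECTIONS = [
    "User Stories",
    "Acceptance Criteria",
    "Edge Cases",
    "Out of Scope",
]

# Matches second-level headings only, e.g. "## Edge Cases".
H2_PATTERN = re.compile(r"^##[ \t]+(.+)$", flags=re.MULTILINE)


def missing_sections(spec_markdown: str) -> list[str]:
    """Return the required section names that have no `## ` heading."""
    headings = [m.group(1).strip().lower() for m in H2_PATTERN.finditer(spec_markdown)]
    return [
        name
        for name in REQUIRED_SECTIONS
        # Prefix match so "Out of Scope (MVP)" satisfies "Out of Scope".
        if not any(heading.startswith(name.lower()) for heading in headings)
    ]
```

A spec that lacks an out-of-scope list is the case worth catching, since that section is what keeps the agent from adding features you never requested.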
Spec-driven development with Claude Code
You do not strictly need Spec Kit to run SDD. You can encode the same four phases as plain Markdown files and drive them with Claude Code directly, which is what a lot of teams settle on once the loop is muscle memory.
The minimal version: keep a SPEC.md per feature, write the requirements and acceptance criteria yourself or with Claude in plan mode, then start a fresh session to implement. Claude Code's first-attempt success rate on small-to-medium PRs sits around one-third without detailed guidance, and a reviewed spec is what closes that gap. Create the spec in one session:
# In a Claude Code session, draft and review the spec interactively
claude
> Read the photo-album feature idea below and write a SPEC.md with user stories,
> acceptance criteria, edge cases, and an explicit out-of-scope list. Ask me
> clarifying questions before writing anything.
Once SPEC.md is reviewed and committed, open a clean session and point Claude at it so the implementation context is the spec, not your chat history:
claude
> Read SPEC.md and tasks.md. Implement task 1 only. Stop and show me the diff
> before moving on. Do not touch anything outside the files task 1 lists.
If you want the Spec Kit slash commands inside Claude Code without scaffolding a separate repo, specify init --ai claude registers them directly. Either way the principle is identical: write the spec, review it while it is cheap, then let the agent build against a target it cannot misread. This is the same Spec → Plan → Tasks → Implement loop Martin Fowler documented across Kiro, Spec Kit, and Tessl, where "your role isn't just to steer, it's to verify" at each phase (martinfowler.com, Oct 15 2025).
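The one-task-at-a-time prompt above can be generated from tasks.md rather than typed by hand. This sketch assumes a checkbox-style task line such as `- [ ] 2. Add create-album form`. That format is an assumption for illustration only, so adjust the pattern to whatever your tasks.md actually contains. `next_open_task` and `build_prompt` are hypothetical helpers.

```python
import re
from typing import Optional

# Assumed task line shape (not confirmed by the article):
#   "- [x] 1. Done task"   or   "- [ ] 2. Open task"
TASK_LINE = re.compile(r"^- \[( |x|X)\] (\d+)\.\s+(.*)$")


def next_open_task(tasks_markdown: str) -> Optional[int]:
    """Return the number of the first unchecked task, or None if all are done."""
    for line in tasks_markdown.splitlines():
        match = TASK_LINE.match(line.strip())
        if match and match.group(1) == " ":
            return int(match.group(2))
    return None


def build_prompt(task_number: int) -> str:
    """Build the single-task prompt from the article for a given task number."""
    return (
        f"Read SPEC.md and tasks.md. Implement task {task_number} only. "
        "Stop and show me the diff before moving on. "
        f"Do not touch anything outside the files task {task_number} lists."
    )
```

Pairing the two keeps each fresh session scoped to exactly one task: look up the next open number, paste the generated prompt, review the diff, then tick the box.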
This is also exactly how the Build This Now pipeline works. The /discover command researches your market and writes the product spec, /mvp-spec turns it into per-feature specs with build order, and /mvp-build runs the implementation. It is spec-driven development with a 9-agent build team and quality gates attached, so you get the discipline of Spec Kit plus the agents that actually ship the feature.
Spec Kit vs vibe coding: which wins?
For anything you intend to keep, spec-driven wins, and it is not close. Vibe coding is faster for the first 20 minutes and slower for every hour after, because you pay the review and rework cost at the end when changes are expensive.
The difference shows up in three places. Review surface: you read a focused diff per task instead of a 1,200-line wall. Misunderstandings: caught in a Markdown spec for free, instead of in shipped code for a refactor. Drift: the spec is a contract the agent re-reads, so it stops reinventing scope between sessions.
Vibe coding still has a place. Throwaway scripts, one-off data munging, and genuine exploration where you do not yet know what you want are all fine to vibe. The moment a human other than you will touch the code, or the moment it touches money or user data, write the spec first.
Frequently asked questions
Do I need Spec Kit, or can I just write a SPEC.md by hand?
Both work. Spec Kit gives you templates, the constitution and analyze quality gates, and slash commands that standardize the artifacts across a team. A hand-written SPEC.md driven through Claude Code gets you 80% of the value with zero setup. Start by hand, adopt Spec Kit when you have more than one person or more than a handful of features.
Which AI agents does Spec Kit support?
Spec Kit works with 30-plus agents as of mid-2026, including Claude Code, GitHub Copilot, Gemini CLI, Cursor CLI, and Codex CLI. You pick yours with the --ai flag at init time, for example specify init my-app --ai claude. Run specify integration list to see every integration your installed version supports (github/spec-kit README).
Isn't spec-driven development just waterfall with extra steps?
No. Waterfall locks the spec up front and forbids change. SDD treats the spec as a living artifact you revise between phases and between sessions. The loop is tight: specify, review, plan, review, implement, review. You are iterating on a cheap Markdown file instead of on expensive code.
How is this different from prompt engineering?
Prompt engineering optimizes a single message. SDD optimizes the whole context the agent builds against. The spec, plan, and tasks files persist across sessions, so a fresh agent starts with the same structured target instead of an empty window and your memory of what you meant.
What tools other than Spec Kit are doing spec-driven development in 2026?
The category exploded. AWS Kiro is an IDE built around the spec lifecycle, Tessl is a framework and registry that treats specs like npm packages, and the BMAD-METHOD is a community framework. Claude Code Skills also encode repeatable spec-driven workflows as reusable slash commands without a full Spec Kit setup (martinfowler.com, Oct 15 2025).
Posted by @speedy_devv