Thursday, April 23, 2026

Spec-Driven Development: What It Is and Why AI Makes It Non-Negotiable

Jonas

Beginner, 7 min read

Requirements Engineering, Behavior-Driven Development, Spec-Driven Development

Writing down what software should do before you build it is not a new idea. It's how software engineering was taught. It's what requirements documents, design specs, and BDD scenarios have always been about. The idea that you should know what you're building before you build it has been around longer than most of the frameworks we argue about on the internet.

So when someone says "Spec-Driven Development," the skeptical developer response is: isn't that just... development? Wasn't that always the plan?

Yes. And also no. Because there's a version of specs that guides humans, and there's a version that commands machines. Those are not the same thing. And if you're using AI coding tools (and at this point, most of us are), you're operating in the second world whether you've named it that or not.

Writing it down first is not a revolutionary idea

Spec-Driven Development (SDD) is a development approach where a written specification (not a vague idea, not a Slack thread, not a comment in a PR) is the source of truth for what gets built. You write it before implementation starts. You update it when the requirements change. You use it to validate the output.

That's it. That's the whole thing. If you've been running proper BDD for years, you're most of the way there. Most teams aren't.

The spec can take different forms. A plain prose description of a feature with explicit edge cases. A user story with Gherkin acceptance criteria. A feature file. A structured requirements document. The format is less important than the habit: there is a written artifact that says precisely what the software should do, and the code is an answer to that artifact, not the other way around.

In practice, most teams do a broken version of this. They have a ticket. The ticket has a title and maybe a one-paragraph description. Someone writes the code. In review, someone asks "wait, what happens when the user is not logged in?" and that question gets answered in a comment thread instead of in the spec. Then the next ticket has the same problem. And the one after that.

SDD is what happens when you close that loop before it opens.

The spec used to be a promise - now it's a command

Here's where something actually changed.

When a human developer reads a spec, they fill in gaps. They make reasonable assumptions. They ping a product owner, read between the lines, and use professional judgment to handle the cases the spec didn't cover. The spec is a starting point for a human who brings context.

An AI coding agent doesn't do any of that.

Give an AI agent an underspecified task and it will complete it - confidently, quickly, and wrong in ways you won't notice until the demo. It won't ask what happens when the email address already exists. It won't flag that two requirements are contradictory. It will build exactly what you described, including all the ambiguities, and ship you something that looks correct from the outside and fails at the edges.

The spec used to be a promise between you and a developer. Now it's a command to a system that executes literally. The developer who filled in your gaps is not in the loop anymore. Your spec is.

There's a deeper point. Dan North, who originated BDD, has argued for years that code is a liability, not an asset. It costs money to write, money to maintain, and money to delete. What has value is the business behaviour the code implements. The spec is the direct expression of that behaviour. The code is just one possible implementation of it, and increasingly a disposable one.

If an AI agent rewrites your login module tomorrow in a different framework, you haven't lost anything that matters, as long as the spec is accurate. Code without a spec is a black box. A spec without code is still the full description of what your software is supposed to do - you can rebuild from it. The reverse isn't true.

That's why SDD matters now, specifically. Not because writing specs is suddenly a new idea. Because you've handed implementation to a machine you don't fully control, and the spec is the one thing you can still own.

Figure: the SDD workflow. The Spec node is the pivot because everything after it is derived from it - and everything that changes should flow back to it. A spec that stops being updated is just documentation.

Three things SDD is not

It is not waterfall.

There's a version of this criticism worth taking seriously. If you write a spec, hand it to a developer, and never touch it again, that's waterfall. The spec becomes a gate instead of a source of truth, and by the end of the sprint it describes a system that no longer exists.

The difference isn't the format. It's whether you treat the spec as a contract you wrote once or as the primary artifact you maintain. A living spec that gets updated when requirements change is the opposite of waterfall - it's what waterfall was always trying to be and never managed. That feedback loop from Validate back to Spec isn't optional. Skip it, and you've built a gate. Maintain it, and you've built a working system.

A three-scenario Gherkin feature file for a single story takes ten minutes to write. The waterfall comparison only holds if you're writing 200 pages of requirements before anyone talks to a user. That is not what this is.

It is not a document that gets written once.

This is the mistake that kills SDD in practice. Teams write a spec, use it to kick off development, and then stop maintaining it. By the end of the sprint the spec and the code have diverged. The next developer who reads the spec is now reading a historical artifact, not a description of how the system works.

A spec is only useful when it's accurate. If the spec and the code have diverged, you're not doing SDD. You're doing documentation theater.
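One cheap way to notice divergence early is to compare the scenario titles in a feature file against the names of the tests that exist. The sketch below assumes a naming convention that is not from any particular tool (each scenario maps to a `test_<slug>` function), and the helpers `scenario_names`, `slug`, and `find_drift` are hypothetical. It only catches drift in names, not in behaviour, so treat it as a tripwire rather than proof of accuracy.

```python
import re


def scenario_names(feature_text: str) -> list[str]:
    """Return the title of every `Scenario:` line in a Gherkin feature."""
    pattern = r"^[ \t]*Scenario:[ \t]*(.+)$"
    return [m.group(1).strip() for m in re.finditer(pattern, feature_text, re.MULTILINE)]


def slug(title: str) -> str:
    """Map a scenario title to the test name we assume it should have."""
    return "test_" + re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")


def find_drift(feature_text: str, test_names: set[str]) -> dict[str, list[str]]:
    """Compare the spec's scenarios with the tests that actually exist."""
    expected = {slug(title) for title in scenario_names(feature_text)}
    return {
        # Scenarios in the spec with no matching test.
        "missing_tests": sorted(expected - test_names),
        # Tests whose behaviour is no longer described by the spec.
        "orphan_tests": sorted(test_names - expected),
    }
```

Run in CI, a non-empty `missing_tests` or `orphan_tests` list is a prompt to update either the spec or the code before they drift further apart.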

It is not "no more prompting."

SDD doesn't eliminate the back-and-forth between you and your AI coding tool. It structures it. Instead of a chat history that only lives in one session, you have a persistent artifact that any agent or team member can read and act on. The prompting still happens. The difference is you're prompting from a source of truth, not from memory.

This is also why a good spec is the best form of a Memory Bank. That's the pattern where you maintain persistent context files (MEMORY.md, AGENTS.md, or equivalent) that your coding agent loads at the start of every session. Most teams treat those files as a scratchpad for tech stack decisions and coding conventions. A well-maintained spec is more valuable: it carries the why behind every behaviour, not just the how of the implementation. An agent that reads a spec before writing code has more relevant context than one that reads a list of rules.
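As a rough illustration of treating the spec as part of that persistent context, here is a minimal sketch that gathers the memory files and every feature file into one block of text to hand to an agent at session start. The `specs/` directory, the `.feature` extension, and the function name `build_session_context` are assumptions made for this example; real agents each have their own way of loading context.

```python
from pathlib import Path


def build_session_context(root: Path) -> str:
    """Concatenate memory files and spec files into one context string."""
    parts = []

    # Conventions and stack decisions first, if the files exist.
    for name in ("AGENTS.md", "MEMORY.md"):
        memory_file = root / name
        if memory_file.is_file():
            parts.append(f"# {name}\n{memory_file.read_text(encoding='utf-8')}")

    # Then every spec, in a stable order (the specs/ layout is an assumption).
    specs_dir = root / "specs"
    if specs_dir.is_dir():
        for spec in sorted(specs_dir.glob("**/*.feature")):
            label = spec.relative_to(root).as_posix()
            parts.append(f"# {label}\n{spec.read_text(encoding='utf-8')}")

    return "\n\n".join(parts)
```

The point of the design is that the same text is rebuilt from files on every session, so nothing important lives only in a chat history.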

What a spec actually looks like when it drives an agent

There's no settled standard for what a spec looks like in SDD. Most tools in this space - GitHub Spec Kit, Kiro, Tessl - use prose text. Natural language descriptions of requirements, user stories, constraints. The format is still evolving, and what works best probably depends on the type of feature and the tool you're using.

My take: for specs that drive coding agents, precision is the only thing that matters. And the most precise format for describing software behaviour I've seen is Gherkin: Given/When/Then scenarios. Not because of the BDD tooling around it, but because the structure forces you to specify a precondition, an action, and an expected outcome. That's it. No ambiguity left to fill in.
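To make that structural requirement concrete, here is a small sketch that checks whether a scenario names all three parts. The function `missing_steps` is hypothetical and deliberately naive: it only looks at the first word of each step line and does not handle the rest of Gherkin's syntax.

```python
REQUIRED_STEPS = ("Given", "When", "Then")


def missing_steps(scenario_lines: list[str]) -> list[str]:
    """Return which of Given/When/Then never appear as the first word of a step."""
    first_words = {line.split()[0] for line in scenario_lines if line.strip()}
    return [keyword for keyword in REQUIRED_STEPS if keyword not in first_words]


# A one-line requirement has an action but no precondition and no outcome.
vague = ["When the user logs in with email and password"]
print(missing_steps(vague))  # ['Given', 'Then']
```

Passing this check does not make a scenario good, but failing it is a reliable sign that something is still living in someone's head.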

You're building a login feature. You open your AI agent and write:

"Add login functionality. Users should be able to log in with their email and password."

The agent builds it. It works for the happy path. Users can log in. What the agent didn't know: accounts lock after five failed attempts, unverified email addresses should see a different error, and password reset lives in a separate flow that hasn't been built yet.

None of those constraints were in the prompt. They were in someone's head. Now they're missing from the code.

Here's the same requirement written as Gherkin scenarios:

Feature: User login

  Scenario: Successful login with verified account
    Given a user with a verified email address and a valid password
    When they submit the login form
    Then they are redirected to the dashboard

  Scenario: Login attempt with unverified email
    Given a user who has not verified their email
    When they submit the login form
    Then they see a message prompting them to check their inbox
    And they are not logged in

  Scenario: Account locked after failed attempts
    Given a user who has failed login 5 times
    When they attempt to log in again
    Then they see a message that their account is temporarily locked
    And they are shown a link to reset their password

That's three scenarios. Fifteen minutes of thinking. You feed that to your agent and it knows what to build, edges included. The agent didn't get smarter. You gave it better input.
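For a sense of what those scenarios pin down, here is one possible implementation, sketched in Python. It is not what any particular agent would produce. The names `User`, `LoginResult`, and `attempt_login`, the URL paths, and the message texts are all invented for illustration. Two behaviours are assumptions because the spec does not cover them: what a wrong password returns, and whether the lock check runs before the verification check. Those gaps are exactly the kind of thing the next scenario should settle.

```python
from dataclasses import dataclass
from typing import Optional

MAX_FAILED_ATTEMPTS = 5  # from "Account locked after failed attempts"


@dataclass
class User:
    email: str
    password: str  # plain text only to keep the sketch short; hash it in real code
    email_verified: bool = True
    failed_attempts: int = 0


@dataclass
class LoginResult:
    logged_in: bool
    redirect_to: Optional[str] = None
    message: Optional[str] = None
    reset_link: Optional[str] = None


def attempt_login(user: User, password: str) -> LoginResult:
    # Scenario: Account locked after failed attempts
    if user.failed_attempts >= MAX_FAILED_ATTEMPTS:
        return LoginResult(
            logged_in=False,
            message="Your account is temporarily locked.",
            reset_link="/reset-password",  # placeholder path, not from the spec
        )

    # Not covered by the spec: wrong-password behaviour is an assumption.
    if password != user.password:
        user.failed_attempts += 1
        return LoginResult(logged_in=False, message="Invalid email or password.")

    # Scenario: Login attempt with unverified email
    if not user.email_verified:
        return LoginResult(
            logged_in=False,
            message="Please check your inbox to verify your email.",
        )

    # Scenario: Successful login with verified account
    user.failed_attempts = 0
    return LoginResult(logged_in=True, redirect_to="/dashboard")
```

Each branch that carries a "Scenario:" comment traces back to a line in the feature file. The one branch that does not is the part a reviewer should question.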

The variable that determines whether your agent builds the right thing

It's not the model. It's not the prompt style. It's not the framework.

It's the spec.

Every coding agent right now is a precision instrument. What you give it is what it builds. The teams shipping clean features with AI are not using better agents; they're using better specs. They've done the thinking before the agent starts, not during code review after it finishes.

SDD is the practice of treating that thinking as a first-class artifact. Write it down, keep it accurate, and use it as the single source of truth between what you intend and what gets built. This isn't a methodology invented for AI. It's a discipline that AI makes unavoidable.

SDD tools assume you arrive with a spec. If you need help producing that spec, whether from a product idea, a user journey, or a domain conversation, that's what Speclr is for.
