BDD doesn't fail at the scenario. It fails before the first Given.

I've watched a team spend three months doing BDD correctly - and still deliver the wrong product.
Three Amigos sessions, every sprint. Gherkin scenarios written before implementation. Developers, testers, and a product owner in the same room, arguing through edge cases before anyone opened an IDE. The process was tight. The coverage was solid. The demos were clean.
Then the steering committee reviewed the quarter's output and said: this isn't what we needed.
The team hadn't built it wrong. They'd built the wrong thing. With higher precision than most teams build the right thing.
That's a failure mode BDD doesn't protect you from - and the instinct is to blame the process. The process wasn't the problem.
What BDD actually does - including the part people forget
BDD is not just a test format. That's worth saying clearly, because most developer-facing descriptions of BDD lead with Gherkin and work backwards.
The actual practice of BDD includes a discovery phase. Three Amigos - developer, tester, product owner - sit down before a story enters implementation. Example Mapping is a structured session for working through rules, examples, and open questions. These are real collaborative techniques. They surface assumptions. They force concrete behavior before abstract features. Done well, they prevent a specific class of failure: the kind where a developer implements one interpretation and a PO expected another.
BDD's discovery is legitimate and underused. Most teams that adopt BDD adopt the Gherkin syntax and skip the discovery entirely. If you're doing that, you're using BDD wrong - and the arguments in this article don't apply to you yet. Get the Three Amigos sessions working first.
This matters because any honest argument about BDD's limits has to start by acknowledging what BDD does well. BDD closes the gap between intent and implementation at the story level. That gap is real, and closing it matters.
But there's another gap. And BDD doesn't see it.
The boundary BDD operates inside
Here's the thing Three Amigos can't fix: the scope of the session itself.
That's not a critique of the method. Three Amigos is explicitly designed to help teams build the right product - and it delivers on that at the story level. But there's a difference between "right behavior for this feature" and "right scope for this system." Three Amigos answers the first question well. The second has to be answered before anyone schedules the session.
Three Amigos works inside a frame that was set before anyone arrived in the room. Which users exist, what their goals are, where the system boundary falls, which stories are on the board - all of that was settled, or assumed to be settled, before the first Three Amigos invite went out.
BDD refines behavior within that frame. It doesn't examine the frame.
When a story says "user can filter projects by owner," a Three Amigos session can produce excellent scenarios: active projects, archived projects, no matching results, permission edge cases. The scenarios will be precise and complete.
What Three Amigos won't ask: should this filtering feature exist? Who is this "owner" exactly - the person who created the project, or an explicitly assigned role? If it's assigned, who assigns it, and is there a permission model for that? Is filtering by owner what this user actually needs, or a proxy for something else?
Those aren't scenario questions. They're structural questions about the system. They belong to a different conversation, in a different phase, before any story was written.
Scenario: Filter projects by owner, active projects only
  Given I am logged in as a project member
  When I filter the project list by owner "Maria"
  Then I see only active projects where Maria is the assigned owner
This scenario is well-formed. It's testable. It's unambiguous. And if "assigned owner" turns out to be a concept that doesn't exist in your domain model - if everyone assumed owner meant something and nobody made it explicit - then this scenario is a precise description of behavior the system was never designed to support.
The scenario didn't fail. The frame failed.
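To make the ambiguity concrete, here is a minimal Python sketch of the two readings of "owner" from the questions above. Everything in it is hypothetical and invented for illustration: the `Project` fields, the sample data, and both filter functions. It is not the model of any real system, only a way to show that the same filter request returns different projects depending on which reading the team silently picked.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Project:
    name: str
    created_by: str        # reading 1: the owner is whoever created the project
    assigned_owner: str    # reading 2: the owner is an explicitly assigned role
    active: bool = True


# Hypothetical sample data, invented for this illustration.
PROJECTS = [
    Project("Billing revamp", created_by="Maria", assigned_owner="Jonas"),
    Project("Mobile app", created_by="Jonas", assigned_owner="Maria"),
    Project("Data export", created_by="Maria", assigned_owner="Maria"),
    Project("Legacy cleanup", created_by="Maria", assigned_owner="Maria", active=False),
]


def filter_by_creator(projects, person):
    """Active projects under reading 1 (owner = creator)."""
    return [p.name for p in projects if p.active and p.created_by == person]


def filter_by_assignee(projects, person):
    """Active projects under reading 2 (owner = assigned role)."""
    return [p.name for p in projects if p.active and p.assigned_owner == person]


by_creator = filter_by_creator(PROJECTS, "Maria")
by_assignee = filter_by_assignee(PROJECTS, "Maria")
print(by_creator)   # ['Billing revamp', 'Data export']
print(by_assignee)  # ['Mobile app', 'Data export']
```

Both functions would pass a scenario written by someone who holds the matching reading. Neither result is wrong as code. They are answers to two different questions, and the scenario text alone does not say which one was asked.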
What lives upstream
The questions BDD can't ask are not mysterious. They're standard requirements engineering questions:
Who are the actual users - with defined roles and attributes, not just assumed ones? What are they trying to accomplish at a strategic level, not a feature level? What does the system do, and what does it deliberately not do? What constraints apply before any feature is discussed?
These don't require a 200-page PRD. They require a structured conversation that happens before the backlog exists.
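As a rough sketch of how small that artifact can be, the four questions map onto a handful of fields. The `DiscoveryRecord` class, its field names, and the sample answers below are assumptions made up for this example, not the format of any particular tool. The only behavior it has is listing what is still unanswered.

```python
from dataclasses import dataclass, field


@dataclass
class DiscoveryRecord:
    """Hypothetical container for answers to the upstream questions."""
    user_roles: dict[str, str] = field(default_factory=dict)   # role -> definition
    strategic_goals: list[str] = field(default_factory=list)
    in_scope: list[str] = field(default_factory=list)
    out_of_scope: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)

    def open_questions(self) -> list[str]:
        """Return empty sections, then roles that are named but not defined."""
        gaps = [name for name, value in vars(self).items() if not value]
        gaps += [
            f"user_roles[{role!r}]"
            for role, definition in self.user_roles.items()
            if not definition.strip()
        ]
        return gaps


# Invented sample answers: "project owner" is named but never defined.
record = DiscoveryRecord(
    user_roles={
        "project member": "Can view projects in their workspace",
        "project owner": "",
    },
    strategic_goals=["See who is accountable for each project"],
    in_scope=["Filtering the project list"],
)
print(record.open_questions())
# ['out_of_scope', 'constraints', "user_roles['project owner']"]
```

The point of the sketch is the last line of output. A role that everyone uses and nobody has defined shows up as an open question before a single story references it.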
In the project I described at the start, the team was precise. The scenarios were excellent. But nobody had ever formally defined what a "project owner" was in the context of that system. It was assumed. The assumption was different for different stakeholders. Three Amigos sessions ran for a quarter without catching it, because every session started from a story that treated the concept as settled.
It wasn't caught in BDD's discovery phase because BDD's discovery phase doesn't question the vocabulary of the stories. It operates inside that vocabulary. The moment you write "Given a project with an assigned owner", BDD treats "assigned owner" as a given. It's not BDD's job to ask whether that entity is defined. That job belongs upstream.
Three Amigos defines behavior within a scope. Who defines the scope - and whether it's the right one - has to be answered upstream, before the first user story is written.
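One way to picture the upstream check is a glossary lookup that runs before scenarios are accepted. The sketch below is a toy under stated assumptions: the `GLOSSARY`, the hand-picked `WATCHED_TERMS`, and the `undefined_terms` helper are all hypothetical, and nothing here parses real Gherkin or belongs to any BDD framework. It only matches phrases in the step text of the scenario shown earlier.

```python
import re

# Hypothetical glossary agreed during discovery: term -> definition.
GLOSSARY = {
    "project member": "A user with access to a project",
    "active project": "A project that has not been archived",
}

# Domain terms the team wants defined before they may appear in a step.
# In this sketch they are listed by hand.
WATCHED_TERMS = ["project member", "active project", "assigned owner"]

SCENARIO_STEPS = [
    "Given I am logged in as a project member",
    'When I filter the project list by owner "Maria"',
    "Then I see only active projects where Maria is the assigned owner",
]


def undefined_terms(steps, watched_terms, glossary):
    """Return watched terms that appear in the steps but have no glossary entry."""
    text = " ".join(steps).lower()
    found = []
    for term in watched_terms:
        # Allow a trailing "s" so "active projects" matches "active project".
        pattern = r"\b" + re.escape(term.lower()) + r"s?\b"
        if re.search(pattern, text) and term not in glossary:
            found.append(term)
    return found


missing = undefined_terms(SCENARIO_STEPS, WATCHED_TERMS, GLOSSARY)
print(missing)  # ['assigned owner']
```

The check itself is trivial. What it needs is the glossary, and the glossary can only come from a conversation that happened before the scenario was written.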
The failure mode nobody names
There's a specific failure shape that comes from skipping the upstream work, and it's worth naming precisely.
The team is professional. The delivery is clean. The process was followed. And the project still fails - or lands something that doesn't match what the business needed - because strategic decisions were never made explicit.
The developers communicated. BDD was done correctly. The failure was simpler and harder to fix: scope, user definitions, system vocabulary - assumed, never established. And assumptions that live in different people's heads are different assumptions.
BDD makes this failure visible sooner than unstructured development does, and that matters. A Three Amigos session that hits a fundamental disagreement about what "owner" means in a multi-tenant system is catching something important. It's just catching it at the story level - mid-sprint, when the story is already on the board - instead of at the strategic level, where it could have been resolved before the entire backlog was built around the wrong model.
The cost difference is significant. A disagreement resolved in a discovery workshop can take an afternoon. The same disagreement surfaced in sprint three, after twenty stories have been written against the wrong domain model, can take a quarter to unwind.
The pattern: BDD catches behavior mismatches. It doesn't catch scope mismatches. Those have to be caught upstream - before the first story is written, before the first Given is typed.
Discovery is not a phase you skip to go faster
The counterargument I hear most often: we move too fast for upfront discovery. We'll discover as we go. BDD will surface the issues.
It will surface some of them - the ones at the behavior level, where two people in a Three Amigos session disagree about what happens when a form field is empty.
It won't surface the ones at the strategy level. Those don't show up as disagreements in a scenario review. They show up at the demo, when a stakeholder gives a politely confused response and nobody's quite sure why. They show up when the backlog has to be rebuilt. When a project was executed with real discipline and still landed in the wrong place.
Discovery isn't upfront work in the waterfall sense. It's a structured conversation about what you're building before you decide how to build it. It can be fast. It can be lean. Speclr does this as a guided interview - one session, structured output. Defined users, defined flows, defined vocabulary, defined scope. Gherkin scenarios then derive from that spec, instead of having to define the system while also describing its behavior.
BDD is a precision instrument. It needs a target. Discovery is how you set one.
A team running BDD without that upstream step is building the most accurate possible description of a system that might be solving the wrong problem. The scenarios pass. The tests are green. And you're still in the wrong forest - just with better maps.

