sdlcnext.com
Tags: spec-driven-development, AI, methodology, software-engineering

Spec-Driven Development: The Missing Link in AI Coding?

If unstructured prompting has plateaued at 10% productivity gains, the obvious next question is: what happens when you give AI better instructions? Spec-driven development is the most serious answer to that question so far.


Viewpoint

The 10% productivity plateau from unstructured AI use raises an obvious question. If the problem is that AI tools get inconsistent results from inconsistent inputs (ad hoc prompts, unclear requirements, missing context), what happens when you fix the input?

Spec-driven development (SDD) is the most serious attempt to answer that question. The core idea: treat the specification, not the code, as the primary artifact. Write a detailed, structured spec first. Let the AI implement from it. Version-control the spec alongside (or instead of) the code.

The Thoughtworks Technology Radar called it “one of the most important practices to emerge in 2025.” GitHub Spec Kit accumulated 72,000+ stars in roughly six months. AWS has shipped a commercial IDE built around the concept. This is moving fast.

What spec-driven development actually means

SDD is not a single workflow. It is a spectrum of practices defined by how much authority the spec has over the code.

Spec-first is the most common form today. The developer writes a detailed specification, the AI implements it, and the developer reviews and iterates. The spec guides generation. GitHub Spec Kit and AWS Kiro operate here.

Spec-anchored goes further: the spec both guides generation and validates the output, with automated checks verifying that the AI-generated code satisfies the spec’s constraints. This is closer to contract-driven or BDD-style workflows applied to AI-assisted development.
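A minimal sketch of what spec-anchored validation could look like, assuming a spec whose acceptance criteria are bound to executable checks. The spec text, the `slugify` function, and the check registry are all invented for illustration; real frameworks differ in how they express and enforce constraints:

```python
import re

# Hypothetical spec excerpt: acceptance criteria as a markdown checklist.
SPEC = """\
## Acceptance criteria
- slugify lowercases input
- slugify replaces spaces with hyphens
- slugify strips leading and trailing hyphens
"""

def slugify(title: str) -> str:
    """The AI-generated implementation under validation."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# Each criterion maps to an executable check; a real harness might parse
# the spec and bind criteria to checks by name or tag.
CHECKS = {
    "slugify lowercases input": lambda: slugify("Hello") == "hello",
    "slugify replaces spaces with hyphens": lambda: slugify("a b c") == "a-b-c",
    "slugify strips leading and trailing hyphens": lambda: slugify(" hi ") == "hi",
}

# Every criterion in the spec must have a passing check, so the spec,
# not the reviewer's memory, decides whether the output is acceptable.
for line in SPEC.splitlines():
    if line.startswith("- "):
        criterion = line[2:]
        assert CHECKS[criterion](), f"spec violated: {criterion}"
print("all spec criteria satisfied")
```

The design choice that matters is the direction of authority: the code is validated against the spec, not the other way around.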

Spec-as-source is the radical end. Code becomes disposable output; only the spec is maintained. If you need to change behaviour, you change the spec and regenerate. Tessl is exploring this in private beta. At scale, it remains largely theoretical.

For most teams today, “spec-driven” means spec-first: structured markdown documents that give AI agents enough context to produce consistent, reviewable output.
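What such a document looks like varies by framework, but a hypothetical spec in the Spec Kit style might read something like this (the feature, file path, and criteria are invented for illustration):

```markdown
# Spec: Guest checkout
<!-- specs/guest-checkout.md, versioned alongside the code -->

## Intent
Allow users to complete a purchase without creating an account.

## Acceptance criteria
- A cart created by a guest persists for 24 hours.
- Guest orders capture an email address for the receipt.
- Converting a guest to a registered user preserves the cart.

## Out of scope
- Saved payment methods for guests.
```

The point is not the exact format but the properties: explicit intent, testable criteria, and stated non-goals, enough context for an agent to implement without guessing.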

Figure: The SDD spectrum, from spec-first to spec-as-source.

The four frameworks

The SDD ecosystem has consolidated around four main approaches, each with a distinct philosophy.

GitHub Spec Kit (72K+ stars) is the de facto standard. Its workflow is Specify, Plan, Tasks, Implement: a lightweight iterative loop that stays close to agile practice. It is agent-agnostic, supporting 22+ AI platforms, with a markdown-first philosophy: low overhead, high portability. Best for small to medium teams who want to move fast without adopting a heavy process.

BMAD Method is the enterprise end: 21 specialised AI agents and 50+ guided workflows simulating a full agile team, with AI personas for product manager, architect, developer, and QA. Best for large greenfield projects where governance and role clarity matter. The risk Thoughtworks flags is real: that much process can start to feel like waterfall.

OpenSpec goes the other direction and is designed specifically for brownfield environments. Rather than imposing a greenfield process on an existing codebase, it uses change-centric plain markdown specs: you write a spec for each change, not a full system spec. Best for teams maintaining legacy codebases who cannot start from scratch.

AWS Kiro is the commercial entrant. At $20/month, it bakes SDD directly into an IDE: natural language input becomes user stories, acceptance criteria, design documents, and implementation tasks automatically. Best for AWS-integrated enterprise teams who want the approach without the setup cost. The trade-off is lock-in to both the IDE and AWS’s interpretation of the workflow.

Figure: Four frameworks, four philosophies.

The case for SDD

The logical argument for spec-driven development is strong. It addresses the specific failure modes that the research has documented.

GitClear found code churn rising to 5.7% and duplicate code growing 4x as AI adoption increased. The proposed mechanism: AI tools without clear specs are guessing at intent, producing code that frequently needs revision or duplication. A well-written spec removes the ambiguity that causes guessing.
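The proposed mechanism is easiest to see side by side. Compare a typical ad hoc prompt with an equivalent spec excerpt (both invented for illustration):

```markdown
<!-- Ad hoc prompt: the agent must guess retention, scope, and edge cases -->
"Add caching to the product API."

<!-- Spec excerpt: the same request with the ambiguity removed -->
## Requirement: product API caching
- Cache GET /products/{id} responses for 5 minutes.
- Invalidate the cache on product update or delete.
- A cache miss must make at most one upstream call per request.
```

Under the first, two runs of the same agent can produce two incompatible caching strategies; under the second, there is one correct shape for the output.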

DORA identified version-controlled audit trails as a success factor for high-performing teams. Spec files in version control create exactly that, a record of intent that can be reviewed, diffed, and referenced in post-mortems.

The Specify, Plan, Tasks, Implement workflow enforces incremental delivery, preventing the big-batch failures that DORA associates with AI-driven instability.

And the market signal is real: 72,000 GitHub stars in six months, enterprise adoption via AWS Kiro, Thoughtworks’s endorsement. When this many senior practitioners converge on a practice, it is worth taking seriously.

Figure: SDD, logical case vs empirical evidence.

The honest assessment

The logical case is compelling. The empirical case does not yet exist.

No peer-reviewed, independently conducted study has quantified spec-driven development’s impact on productivity, code quality, or delivery speed. The claims of “10x performance improvement” circulating in the community are self-reported and often vendor-adjacent. They should be treated the same way the METR researchers treated vendor productivity claims: as hypotheses, not evidence.

Thoughtworks itself, while endorsing the practice, flagged a real risk: SDD's emphasis on upfront specification can revert to waterfall antipatterns: heavy documentation requirements, big-bang releases, and the accumulated coordination costs that the agile movement spent two decades escaping. The lightweight, iterative end of the spectrum (Spec Kit, OpenSpec) is designed to avoid this. The heavyweight end (BMAD with 21 agents and 50+ workflows) risks it.

The parallel to agile is instructive. The logic of iterative, feedback-driven development was sound in 2001. But how teams implemented it varied enormously, and the failures were usually process failures, not conceptual ones. SDD faces the same test. The idea is right. Execution will determine whether it delivers on the premise.

What to watch

SDD is where the live methodological questions in AI-assisted development are being worked out in 2026. The questions that the next 12 to 18 months of data should answer:

Does spec quality correlate with output quality in measurable ways? Can you demonstrate, with objective metrics, that teams using structured specs produce lower churn rates and fewer incidents than teams using ad hoc prompting?

Does SDD actually avoid the waterfall trap in practice, or does the discipline required to maintain good specs become its own kind of overhead?

And most importantly: at what team size and codebase complexity does SDD deliver enough value to justify the investment in process?

Apply the same scepticism here that the research demands everywhere else. Demand measured outcomes. The idea is promising. The evidence is still being collected.


Sources: Thoughtworks Technology Radar (2025); Piskala, arXiv (2026); GitHub Spec Kit; BMAD-METHOD; Fission-AI/OpenSpec; AWS Kiro; GitClear AI Copilot Code Quality Research; DORA 2025.