In the first post, I introduced AIRUP — my thesis that AI agents can resurrect the Rational Unified Process. I mentioned something called Specification-Driven Development as a key ingredient. Several people asked: "OK, but what is SDD exactly?" Fair. Let's fix that.
The Elevator Pitch
Specification-Driven Development is a paradigm where you write a structured specification before any code exists, and then AI agents execute against that specification. The spec is the source of truth. The code is a derivative.
If that sounds like "just writing requirements before coding" — congratulations, you've been doing engineering since the 1960s. The difference is the audience. Traditional requirements were written for humans who would interpret them, argue about them in meetings, and then ignore half of them. SDD specs are written for AI agents who will execute them literally, completely, and without passive-aggressive Slack messages.
This changes everything about how you write them.
Anatomy of an SDD Workflow
Most SDD implementations, whether AWS Kiro, GitHub Spec-Kit, or in-house frameworks, build the spec in three phases and then hand it to an agent for a fourth, execution phase:
Phase 1: Requirements. You describe what you want: user stories, acceptance criteria, edge cases, constraints. In Kiro, this produces a requirements.md. The spec uses structured notation. A bare "as a user I want…" story is not enough on its own; each one is backed by precise, testable statements that an agent can parse unambiguously.
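To show what "precise, testable statements" could look like in practice, here is a minimal sketch. The "WHEN ... THE SYSTEM SHALL ..." template is my own illustrative assumption, loosely inspired by structured requirement syntaxes; it is not the exact format of Kiro or any other tool, and parse_requirement is a hypothetical helper.

```python
import re
from typing import Optional

# Hypothetical template: "WHEN <trigger> THE SYSTEM SHALL <response>".
# Real tools define their own notation; this one is only for illustration.
REQUIREMENT_PATTERN = re.compile(
    r"^WHEN (?P<trigger>.+?) THE SYSTEM SHALL (?P<response>.+)$"
)


def parse_requirement(statement: str) -> Optional[dict[str, str]]:
    """Return trigger and response, or None if the statement is unstructured."""
    match = REQUIREMENT_PATTERN.match(statement.strip())
    if match is None:
        return None
    return {"trigger": match["trigger"], "response": match["response"]}


structured = (
    "WHEN a user submits an empty password "
    "THE SYSTEM SHALL reject the login with error code AUTH-002"
)
vague = "As a user I want login to be secure"

print(parse_requirement(structured))  # a dict with a trigger and a response
print(parse_requirement(vague))       # None: nothing testable to extract
```

The point is not the regex. It is that a statement with an explicit trigger and an explicit response can be turned into a test, while the vague story cannot.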
Phase 2: Design. An agent (or you) writes the technical architecture: component diagrams, sequence flows, data models, API contracts, error handling strategy. This produces a design.md. The key insight: the design references the requirements, creating traceability. Every design decision maps to a requirement.
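Traceability is easy to check mechanically once requirements have IDs. The sketch below assumes a hypothetical "REQ-n" ID scheme and a DesignDecision record that I made up for illustration; no tool mentioned in this post is claimed to work this way.

```python
from dataclasses import dataclass, field


@dataclass
class DesignDecision:
    """One entry from design.md, with the requirement IDs it claims to satisfy."""
    name: str
    requirement_ids: list[str] = field(default_factory=list)


def check_traceability(requirement_ids: list[str],
                       decisions: list[DesignDecision]) -> dict[str, list[str]]:
    """Report decisions without a requirement and requirements without a decision."""
    known = set(requirement_ids)
    referenced = {rid for d in decisions for rid in d.requirement_ids}
    return {
        # Decisions that reference no requirement at all.
        "untraced_decisions": [d.name for d in decisions if not d.requirement_ids],
        # References to requirement IDs that do not exist.
        "dangling_references": sorted(referenced - known),
        # Requirements that no design decision addresses.
        "uncovered_requirements": sorted(known - referenced),
    }


requirements = ["REQ-1", "REQ-2", "REQ-3"]
decisions = [
    DesignDecision("Use token-based sessions", ["REQ-1"]),
    DesignDecision("Add a cache layer", []),
    DesignDecision("Rate-limit login attempts", ["REQ-2", "REQ-9"]),
]

report = check_traceability(requirements, decisions)
for problem, items in report.items():
    print(problem, items)
```

A check like this could run as a quality gate before the design is allowed to move on to task decomposition.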
Phase 3: Tasks. The agent decomposes the design into discrete, executable implementation tasks. Each task is atomic — one clear outcome, one clear validation. This produces a tasks.md that looks like a checklist a robot would love.
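As a rough illustration of that checklist, here is a sketch that parses a markdown-style task list and picks the next open task. The tasks.md content and the "validated by" convention are invented for this example; actual tools may lay the file out differently.

```python
import re
from typing import NamedTuple


class Task(NamedTuple):
    done: bool
    title: str


# Matches markdown checkboxes such as "- [ ] title" or "- [x] title".
CHECKBOX = re.compile(r"^\s*- \[(?P<mark>[ xX])\] (?P<title>.+)$")


def parse_tasks(markdown: str) -> list[Task]:
    """Extract checklist items, ignoring headings and any other lines."""
    tasks = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match:
            done = match["mark"].lower() == "x"
            tasks.append(Task(done=done, title=match["title"].strip()))
    return tasks


# Hypothetical tasks.md: each task names one outcome and one validation.
TASKS_MD = """\
# Implementation tasks
- [x] Create the users table migration (validated by: migration test passes)
- [ ] Implement the POST /login handler (validated by: bad password returns 401)
- [ ] Add rate limiting to /login (validated by: sixth attempt per minute returns 429)
"""

tasks = parse_tasks(TASKS_MD)
next_task = next((t for t in tasks if not t.done), None)
print(f"{sum(t.done for t in tasks)} of {len(tasks)} tasks done")
print("next:", next_task.title if next_task else "nothing left")
```

Because every item carries its own validation, an agent (or a human) can tell when a task is finished without rereading the whole design.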
Phase 4: Code. The agent executes the tasks: it writes tests first (TDD), then implements, then validates against the original spec. If something doesn't match, the spec wins, not the code.
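Here is one way "the spec wins" could look in code. This is a sketch under my own assumptions: the acceptance cases, the normalize_email function, and the validate_against_spec helper are all hypothetical, and the candidate implementation is deliberately buggy.

```python
from typing import Callable

# Hypothetical acceptance criteria lifted from a spec: (input, expected output).
SPEC_CASES = [
    ("  Alice@Example.COM ", "alice@example.com"),
    ("bob@example.com", "bob@example.com"),
]


def normalize_email(raw: str) -> str:
    """Candidate implementation from the agent (deliberately incomplete)."""
    return raw.lower()  # bug: surrounding whitespace is not stripped


def validate_against_spec(impl: Callable[[str], str],
                          cases: list[tuple[str, str]]) -> list[str]:
    """Run every spec case; a mismatch is a defect in the code, not the spec."""
    failures = []
    for given, expected in cases:
        actual = impl(given)
        if actual != expected:
            failures.append(
                f"{given!r}: spec expects {expected!r}, code returned {actual!r}"
            )
    return failures


failures = validate_against_spec(normalize_email, SPEC_CASES)
for failure in failures:
    print("FAIL", failure)
```

The important design choice is the direction of the fix: when a case fails, the agent changes normalize_email, not SPEC_CASES. If the spec itself turns out to be wrong, that is a spec change, reviewed as such.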
Why SDD Works Now (But Wouldn't Have Worked Before)
People have been writing specifications since before most of us were born. IEEE 830, formal methods, Design by Contract, Z notation — the graveyard of spec-first approaches is vast and well-populated. So why is SDD different?
Because the reader changed.
When the spec reader is a human developer, detailed specifications create overhead. The developer reads the spec, builds a mental model, then writes code from that mental model — not from the spec directly. The spec is a lossy intermediary. Worse: maintaining the spec alongside the code is double work that nobody wants to do. So specs rot, diverge, and eventually get abandoned.
When the spec reader is an AI agent, none of those problems exist:
- The agent has no cognitive fatigue. As long as the spec fits in its context window, a 200-page spec gets the same attention as a 2-page spec.
- The agent executes directly from the spec. There's no lossy mental model in between.
- The agent can validate its own output against the spec at every step.
- The agent never complains that the spec is "too detailed." (A sentence I've never been able to say about a human developer.)
"The problem with specifications was never the specifications. It was that humans are bad at following them and worse at maintaining them."
SDD in the Wild
This isn't theoretical. SDD is already being used in production by major players:
AWS Kiro — An agentic IDE that generates three-phase specs (requirements, design, tasks) from natural language descriptions. It uses Claude models under the hood and tracks task completion in real time. Kiro explicitly positions itself around the idea that "specs are the bridge between intent and implementation."
GitHub Spec-Kit — An open-source toolkit (82K+ stars) that provides templates, validation tools, and agent integration for spec-driven workflows. It standardizes how specs are written so that different AI tools can consume them interoperably.
Google Antigravity — Google DeepMind's AI-first development environment, which uses agents.md and skills.md files to define multi-agent workflows driven by specifications. Multiple agents (PM, Engineer, QA, DevOps) operate in parallel, each consuming the spec from their role's perspective.
The pattern across all of them is identical: spec → agents → quality gates → code.
The Criticism (And Why It's Half Right)
SDD has vocal critics, and they make good points. François Zaninotto from Marmelab has called it "Waterfall Strikes Back" — arguing that writing lengthy markdown specs before coding is just waterfall with a trendy name. Martin Fowler's team at Thoughtworks has noted that static specs create a "false sense of control" that breaks down on contact with reality.
They're half right. Here's the half they're missing:
The criticism applies perfectly to static SDD — where you write a spec once, throw it over the wall to agents, and pray. That is waterfall. But nobody said specs have to be static. What if the spec is a living document inside an iterative process? What if agents update the spec as they discover edge cases? What if the spec evolves through phases — getting more detailed as uncertainty decreases?
And here's the real game-changer that most critics miss: SDD specs live in the repository, right next to the code. Not in a Confluence page that nobody updates. Not in a Google Doc that's three versions behind. In the actual git repo, versioned, diffable, reviewable in pull requests.
This changes the social contract around documentation entirely. Think about what happened when we put tests next to the code: committing a change that breaks a test became an obvious mistake — the CI pipeline screams at you, your PR gets blocked, your colleagues raise an eyebrow. SDD creates the same dynamic for specifications. A developer who changes a function's behavior and commits without updating the corresponding spec is making the same kind of mistake as someone who breaks a unit test and pushes anyway. The spec is no longer a bureaucratic artifact that lives in a separate tool — it's a first-class citizen of the codebase, subject to the same version control, the same code review, the same CI checks.
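A CI check along these lines does not need to be sophisticated. The sketch below assumes a hypothetical repository layout where code lives under src/<module>/ and its spec under specs/<module>/; the function name and layout are mine, not a convention of any tool mentioned here.

```python
from pathlib import PurePosixPath


def modules_missing_spec_update(changed_files: list[str]) -> list[str]:
    """Return modules whose code changed in this PR while their spec did not.

    Assumes the hypothetical layout src/<module>/... and specs/<module>/...
    """
    code_modules: set[str] = set()
    spec_modules: set[str] = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        if len(parts) < 3:
            continue  # top-level files such as README.md belong to no module
        if parts[0] == "src":
            code_modules.add(parts[1])
        elif parts[0] == "specs":
            spec_modules.add(parts[1])
    return sorted(code_modules - spec_modules)


# In CI this list could come from `git diff --name-only` against the base branch.
changed = [
    "src/auth/login.py",
    "specs/auth/requirements.md",
    "src/billing/invoice.py",
    "README.md",
]

stale = modules_missing_spec_update(changed)
print("code changed without a spec update:", stale)
```

This is a coarse heuristic: a pure refactor legitimately changes code without touching the spec, so in practice a check like this would more likely raise a warning or request a reviewer's confirmation than block the merge outright.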
This is the leap. Not just "write specs for AI agents" — but make specs part of the codebase, with all the enforcement mechanisms we already have for code quality. The spec becomes as alive as the code, because it is the code's neighbor.
In other words: what if you put SDD inside an iterative, phase-gated process with built-in feedback loops?
Like, say... RUP?
The RUP Connection
And this is the punchline. RUP already solved the "static spec" problem more than two decades ago. RUP's four phases (Inception, Elaboration, Construction, Transition) are designed precisely to evolve artifacts iteratively. Requirements start rough in Inception, get refined in Elaboration, and are continuously validated in Construction. The spec is never "done"; it is progressively detailed.
SDD + RUP = Iterative Specs
SDD provides the format — structured, machine-readable specifications that AI agents can execute.
RUP provides the lifecycle — an iterative process that evolves those specs through phases with quality gates.
Together, they eliminate SDD's biggest weakness (static specs) and RUP's biggest weakness (human fatigue).
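To make "phases with quality gates" a bit more concrete, here is a toy sketch. The four phase names come from RUP; everything else is my assumption for illustration, in particular the idea of measuring spec maturity by counting open questions and the specific thresholds per phase.

```python
from enum import Enum


class Phase(Enum):
    INCEPTION = 1
    ELABORATION = 2
    CONSTRUCTION = 3
    TRANSITION = 4


# Hypothetical exit criteria: how many open questions the spec may still
# contain when leaving each phase. The numbers are made up.
MAX_OPEN_QUESTIONS = {
    Phase.INCEPTION: 10,
    Phase.ELABORATION: 3,
    Phase.CONSTRUCTION: 0,
}


def next_phase(current: Phase, open_questions: int) -> Phase:
    """Advance if the quality gate passes, otherwise iterate in the same phase."""
    if current is Phase.TRANSITION:
        return current  # last phase, nothing to advance to
    if open_questions <= MAX_OPEN_QUESTIONS[current]:
        return Phase(current.value + 1)
    return current


# Each iteration refines the spec and lowers the number of open questions.
phase = Phase.INCEPTION
history = []
for open_questions in [14, 9, 5, 2, 0]:
    phase = next_phase(phase, open_questions)
    history.append(phase.name)

print(history)
```

The spec is allowed to be rough early and must be tight late. A failed gate does not stop the project; it triggers another iteration in the same phase.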
This is exactly what AIRUP proposes: take SDD's specification format, embed it in RUP's iterative lifecycle, and let AI agents do the heavy lifting. The AI Governor ensures the specs don't over-inflate, the quality gates ensure they stay honest, and the iterative phases ensure they evolve with reality.
SDD is not a silver bullet. It's not even a complete methodology. It's a format and a workflow pattern that happens to be exactly what AI agents need to do useful work. The real question is: what process governs those specs?
My answer — and the core of my thesis — is that the Rational Unified Process, the methodology we discarded for being "too formal," provides the ideal governance structure for specification-driven AI agents. The irony is delicious.
"SDD is the fuel. RUP is the engine. AI agents are the driver. And the AI Governor makes sure nobody drives off a cliff."
Next up: The AI Governor Pattern — how to prevent multi-agent systems from burning through your budget while debating whether a variable should be called userId or user_id.