Agile was built for a world where writing code was the bottleneck. It isn’t anymore.
AI assistants scaffold features in minutes. Test suites, PR descriptions, refactored modules: things that used to take days happen before lunch. The coding part got cheap. Everything downstream didn’t, and that gap is where teams are starting to get into trouble.
More PRs are being opened, which means longer review queues. E2E tests are still flaky. Compliance sign-off still takes a week. In some organisations, end-to-end delivery time has actually gotten worse since AI adoption. The bottleneck didn’t disappear. It moved.

Three things worth changing now

Stop estimating coding effort
Story points were always a proxy for something real. The question is whether they’re a proxy for the right thing. When coding is cheap, estimating coding effort means measuring the constraint that no longer exists. What matters now is how long work sits waiting: in review, in a test queue, in someone’s inbox. Flow time tells you that. Story points don’t.
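To make "flow time" concrete, here is a toy calculation over a single ticket's stage history. The stage names, timestamps, and hours are invented for illustration; the point is that flow time decomposes into a little working and a lot of waiting.

```python
from datetime import datetime

# Hypothetical ticket history: (stage, entered_at) events for one work item.
events = [
    ("in_progress", datetime(2024, 5, 1, 9, 0)),
    ("in_review",   datetime(2024, 5, 1, 11, 0)),  # coding took 2 hours
    ("in_test",     datetime(2024, 5, 3, 11, 0)),  # sat in review for 2 days
    ("released",    datetime(2024, 5, 6, 11, 0)),  # sat in test for 3 days
]

def stage_durations(events):
    """Hours spent in each stage, from consecutive entry timestamps."""
    return {
        stage: (events[i + 1][1] - entered).total_seconds() / 3600
        for i, (stage, entered) in enumerate(events[:-1])
    }

durations = stage_durations(events)
flow_time = sum(durations.values())  # hours from start of work to release

# Working time is a sliver of the total; waiting dominates.
waiting = flow_time - durations["in_progress"]
print(f"flow time: {flow_time:.0f}h, of which waiting: {waiting:.0f}h")
```

A story-point estimate would have covered the first two hours of this ticket's life and said nothing about the other five days.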
Track waiting, not working
Standups evolved to surface whether people were blocked on their code. In an AI-assisted team, people are rarely blocked on their code. That part moves fast. The thing worth surfacing is what’s queued and stuck. Flaky E2E suites. An overloaded reviewer. An environment that isn’t ready. That’s where the time goes. A standup that doesn’t ask about any of that isn’t reflecting how work actually moves.
Write contracts, not estimates
Backlog refinement assumed ambiguity about implementation: how long the work would take, how complex it would be. AI removes most of that. What AI can’t do is infer requirements that weren’t written down. A vague ticket produces vague output. Refinement should be about defining what done looks like: what has to be true when this ships, what a reviewer needs to check, what can’t break. That’s a different conversation than story pointing.
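To make "what done looks like" concrete, here is a hypothetical ticket rewritten as executable acceptance criteria. The feature, function name, and discount rules are all invented; the shape is what matters: each row is something a reviewer can check and something that can't silently break.

```python
# The vague ticket: "apply the loyalty discount correctly".
# The same requirement as a contract: concrete cases that must hold when
# the work ships. Feature and numbers are hypothetical.

def loyalty_discount(order_total: float, years_member: int) -> float:
    """Toy implementation, standing in for whatever gets built."""
    if years_member >= 5:
        rate = 0.10
    elif years_member >= 1:
        rate = 0.05
    else:
        rate = 0.0
    return round(order_total * (1 - rate), 2)

# The contract: each row is one acceptance criterion.
CONTRACT = [
    # (order_total, years_member, expected)
    (100.00, 0, 100.00),  # new members pay full price
    (100.00, 1, 95.00),   # 1+ years: 5% off
    (100.00, 5, 90.00),   # 5+ years: 10% off
    (0.00,   5, 0.00),    # zero orders stay zero; never goes negative
]

for total, years, expected in CONTRACT:
    assert loyalty_discount(total, years) == expected
print("contract satisfied")
```

Whether a human or a model writes the implementation, the contract is the part only the team can supply.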
The actual problem

Agile optimised for keeping developers productive and shipping on a cadence. Both were sensible goals when coding was slow. Neither addresses the actual constraint anymore.
The question isn’t “are developers busy?” Everyone is busy. The question is whether work is actually moving from refinement to review to test to release. Those sound like the same question. They’re not.
Teams that invest in their delivery system (pipelines, review processes, testing infrastructure) will ship faster with AI than they did without it. Teams that bolt AI onto an unchanged process will produce more code and pile it into the same bottlenecks they’ve always had.
A lot of teams are about to find out which one they are.
Agile, the argument goes, was built for a world where writing code was the bottleneck. Some people look at AI coding tools and conclude that the bottleneck is now review and testing, so the fix is to change how you review and test. That’s the right observation with the wrong conclusion.
The bottleneck is moving, yes. But the reason review queues are longer isn’t that developers are more productive. It’s that AI-generated code requires more careful review, not less. Copilot and similar tools produce code with higher defect rates, security issues, and subtle logic errors that look right at a glance. The review queue got longer because the code got worse, not because the process is miscalibrated.

The changes being proposed, examined

Flow time isn’t the answer either
The argument for replacing story points with flow metrics is appealing on the surface. But flow time is a lagging indicator that tells you where you’re stuck after you’re already stuck. Story points, for all their flaws, force a conversation about complexity before work starts, a conversation that surfaces assumptions and risks early.
The problem with Agile estimation isn’t that it measures coding effort. The problem is that teams treat estimates as commitments and velocity as a performance metric. That’s a management problem, not a methodology problem. Swapping story points for cycle time doesn’t fix the underlying issue.
Standups aren’t the bottleneck
Standups take 15 minutes. If your standup is surfacing the wrong information, the fix is to ask better questions, not eliminate synchronous coordination. Teams working on complex problems tend to need more coordination when things get ambiguous, not less. Going fully async buys short-term efficiency at the price of coordination costs that don’t show up until something goes wrong.
Specification quality matters, but “contracts not estimates” misframes the problem
The emphasis on detailed acceptance criteria and behavioural specifications is genuinely good advice, largely independent of AI. Vague tickets produce unpredictable output whether a human or a model writes the code. But calling this “write contracts, not estimates” reframes good requirements engineering as an AI adaptation, when really teams should have been doing this anyway.
What’s actually happening

AI has increased the volume of code being produced without a proportional increase in code quality. Review and testing bottlenecks aren’t process failures. They’re a quality control system doing its job under higher load.
The process changes being proposed (kill story points, go async, write specs) are reasonable engineering hygiene. They’re not a response to AI. They’re improvements that were worth making before AI and remain worth making now.
What would actually help: better automated verification, stronger linting and static analysis, clearer code ownership, and honest conversations about what AI-generated code costs to maintain. The operating model doesn’t need to evolve so much as the quality bar needs to hold.
Sprints, standups, and story points are easy targets. The harder question is whether the code being shipped is good enough to justify skipping the scrutiny.
Agile was designed around a scarcity that no longer exists. Writing code was slow, expensive, and required rare specialists, so every practice in the playbook was structured to protect developer time. Sprints exist to give developers uninterrupted focus. Standups exist to unblock developers. Story points exist to estimate developer effort. The developer was the bottleneck. That assumption is gone.
A senior engineer with the right tools can do in an afternoon what previously took a sprint. At that rate, the question is whether the sprint is protecting anything worth protecting.

What has to change, completely

Kill story points
Story points were always a workaround for the fact that software estimation is hard. They don’t measure value, they don’t measure risk, they don’t measure customer impact. They measure how long a developer thinks the coding will take. When coding is no longer the constraint, story points are measuring nothing useful. Teams that keep estimating coding complexity are optimising for a bottleneck that no longer exists.
The replacement isn’t a better estimation scheme. It’s treating software delivery as a flow problem. Cycle time, lead time, queue depth. How long does work sit before someone looks at it? How long does it take to get from “done coding” to “in production”? That’s what limits you now.
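As a sketch of what a queue-depth view looks like, here is a toy snapshot of a board. Item names, stages, and timestamps are made up; the report just answers "how long has each item sat before someone looked at it?"

```python
from datetime import datetime, timedelta

NOW = datetime(2024, 5, 6, 9, 0)  # fixed "now" so the example is reproducible

# Hypothetical board snapshot: (item, stage, entered_stage_at)
board = [
    ("PR-214", "in_review",   NOW - timedelta(days=4)),
    ("PR-221", "in_review",   NOW - timedelta(hours=6)),
    ("PR-218", "in_test",     NOW - timedelta(days=2)),
    ("PR-225", "in_progress", NOW - timedelta(hours=3)),
]

def queue_report(board, stage):
    """Depth of one queue and the age of each item in it, oldest first."""
    queue = [(item, (NOW - entered).total_seconds() / 3600)
             for item, s, entered in board if s == stage]
    return sorted(queue, key=lambda pair: pair[1], reverse=True)

review_queue = queue_report(board, "in_review")
print(f"review queue depth: {len(review_queue)}")
for item, age_h in review_queue:
    print(f"  {item}: waiting {age_h:.0f}h")
```

A report like this, glanced at daily, says more about where delivery is stuck than any velocity chart.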
Async everything, including standups
The daily standup made sense when coordination required synchronous communication and blocking issues needed immediate escalation. An AI-assisted team doesn’t have those problems. Code generation is asynchronous. Reviews can be asynchronous. The reason to synchronise is when humans need to make judgment calls together, and that’s a different meeting than a daily status check.
The standup is becoming a relic. What replaces it is a flow board and a clear escalation path for decisions that can’t be made independently. Some teams will resist this because synchronous meetings feel productive. They aren’t.
Specs, not tickets
The future of backlog refinement is writing specs, not sizing work. AI can implement from a well-written spec with minimal handholding. What it can’t do is infer requirements that weren’t written down. The constraint shifts from implementation capacity to specification quality, which means the most valuable thing a team can do in refinement is write a document that makes acceptance criteria unambiguous, not estimate how many points the coding will take.
The end of sprint theatre

The teams that make the most of this shift are willing to question the sprint itself. Not just refine it, but ask whether two-week cycles make sense when implementation takes hours, whether ceremonies designed around developer throughput make sense when the real constraint is review queues and pipelines. Most teams won’t go there. The ceremonies are comfortable, and changing process is hard. They will bolt AI onto existing Agile and wonder why delivery didn’t speed up proportionally.
The process is the bottleneck now. That’s a solvable problem, but only if you’re willing to treat it as one.