
Same Tools, Wildly Different Outcomes

Data from 67,000 developers shows that AI acts as an amplifier — it makes good engineering organisations better and struggling ones worse. This is a management problem, not a tooling problem.


Viewpoint

Two companies buy the same AI licences, deploy them to comparable developer populations, and measure the results six months later. One sees a 50% reduction in customer-facing incidents. The other sees twice as many.

This is not a hypothetical. It is what DX Research found across 67,000 developers in their November 2025 to February 2026 dataset. And it is the most consequential finding in AI productivity research that almost nobody is talking about.

AI as amplifier

The instinct when productivity gains disappoint is to reach for the tool explanation. The AI isn’t good enough yet. We picked the wrong platform. We need a different model. These are natural places to look, and sometimes they are right.

But the divergence in outcomes across the 67,000-developer dataset is too large to be explained by tooling differences. The organisations in that dataset were largely using the same tools. The variable was not the product. It was the organisation.

Well-structured organisations saw AI act as a force multiplier: faster delivery, higher quality, fewer incidents. Struggling organisations saw the opposite. AI exposed existing weaknesses and accelerated them: more instability, more incidents, worse quality.

AI does not fix a bad development process. It accelerates whatever process you have.

The Faros AI evidence

Faros AI published data covering 10,000+ developers across 1,255 teams. High-AI teams merged 98% more PRs per day than low-AI teams. That sounds like a productivity breakthrough until you read the next number: PR review time increased 91%.

The bottleneck did not move. Code production sped up substantially. The capacity of the downstream pipeline (review queues, testing infrastructure, release processes) stayed the same. Code piled up. Review times stretched. The system as a whole did not get faster; it got more congested.
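
To see why congestion is the default outcome, consider a toy queue model. This is an illustration, not the Faros AI methodology: the rates below are invented, and the only claim is the shape of the curve, namely that a backlog grows without bound whenever PRs arrive faster than reviewers can clear them.

    # Toy model of a single review queue. All rates are invented for
    # illustration; nothing here comes from the Faros AI dataset.
    def review_backlog(arrival_rate: float, review_capacity: float, days: int) -> float:
        """Size of the review backlog after the given number of days."""
        backlog = 0.0
        for _ in range(days):
            # Each day, new PRs arrive and reviewers clear what they can.
            backlog = max(0.0, backlog + arrival_rate - review_capacity)
        return backlog

    # Balanced system: 10 PRs/day in, 10 PRs/day reviewed. Stable.
    print(review_backlog(arrival_rate=10, review_capacity=10, days=30))  # 0.0

    # AI roughly doubles PR output while review capacity stays flat:
    # the backlog grows by 10 PRs every day and never drains.
    print(review_backlog(arrival_rate=20, review_capacity=10, days=30))  # 300.0

The point of the sketch is that extra generation speed does not change the system's throughput, which is capped by the slowest stage. It only changes how fast the queue in front of that stage grows.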

This is a management problem. Not in the pejorative sense of blaming managers, but in the precise sense: it is a problem of system design, workflow, and organisational capacity. A better AI model will not solve it.

The adoption trap

Laura Tacho, CTO at DX, framed it directly: “The hype made it sound like just trying AI would automatically pay off. But so far, most tools have been used for individual coding tasks. To see real impact, we need to use AI at the organisational level, not just for single tasks.”

This is the adoption trap. Distributing tool licences is easy to measure. Adoption metrics look good. But individual-level productivity gains that hit a bottleneck at the team or pipeline level do not produce organisational outcomes. They produce congestion.

The organisations seeing positive results in the DX data share common characteristics: clear goals for AI use, measurement practices that track impact rather than just adoption, and engineering fundamentals that let them absorb increased throughput. Fast CI pipelines. Good documentation. Well-defined service boundaries. These are the enablers that determine whether AI-generated code flows through to production safely, or backs up in review queues and incident queues.

What this means for leadership

If you are in an engineering leadership role and your AI adoption programme has not moved the organisational metrics (delivery frequency, incident rate, developer satisfaction), the adoption numbers are not the problem to investigate.

Three questions worth asking:

First: is review capacity keeping up with generation capacity? If developers are producing code 30% faster but reviewers are not reviewing 30% faster, you have created a new bottleneck. Either invest in review tooling and process, or consciously decide that throughput is not the metric you are optimising for.

Second: what does AI do to your weakest links? The organisations seeing 2x more incidents in the DX data did not have good processes that AI disrupted. They had fragile processes that AI accelerated into failure. Where are your fragile processes? AI will find them before you do.

Third: are you measuring adoption or impact? Adoption tells you how many people are using the tool. Impact tells you whether outcomes are changing. The DX AI Measurement Framework recommends tracking utilisation, impact, and cost as three separate dimensions. Collapsing them into a single adoption percentage hides the information you need.
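
As a sketch of what keeping those dimensions separate might look like in practice (the field names here are hypothetical, not the framework's own schema):

    # Hypothetical record separating the three dimensions the DX AI
    # Measurement Framework recommends tracking. Field names are
    # illustrative; the framework defines its own specific metrics.
    from dataclasses import dataclass

    @dataclass
    class AIMeasurementSnapshot:
        # Utilisation: how much the tool is actually used.
        weekly_active_pct: float
        # Impact: whether organisational outcomes are moving.
        delivery_frequency_delta_pct: float
        incident_rate_delta_pct: float
        # Cost: what the capability costs to run.
        monthly_spend_per_developer: float

    q1 = AIMeasurementSnapshot(
        weekly_active_pct=85.0,            # adoption looks great...
        delivery_frequency_delta_pct=0.0,  # ...but delivery is flat
        incident_rate_delta_pct=15.0,      # ...and incidents are up
        monthly_spend_per_developer=40.0,
    )

A single "adoption: 85%" number would report this quarter as a success; the separated view shows an organisation paying for congestion.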

The tool is not the strategy. Deciding what problem you are actually trying to solve, measuring whether you are solving it, and building the organisational capacity to absorb the changes AI creates: that is the strategy. The tool is just the accelerant.


Sources: DX Research (67,000 developers, November 2025 to February 2026); Faros AI (10,000+ developers across 1,255 teams, 2026); Laura Tacho keynote, The Pragmatic Summit, February 2026.