93% of developers use AI coding tools — but productivity gains have plateaued at 10%. What does the data actually tell us?
This isn't early-adopter territory anymore. Based on data from 121,000 developers across 450+ companies, AI coding assistants have reached near-universal adoption.
Despite near-universal adoption, productivity gains have flatlined. The initial bump when AI tools arrived hasn't grown — it's held steady for over a year.
Self-reported by developers who save at least 1 hour/week using AI coding assistants. The biggest wins come from understanding and maintaining existing code — not writing new code from scratch.
According to AWS survey data, the average developer spends only 20% of their time actually writing code. AI coding assistants are laser-focused on that sliver — leaving 80% of the workday untouched.
This is where AI coding assistants operate today
How fast we can write code has never been the bottleneck. The real time sinks are everything around it — and that's where the next wave of AI impact needs to land.
The theory: AI reduces the "context-switching tax" by bringing answers into the IDE instead of sending devs to docs, Stack Overflow, or Slack. The reality is more complicated.
The takeaway: AI has the potential to reduce context-switching — but only if the tool fits seamlessly into the workflow. Poorly integrated AI creates a new type of interruption: the prompt-review-fix loop.
While productivity is flat, the share of AI-authored code that ships to production keeps climbing. Data from 4.2 million developers (Nov 2025 – Feb 2026).
"AI-authored code" = code merged into main/production with little to no human modification. Daily users now have nearly one-third of their shipped code written by AI.
The data reveals a clear disconnect: adoption has reached saturation levels, but measurable productivity improvements haven't kept pace.
More code is AI-generated than ever, but the overall productivity needle hasn't moved.
AI's clearest measurable impact isn't raw coding speed — it's how fast new developers become productive. Time to 10th PR (a standard onboarding metric) has been halved.
AI is particularly powerful for getting people up to speed — whether it's new hires, engineers switching projects, or non-engineers stepping into technical workflows.
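As a concrete sketch, "time to 10th PR" can be computed directly from merge timestamps. Everything below (the data shape, dates, and function name) is illustrative; in practice the data would come from your Git host's API.

```python
from datetime import datetime

# Hypothetical merged-PR log: developer -> merge dates (illustrative data).
merged_prs = {
    "new_hire_a": [datetime(2026, 1, d) for d in (5, 6, 8, 9, 12, 13, 15, 16, 19, 20, 22)],
    "new_hire_b": [datetime(2026, 1, d) for d in (5, 9, 14, 19, 24, 29)],
}
start_dates = {"new_hire_a": datetime(2026, 1, 2), "new_hire_b": datetime(2026, 1, 2)}

def time_to_nth_pr(merges, start, n=10):
    """Days from start date to the nth merged PR, or None if not yet reached."""
    merges = sorted(merges)
    if len(merges) < n:
        return None
    return (merges[n - 1] - start).days

for dev, merges in merged_prs.items():
    print(dev, time_to_nth_pr(merges, start_dates[dev]))
```

Tracking this number for cohorts hired before and after AI rollout is what makes the "halved" claim measurable rather than anecdotal.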
Data from 67,000 developers shows AI acts as an amplifier — it makes good orgs better and struggling orgs worse. There is no "typical" experience.
The hype made it sound like merely trying AI would automatically pay off. But so far, most tools have been applied to individual coding tasks. Real impact requires adopting AI at the organizational level, not task by task.
Transformation is uncomfortable. Organizations that nearly abandoned their cloud or agile transformations are now tempted to give up on AI transformation, too.
The Faros AI evidence: Across 10,000+ developers and 1,255 teams, high-AI teams merged 98% more PRs — but PR review time increased 91%. The bottleneck isn't code generation. It's everything downstream: review queues, testing, and release pipelines that can't absorb the new velocity.
Adoption alone doesn't guarantee results. Just using the tools doesn't automatically improve an organization.
Google's DORA research program — the industry standard for measuring software delivery — studied ~5,000 professionals across two years and found a consistent pattern: AI improves speed but increases instability.
AI enables developers to produce larger changesets faster. Bigger batches = higher risk of failure — something DORA has warned about for a decade. Teams also report over-relying on AI during code review, which speeds reviews but misses defects. The "fail fast, fix fast" hypothesis was tested — the data doesn't support it. Instability still harms product performance and increases burnout.
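One lightweight response to the batch-size risk is a guardrail that flags oversized changesets before they hit review. The threshold and data shape below are illustrative assumptions, not a DORA recommendation.

```python
# Illustrative pre-review guardrail: flag PRs whose changeset exceeds a
# team-chosen size budget, since larger batches carry higher failure risk.
prs = [
    {"id": 101, "lines_changed": 180},
    {"id": 102, "lines_changed": 1450},  # AI-assisted mega-PR
    {"id": 103, "lines_changed": 340},
]

def oversized(prs, max_lines=400):
    """Return IDs of PRs that exceed the batch-size budget."""
    return [p["id"] for p in prs if p["lines_changed"] > max_lines]

print(oversized(prs))  # → [102]
```

A check like this can run in CI and request a split before a human reviewer ever opens the diff.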
GitClear analyzed 211 million changed lines of code across five years (2020–2024). As AI adoption rose, several quality indicators moved in the wrong direction.
The pattern: AI makes it easy to add code, but the harder work — refactoring, restructuring, paying down tech debt — is declining. More code is being written, revised quickly, and duplicated. This aligns with the DORA finding that AI increases throughput but decreases stability.
OpenAI's Codex desktop app has emerged as one of the fastest-growing AI developer tools, with notable enterprise adoption.
These are vendor-reported metrics from OpenAI, not independently verified. The "60% more PRs" stat comes from internal usage — the same environment where 95% adoption may reflect organizational expectation rather than organic choice. Compare with the METR RCT (independent, randomized) which found experienced devs were 19% slower with AI tools.
Three players now hold 70%+ of the ~$4B market. All have crossed $1B ARR. But the lines are blurring — GitHub now lets you run Claude and Codex inside Copilot.
Copilot, Cursor, and Claude Code all have agent modes. The model is often the same (you can run Claude inside all three). The real differences come down to where the agent lives, how deep it sees, and what it can touch.
Copilot added agentic features onto an autocomplete foundation. Claude Code was built as an agent — terminal access, filesystem control, and self-correcting test loops are the core, not a bolt-on. The difference is felt most on complex, multi-file tasks: refactors across 50 files, migrations, and architectural changes where full repo awareness matters. For inline completions and quick edits, Copilot is often faster and more seamless.
The practical answer: Most teams are converging on using both — Copilot for inline speed and GitHub integration, Claude Code for heavy-lift agentic tasks. The era of picking one tool is ending.
The data challenges the assumption that senior engineers should get priority. Junior and newer developers consistently show the largest measurable productivity gains from AI tools.
Biggest gains: onboarding new devs, unfamiliar codebases, boilerplate, scaffolding, test generation, docs
Moderate gains: refactoring, stack trace analysis, code review assistance, migrations
Smallest gains: complex architecture in familiar codebases, deep system design, senior devs who already know the answer
Bottom line: Invest heavily in AI for onboarding and junior devs — the ROI is proven and compounds. For senior engineers, don't force adoption — instead, remove barriers and let them self-select into high-leverage use cases.
Organizations winning with AI already had strong developer experience fundamentals in place. AI amplifies what's already there — for good or ill.
Quick feedback loops let AI-generated code get validated faster
AI tools perform significantly better with well-documented codebases
Clear boundaries make AI-assisted changes safer and more predictable
Good documentation, robust observability, fast test execution, frictionless local dev experience — these aren't just nice-to-haves in the world of AI-assisted engineering. They're essential.
According to the DX AI Measurement Framework, you need to track three dimensions — not just adoption.
Adoption: how widely are AI tools used? Daily/weekly usage, tool penetration across teams.
Impact: how does AI change performance? Time savings, PR throughput, developer experience, quality.
Cost: what's the ROI? Tool spend vs. measurable gains; identify high-leverage use cases worth scaling.
Over-indexing on one dimension (e.g., adoption rates) gives a false picture. Combine direct metrics (time saved) with indirect metrics (throughput, quality, developer satisfaction) for the full story.
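A toy rollup of the three dimensions might look like the sketch below. All field names and numbers are illustrative assumptions, not the DX framework's actual schema.

```python
# Illustrative three-dimension rollup: adoption, impact, and cost in one view.
team_metrics = {
    "weekly_active_users": 42,          # adoption: devs using AI tools weekly
    "total_devs": 50,
    "hours_saved_per_dev_week": 1.5,    # impact: self-reported time savings
    "loaded_cost_per_dev_hour": 95.0,   # cost inputs
    "tool_spend_per_month": 2400.0,
}

def ai_roi_snapshot(m, weeks_per_month=4.33):
    """Combine adoption, impact, and cost into a single monthly snapshot."""
    adoption = m["weekly_active_users"] / m["total_devs"]
    monthly_value = (m["weekly_active_users"] * m["hours_saved_per_dev_week"]
                     * weeks_per_month * m["loaded_cost_per_dev_hour"])
    roi = (monthly_value - m["tool_spend_per_month"]) / m["tool_spend_per_month"]
    return {"adoption": round(adoption, 2),
            "monthly_value_usd": round(monthly_value, 2),
            "roi": round(roi, 2)}

print(ai_roi_snapshot(team_metrics))
```

Even a crude model like this surfaces the trap the slide warns about: a team can show 84% adoption while the ROI line tells a completely different story.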
If AI coding assistants plateau at ~10% gains with unstructured prompting, the next question is obvious: what happens when you give them better instructions? Spec-driven development (SDD) treats the specification — not the code — as the primary artifact. The AI generates implementation from structured, version-controlled specs.
Level 1 (spec-guided): Specifications guide code generation. The developer writes a detailed spec; AI implements it. The most common level today, used by GitHub Spec Kit and Kiro.
Level 2 (spec-validated): Specs both constrain and validate AI output. Automated checks verify generated code against the spec. Closer to contract-driven or BDD-style workflows.
Level 3 (spec-as-source): The specification replaces code as the maintained artifact; code becomes disposable output. The radical end of the spectrum, explored by Tessl (private beta).
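The middle of that spectrum, where specs validate AI output, can be sketched as executable examples acting as a contract. The spec format and checker below are illustrative assumptions and don't reflect any particular tool's implementation.

```python
# Minimal sketch of "spec validates AI output": the spec's examples are the
# contract, and generated code must satisfy every one of them.
spec = {
    "function": "slugify",
    "examples": [  # (kwargs, expected result) pairs act as the contract
        ({"text": "Hello World"}, "hello-world"),
        ({"text": "  AI  Tools "}, "ai-tools"),
    ],
}

# Stand-in for what a code-generation step might return.
def slugify(text: str) -> str:
    return "-".join(text.lower().split())

def validate_against_spec(fn, spec):
    """Run every spec example against the generated function; collect failures."""
    failures = []
    for kwargs, expected in spec["examples"]:
        got = fn(**kwargs)
        if got != expected:
            failures.append((kwargs, expected, got))
    return failures

print(validate_against_spec(slugify, spec))  # empty list = spec satisfied
```

The point of the pattern: when generation misses, the spec says so mechanically, instead of relying on a human reviewer to notice.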
Spec-driven development may not have the visibility of a term like vibe coding, but it's nevertheless one of the most important practices to emerge in 2025.
The SDD tooling ecosystem has consolidated around four open-source approaches, each optimized for a different context. AWS Kiro adds a commercial fifth option.
Small teams (2–10): Spec Kit or OpenSpec for simplicity. Medium teams (10–50): Kiro or Spec Kit for collaboration. Large orgs (50+): BMAD or Kiro for governance. And teams rarely pick just one — 29 of 99 surveyed developers already use multiple AI tools simultaneously.
Spec-driven development addresses the right problem: unstructured prompting produces inconsistent results. But the space has a significant gap between practitioner enthusiasm and rigorous measurement.
The most active debate in the SDD community is whether this is genuinely new or just waterfall with AI characteristics. The Spec Kit model (lightweight, iterative markdown) is designed to avoid this trap. BMAD's 21-agent approach risks it. The answer likely depends on team discipline, not the framework — the same lesson the agile movement learned two decades ago.
Watch this space: SDD is early-2026's most promising methodology shift. But apply the same skepticism we applied to AI productivity claims earlier in this deck — demand measured outcomes, not just adoption numbers.
Economist Erik Brynjolfsson argues that AI is following the pattern of every general-purpose technology: an investment phase where productivity appears flat, followed by a harvest phase where gains accelerate.
We are transitioning from an era of AI experimentation to one of structural utility.
The developer-level data (DX, DORA, METR) shows modest, plateauing gains — while macro data shows acceleration. This gap may reflect what Brynjolfsson calls "a small cohort of power users" driving outsized impact, or it may be that the macro gains are coming from non-coding AI use cases (customer service, content, operations). The 2026 data will be decisive.
Every data point in this deck tells the same story: AI's gains in existing codebases are real but modest (~10%), plateau fast, and come with stability risk. That's not where the business advantage lives.
AI doesn't make your existing product 10× better. It makes building the wrong thing 10× cheaper. That changes the economics of experimentation — and that's where established companies have been losing to startups for decades.
The DORA data is clear: AI destabilizes mature systems. But instability is a feature, not a bug, when you're exploring new territory. Point AI where failure is cheap and learning is expensive.
Don't pour AI into making your existing cash cow 10% faster. The DORA data shows that's where instability hurts most and gains plateau fastest. Protect your core with engineering discipline. Unleash AI on the edges — where you're exploring, not maintaining.
If AI collapses the cost of building and testing software (Slides 20–21), it doesn't just affect how developers work — it reshapes which business models survive. The first casualties are businesses whose value proposition is connecting rather than creating.
The companies that win won't be the ones who used AI to build a better marketplace. They'll be the ones who used AI to figure out the marketplace was unnecessary — and built the thing that replaces it.
The data is clear: applying AI to mature codebases yields modest, plateauing gains with real stability costs. The transformative play is using AI to collapse the cost of experimentation — to find what delights your customers before your competitors even finish scoping.
Don't use AI to build faster. Use it to learn faster.
Primary data: DX Research — "Measuring Developer Productivity & AI Impact" (Feb 2026)
Talk: Laura Tacho keynote at The Pragmatic Summit, Feb 11, 2026
Additional: Google DORA 2024/2025, METR RCT (2025), Faros AI, GitClear (2025), Brynjolfsson/Stanford (2026)
Spec-driven development: Thoughtworks Technology Radar, Piskala (arXiv 2026), GitHub Spec Kit, BMAD, OpenSpec, AWS Kiro
Multi-company RCT: Cui et al. — "Effects of Generative AI on High-Skilled Work" (4,867 devs)
Full video: youtube.com/watch?v=LOHgRw43fFk