Most AI-Native Developers Are Stuck at Level 2: Your Tools Are Fast, but Your Thinking Has Not Changed
84% of developers use AI coding tools. A controlled study found them 19% slower. The AI Developer Maturity Framework maps 5 levels of developer maturity, from autocomplete user to Orchestrator, and exposes why the vast majority is stuck at Level 2.
84% of developers now use AI coding tools [1]. A controlled study of 16 experienced developers across 246 tasks found they completed work 19% slower with AI assistance [2]. Those same developers believed they were 20% faster [2]. Read that again. The tools are everywhere. The illusion of mastery is total.
The AI Developer Maturity Framework is a five-level taxonomy that maps your relationship to AI, not your tool access. This article maps each level, diagnoses why the vast majority of the industry is stuck at Level 2, and gives you the criteria to identify your current level and the specific shift required to reach the next one.
The Level 2 Trap
The industry treats AI coding tool adoption as the finish line. Install Cursor. Subscribe to Copilot. Autocomplete your way to productivity. 84% of developers have crossed that line. Almost none have kept running.
Google's DORA 2024 research found that every 25% increase in AI adoption correlated with a 1.5% dip in delivery speed and a 7.2% drop in system stability [3]. AI-coauthored pull requests exhibit 1.7 times more issues than human-only pull requests [1]. The pattern repeats across every study: teams adopt AI tools without changing how they think about software, and the tools make them worse.
I call this the Level 2 Trap. You replaced the keyboard with a chat window. You prompt feature by feature, accept suggestions line by line, and ship what compiles. Your workflow is identical to what it was before, except now a large language model generates the syntax you used to type. You upgraded your instrument without upgrading your process.
In AI Agents: They Act, You Orchestrate, I describe this exact phenomenon through the lens of the Delegation Ladder: a four-stage model for assigning work to AI agents with increasing precision and autonomy. Prompt engineering without context engineering is a cargo cult. You are building bamboo runways and waiting for cargo that will never arrive.
The METR study is the empirical verdict on that cargo cult. Experienced developers, working on codebases averaging over one million lines of code, got slower with AI [2]. They accepted fewer than 44% of AI-generated suggestions [2]. The tools did not fail. The developers failed to change their relationship to the tools.
The AI Developer Maturity Framework: 5 Levels
Every developer at every level uses the same stack. The differentiator is the maturity of your delegation.
Level 1: The Prompter
The Prompter uses AI as autocomplete. Tab-complete a function, accept a suggestion, move on. The AI is a faster keyboard. There is no planning, no specification, no context architecture. This is where every developer starts, and it delivers real value for small, isolated tasks: boilerplate generation, simple CRUD operations, syntax recall.
The Prompter's ceiling is speed on trivial work. The moment complexity rises, outputs degrade because the AI has no context beyond the current file and no constraints beyond the current line.
Level 2: The Planner
The Planner delegates feature by feature. "Build me a login page." "Write a function that parses this JSON." "Debug this error." The AI generates larger blocks of code. The developer reviews, adjusts, and integrates.
This is where the industry lives. 51% of professional developers use AI daily [1]. They prompt, review, and ship. They believe this makes them AI-native. It does not.
The Planner's fatal flaw is what the framework calls contextual myopia: each prompt exists in isolation, stripped of system-level context, architectural constraints, and acceptance criteria. The AI cannot see the whole; it sees the prompt. The Planner is stuck at the Describe stage of the Delegation Ladder: stating desired outcomes in loose, conversational language and forcing the model to fill enormous logical gaps with statistical guesswork.
Here is the diagnostic question: when was the last time you wrote a specification document before prompting? If the answer is never, you are a Level 2 Planner.
Level 3: The Interrogator
The Interrogator is the hard transition. You stop telling the AI what to build and start asking it to challenge your assumptions: "What edge cases am I missing? What are the failure modes of this architecture? Where does this design break under load?"
This requires an ego inversion. The Planner treats AI as a subordinate that executes commands. The Interrogator treats AI as a peer that stress-tests thinking. The shift maps to the Specify stage of the Delegation Ladder: you graduate from describing a goal to specifying its constraints, its validation criteria, and its failure modes.
In the AI Developer Maturity Framework, the bottleneck has moved from syntax to judgment. A senior engineer with 15+ years of experience building iOS apps found that the AI handled SwiftUI layouts, CRUD operations, and debugging at 10 times the speed of manual work. Where the AI failed: Apple compliance, async race conditions, and architectural decisions. The AI multiplied 15+ years of judgment. It did not replace it. Level 3 developers recognize this and deploy AI to amplify their judgment, not to bypass it.
Level 4: The Architect
The Architect writes specifications first and code never. The developer produces system-level specifications, acceptance criteria, and architectural constraints before any AI agent touches the codebase. The AI operates within a fully defined sandbox.
This maps to the Validate stage of the Delegation Ladder. I call the key tool the Acceptance Criteria Contract: a four-field document comprising an Objective, Constraints, a Validation Method, and an Escalation Protocol. The Architect writes these contracts. The AI executes them.
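The contract's four fields lend themselves to a simple, machine-checkable structure. The sketch below is illustrative only: the field names follow the article, but the concrete types, the example contents, and the Markdown rendering are my assumptions, not the book's actual schema.

```python
from dataclasses import dataclass

@dataclass
class AcceptanceCriteriaContract:
    """Illustrative sketch of the four-field contract described in the article."""
    objective: str                 # the outcome the agent must deliver
    constraints: list[str]         # hard boundaries the agent may not cross
    validation_method: str         # how success is verified before acceptance
    escalation_protocol: str       # when and how the agent hands back to a human

    def render(self) -> str:
        """Render the contract as a Markdown spec an agent could receive."""
        sections = [
            f"# Objective\n{self.objective}",
            "# Constraints\n" + "\n".join(f"- {c}" for c in self.constraints),
            f"# Validation Method\n{self.validation_method}",
            f"# Escalation Protocol\n{self.escalation_protocol}",
        ]
        return "\n\n".join(sections)

# Hypothetical example contract (contents invented for illustration)
contract = AcceptanceCriteriaContract(
    objective="Add rate limiting to the public API",
    constraints=["No new external dependencies", "p99 latency overhead under 2 ms"],
    validation_method="Load test at 10x baseline traffic; all existing tests pass",
    escalation_protocol="Stop and report if any constraint cannot be met",
)
print(contract.render())
```

The point of the structure is that the human artifact is the contract, not the code: the agent works inside the sandbox the four fields define, and the Escalation Protocol is what keeps the human in the loop at the Validate stage.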
The Level 3-to-Level 4 distinction is the direction of initiative. The Interrogator still drives the process manually, prompting and challenging in real time. The Architect encodes intent into specifications that agents execute autonomously. The human artifact is the specification. The code is a byproduct.
Level 5: The Orchestrator
The Orchestrator operates what the industry now calls a Dark Factory: a software production system where no human writes code, no human reviews code, and the only human artifacts are specification documents.
This is not theoretical. StrongDM's AI team, founded on 14 July 2025 by three engineers, operates under two rules: code must not be written by humans, and code must not be reviewed by humans [4]. Their repository contains exactly three Markdown specification files and zero lines of human-written code [4]. Their compute benchmark: $1,000 in tokens per engineer per day [4].
The Orchestrator maps to the Elevate stage of the Delegation Ladder: you stop handing your agent instructions and start giving it outcomes.
This is the Functional Dissolution Principle applied to software development: when agents execute all routine tasks, the role does not disappear; it collapses into specification writer and agent Orchestrator. The traditional developer role dissolves.
Level 5 carries real risks. Stanford Law's CodeX project has flagged the legal liability void: no existing framework covers code written and reviewed entirely by agents [5]. Institutional knowledge degrades when no human reads the codebase. Circular validation, where agents test their own outputs, creates a verification crisis. The framework maps the territory so you choose the right level for the right problem.
The Economic Chasm Between Maturity Levels
The gap between Level 2 and Level 5 is an economic chasm, not a productivity gradient. The top 10 AI-native startups average $3,000,000 to $3,500,000 in revenue per employee [6]. Cursor generates $3,300,000 in annual recurring revenue per employee [7]. Traditional SaaS companies generate $300,000 to $600,000 [6]. That is a five-to-ten-times multiplier, and it concentrates at the top of the maturity framework.
The labor market reflects this concentration. Junior developer job postings in the US have dropped 67% [8]. Junior employment fell 9-10% within six quarters of AI implementation [6]. Tech graduate roles declined 46% in 2024 [8].
The syntax-generation work that Level 2 developers delegate to AI is the same work that defined junior roles. As that work gets commodified into Synthetic Labor, the category of economically replaceable execution that agents perform at a fraction of human cost, economic value concentrates at the top of the Human Premium Stack: architectural judgment, system design, intent specification. I mapped the full scope of this AI job displacement in my book.
You face a binary window. The developers climbing to Level 3 and beyond are capturing the economic premium. The developers performing Level 2 rituals are competing for a shrinking share of commodified output.
Vibe Coding is the Starting Line
Andrej Karpathy coined vibe coding in February 2025: the intuitive, low-specification use of AI to generate code [9]. The term went viral. The industry adopted it as an identity.
That is the problem. Vibe coding is Level 1 and Level 2 behavior dressed up as a movement. Addy Osmani, Engineering Manager at Google Chrome, drew the line clearly: "You shouldn't blindly trust an AI's code without oversight, just as you wouldn't let a first-year junior dev architect your entire system unsupervised" [10].
I draw a harder line. AI-Native Development is the professional discipline that begins at Level 3, where specification replaces prompting and context engineering replaces prompt engineering. Vibe coding is the prompt-first mindset where you describe and the AI generates. AI-Native Development is the system-first architecture where agents are first-class citizens and the human artifact is always a specification.
The term you use reveals your level. If you call what you do vibe coding, you are advertising Level 2 maturity. The serious work starts when you stop vibing and start specifying.
The Ancient Discipline
Here is the reframe you did not expect: this framework has nothing to do with AI.
The five-level progression, from vague description to precise specification to autonomous execution, is the same progression that has always separated effective leaders from ineffective ones. The Delegation Ladder predates AI tools. What AI has done is make the cost of poor delegation instantaneous and measurable. The METR study did not discover an AI problem. It measured a delegation problem at machine speed.
I developed the AI Developer Maturity Framework to make an ancient discipline legible: the tool is new, the skill is old, the stakes unprecedented.
The question is no longer whether you use AI to write code. Every developer does. The question is whether you have upgraded the way you think about code. Your tools operate at Level 5 speed. Your process determines whether that speed builds or destroys. Diagnose your level. Name the gap. Climb the framework, or be priced out by those who do.
This article maps one framework from AI Agents: They Act, You Orchestrate by Peter van Hees. The book's 18 chapters span the complete architecture of the Agent-First Era, from the Delegation Ladder and Acceptance Criteria Contract that underpin this maturity model, to the Human Premium Stack that maps what survives when Synthetic Labor commodifies your code. If the Level 2 Trap resonated, the book gives you the complete operating system for climbing out. Get your copy:
🇺🇸 Amazon.com 🇬🇧 Amazon.co.uk 🇫🇷 Amazon.fr 🇩🇪 Amazon.de 🇳🇱 Amazon.nl 🇧🇪 Amazon.com.be
References
- [1] Stack Overflow / Panto, "AI Coding Assistant Statistics," 2025. https://www.getpanto.ai/blog/ai-coding-assistant-statistics
- [2] METR, "Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity," July 2025. https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
- [3] Google DORA Research Program, "DORA Report 2024," 2024. https://dora.dev/research/2024/
- [4] Simon Willison, "How StrongDM's AI team build serious software without even looking at the code," February 2026. https://simonwillison.net/2026/Feb/7/software-factory/
- [5] Eran Kahana, "Built by Agents, Tested by Agents, Trusted by Whom?" Stanford Law CodeX, February 2026. https://law.stanford.edu/2026/02/08/built-by-agents-tested-by-agents-trusted-by-whom/
- [6] Andrew Baker, "Dark Factories: AI Is Splitting Software Teams Apart," March 2026. https://andrewbaker.ninja/2026/03/22/the-dark-factory-why-most-teams-are-getting-slower-while-a-few-are-building-software-without-any-humans/
- [7] Tomasz Tunguz / Sacra, "The Communication Tax of Small Orgs," 2026. https://tomtunguz.com/communication-tax-small-orgs/
- [8] Rezi, "The Crisis of Entry-Level Labor in the Age of AI (2024-2026)," January 2026. https://www.rezi.ai/posts/entry-level-jobs-and-ai-2026-report
- [9] Andrej Karpathy, X post on vibe coding, February 2025. https://x.com/karpathy/status/1886192184808149383
- [10] Addy Osmani, "Vibe coding is not the same as AI-Assisted engineering," Medium, 2025. https://medium.com/@addyosmani/vibe-coding-is-not-the-same-as-ai-assisted-engineering-3f81088d5b98