Research · Blogs & Independent Thinkers
Research sweep · deep · 2025–2026
Engineering AI Control Plane
Engineering AI control planes for software delivery, July 1, 2025 through April 24, 2026: how teams implement AI across development workflows and CI/CD; choose tools, models, and SDKs; govern observability and compliance; manage reliability and provider availability; and handle cognitive debt and dark code, with case studies, success stories, and failure modes across team size, company scale, and greenfield versus brownfield systems.
Synthesised 2026-04-24
Narrative
Between July 2025 and April 2026, the independent blog and Substack ecosystem produced the most prescient and evidence-backed analysis of AI-driven software delivery, organized around three interlocking tensions: code velocity vs. verification bottlenecks, greenfield ease vs. brownfield failure modes, and adoption enthusiasm vs. evidence-based skepticism.

Simon Willison (simonwillison.net / simonw.substack.com) was the period's most consequential synthesizer. His October 2025 'Vibe engineering' post coined the term, distinguishing responsible professional AI use from irresponsible shortcuts. By December 2025 he had declared that the core job is delivering code proven to work, shifting the frame from generation to verification. His February 2026 deep-dive into StrongDM named the 'dark factory' operating model, in which no human writes or reviews code and AI-simulated QA swarms run 24/7. In April 2026, via Lenny's Newsletter, he declared November 2025 the inflection point where coding agents crossed from 'mostly works' to 'actually works', and named the bottleneck shift from writing code to verifying it.

Addy Osmani (addyo.substack.com), writing from his Chrome VP Engineering role, provided the complementary practitioner voice: his July 2025 'AI-Native Software Engineer' mapped multi-model rotation workflows; his January 2026 '80% Problem' documented the role inversion from implementer to orchestrator; and his March 2026 'Comprehension Debt' essay, citing an Anthropic study showing a 17% comprehension drop among AI-assisted engineers, became the period's most widely shared independent analysis of AI-induced cognitive risk.

Rob Bowley (blog.robbowley.net) supplied the hardest empiricism. His November 2025 analysis of the DX report showed that AI won't rescue a poor engineering culture; his January 2026 post argued that coding was never the bottleneck; and his April 2026 reading of CircleCI's 28-million-workflow dataset showed daily runs up 59% year over year while main-branch success rates fell to a 5-year low of 70.8% and recovery times rose 13%, with only 1 in 20 teams capturing measurable delivery benefit.
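Bowley's headline numbers are worth grounding in the arithmetic behind them. The sketch below shows how a main-branch success rate and a recovery time can be computed from per-run CI data; the `WorkflowRun` shape is an illustrative assumption, not CircleCI's actual schema or methodology.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class WorkflowRun:
    branch: str
    succeeded: bool
    finished_at: datetime

def main_branch_success_rate(runs: list[WorkflowRun]) -> float:
    """Fraction of main-branch workflow runs that finished green."""
    main = [r for r in runs if r.branch == "main"]
    return sum(r.succeeded for r in main) / len(main)

def mean_time_to_recovery(runs: list[WorkflowRun]) -> timedelta:
    """Average gap between main going red and the next green run."""
    main = sorted((r for r in runs if r.branch == "main"),
                  key=lambda r: r.finished_at)
    recoveries, broke_at = [], None
    for run in main:
        if not run.succeeded and broke_at is None:
            broke_at = run.finished_at        # start of a red streak
        elif run.succeeded and broke_at is not None:
            recoveries.append(run.finished_at - broke_at)
            broke_at = None                   # main is green again
    return sum(recoveries, timedelta()) / len(recoveries)
```

Read against these definitions, a 70.8% success rate means roughly 3 in 10 main-branch runs fail, so a 59% rise in daily runs compounds into substantially more red builds to triage, which is exactly the verification load described above.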
Supporting this core trio, Luca Rossi's Refactoring.fm (170,000+ subscribers) framed the moment as 'the era of the software factory', in which CI greenness is the scarce resource. Hugo Bowne-Anderson's Vanishing Gradients synthesized findings from 1,365+ production deployments, showing that agent sprawl drives teams back toward structured workflows. The General Partnership's brownfield guide and Tom Elliott's Friday Deploy newsletter documented the greenfield/brownfield performance gap from practitioner experience. Oliver Patel's Enterprise AI Governance newsletter, written from his position at AstraZeneca, provided the governance counterpart, reviewing EU AI Act compliance, agentic risk frameworks, and SOC 2 audit-trail expectations. Ben Thompson's Stratechery pieces on 'Agents Over Bubbles' and 'Microsoft and Software Survival' reframed the strategic stakes: agent harnesses, not model intelligence, are the decisive competitive layer, shaping how engineering leaders should evaluate control-plane build-vs-buy decisions.

Across the full body of work, independent bloggers consistently found that AI tools dramatically accelerate code generation while exposing every existing weakness in review processes, ownership structures, and integration pipelines. The most durable adopters were those who invested in verification infrastructure, structured workflows, and governance controls rather than raw coding throughput.
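The 'structured workflow' pattern that Bowne-Anderson's deployment data keeps pointing back to is simple to state in code: a fixed sequence of steps, each gated by an explicit verification check, rather than a free-running agent loop. A minimal sketch follows, under assumed names; `transform` stands in for whatever model call a team uses and is not any specific SDK's API.

```python
from typing import Callable

# One workflow step: a name, a transform (typically a model call), and a
# verification gate that must pass before the output flows downstream.
Step = tuple[str, Callable[[str], str], Callable[[str], bool]]

def run_workflow(task: str, steps: list[Step], max_attempts: int = 3) -> str:
    """Run fixed steps in order; retry each step until its gate passes."""
    artifact = task
    for name, transform, verify in steps:
        for _ in range(max_attempts):
            candidate = transform(artifact)
            if verify(candidate):          # nothing unverified moves on
                artifact = candidate
                break
        else:
            raise RuntimeError(f"step {name!r} never passed verification")
    return artifact

# Hypothetical wiring: draft a patch with one model, then let the gate run
# the test suite rather than trusting the model's self-report.
# steps = [("draft", draft_patch, tests_pass), ("docs", write_docs, lint_ok)]
```

On this reading, the gates rather than the model calls carry the reliability; an autonomous agent collapses those gates into the model's own judgment, which is the failure mode the 1,365-deployment dataset documents.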
Sources
| ID | Title | Outlet | Date | Significance |
|---|---|---|---|---|
| b1 | Vibe engineering | Simon Willison's Newsletter (Substack) | 2025-10 | Willison coins 'vibe engineering' to distinguish responsible professional AI-assisted development from Karpathy's irresponsible 'vibe coding,' establishing the accountability framework that structured much subsequent practitioner discourse. |
| b2 | Your job is to deliver code you have proven to work | Simon Willison's Weblog | 2025-12 | Willison shifts the engineering frame from code generation to verification, arguing the scarce resource in AI-assisted development is now proven correctness rather than written lines. |
| b3 | How StrongDM's AI team build serious software without even looking at the code | Simon Willison's Weblog | 2026-02 | Documents the 'dark factory' operating model — no human writes or reads code, AI-simulated QA swarms run 24/7 — naming the most radical agentic delivery pattern observed in the wild as of early 2026. |
| b4 | Eight years of wanting, three months of building with AI | Simon Willison's Weblog | 2026-04 | First-person practitioner account of what changed when coding agents became genuinely capable, providing a longitudinal perspective on the November 2025 inflection point from a credible long-track author. |
| b5 | Agentic Engineering Patterns | Simon Willison's Newsletter (Substack) | 2026-03 | Launches Willison's structured pattern library for coding-agent workflows, defining 'agentic engineering' as the professional discipline that emerged from 'vibe engineering' and codifying practices around tool permissions, verification gates, and context management. |
| b6 | An AI state of the union: We've passed the inflection point, dark factories are coming, and automation timelines | Lenny's Newsletter (guest: Simon Willison) | 2026-04 | Willison declares November 2025 the inflection point where coding agents crossed from 'mostly works' to 'actually works,' and names the bottleneck shift from writing to verifying code, reaching a large product-engineering audience. |
| b7 | The AI-Native Software Engineer | Elevate by Addy Osmani (Substack) | 2025-07 | Maps the full AI-native workflow from model selection (trying multiple LLMs in parallel) to iterative prompting and verification, providing the practitioner reference point for tool and model adoption patterns at the start of the period. |
| b8 | The 80% Problem in Agentic Coding | Elevate by Addy Osmani (Substack) | 2026-01 | Documents the role inversion from implementer to orchestrator, showing that AI handles the first 80% of any task easily but the last 20% — which requires judgment, debugging, and integration — still demands senior engineering skill. |
| b9 | Comprehension Debt — the hidden cost of AI generated code | Addy Osmani's Blog (also published on Medium and O'Reilly Radar) | 2026-03 | Introduces 'comprehension debt' — the growing gap between code volume and human understanding — citing an Anthropic study showing a 17% comprehension drop among AI-assisted engineers, making this the period's most widely cited independent analysis of AI-induced cognitive risk. |
| b10 | My LLM coding workflow going into 2026 | Elevate by Addy Osmani (Substack) | 2025-12 | Practical practitioner workflow covering multi-model rotation, context management, prompt structuring, and verification practices — a primary reference for teams designing model-selection and SDK integration policies. |
| b11 | Patterns from over 1,365 AI Production Deployments | Vanishing Gradients by Hugo Bowne-Anderson (Substack) | 2025-12 | Synthesizes 1,365+ real-world LLM deployments showing that high error rates and 'agent sprawl' force teams toward structured workflows rather than autonomous agents, providing the broadest empirical base in the independent blog space. |
| b12 | Stop Building Agents | Vanishing Gradients by Hugo Bowne-Anderson (Substack) | 2025 | Argues that most teams should default to structured AI workflows rather than autonomous agents, based on reliability data from production deployments — a key counter-narrative to the agentic hype cycle. |
| b13 | The Era of the Software Factory | Refactoring by Luca Rossi (Substack, 170,000+ subscribers) | 2026-02 | Frames the post-inflection-point era as 'CI engineering' where code generation is abundant and green CI is the scarce resource, tying together the CircleCI data with an engineering management perspective for a large practitioner audience. |
| b14 | AI Governance in 2025: a year in review | Enterprise AI Governance by Oliver Patel (Substack) | 2026-01 | Provides the most structured independent review of AI governance evolution in 2025, covering EU AI Act compliance, agentic risk frameworks, and the tension between developer autonomy and enterprise auditability, written by AstraZeneca's Enterprise AI Governance Lead. |
| b15 | The Ultimate Agentic AI Governance Resource Guide | Enterprise AI Governance by Oliver Patel (Substack) | 2026-02 | Collects the governance patterns, policy-as-code approaches, and audit trail requirements emerging for agentic AI in engineering workflows, covering SOC 2, ISO 27001, and separation of duties concerns. |
| b16 | A Practical Guide to Brownfield AI Development | The General Partnership (Substack) | 2026-02 | Provides the most actionable independent guide for applying AI agents to legacy codebases, emphasizing agent-readable documentation, architectural decision records, and incremental oversight as brownfield-specific mitigations. |
| b17 | AI can't handle your legacy codebase? This might be why. | The Friday Deploy by Tom Elliott (Substack) | 2025 | Practitioner analysis of AI failure modes in brownfield systems, identifying missing conventions and context-window limits as primary causes and offering CI/CD-centric mitigation patterns. |
| b18 | More code, less delivery — does the CircleCI 2026 Report really show 1 in 20 teams are benefiting? | Rob Bowley's Blog | 2026-04 | The sharpest independent critique of AI delivery productivity data, dissecting CircleCI's 28-million-workflow dataset to show that only 1 in 20 teams capture meaningful delivery benefit and that main-branch success rates hit a 5-year low of 70.8%. |
| b19 | Coding has never been the bottleneck | Rob Bowley's Blog | 2026-01 | Challenges the premise that faster code generation improves delivery, arguing the actual bottlenecks are review, integration, and validation — which AI tools currently worsen rather than help. |
| b20 | Findings from DX's 2025 report: AI won't save you from your engineering culture | Rob Bowley's Blog | 2025-11 | Independent analysis of the DX 2025 developer productivity report showing that AI adoption outcomes correlate strongly with pre-existing engineering culture quality, contradicting vendor claims that tools alone drive gains. |
| b21 | Agents Over Bubbles | Stratechery by Ben Thompson | 2026-03 | Thompson's most explicit analysis of the AI investment thesis in the agentic era, arguing that agent harnesses — not model intelligence — are the decisive competitive layer, directly informing how engineering leaders should evaluate control-plane investments. |
| b22 | Microsoft and Software Survival | Stratechery by Ben Thompson | 2026 | Analyzes how AI agents reshape SaaS software economics, including per-seat licensing viability and the rise of horizontal agent orchestration layers — relevant to engineering platform teams evaluating build vs. buy decisions for AI control planes. |
| b23 | Engineering the Agentic Era: A System Pilot Playbook for 2026 | Intellegen (Substack) | 2026 | Defines the 'system pilot' role (engineer as designer and operator of the agent ecosystem) and specifies MCP-based control plane patterns including real-time audit logs, session monitoring, and enterprise-grade identity for agentic engineering platforms; a sketch of the audit-log pattern follows this table. |
| b24 | The Future of Software Engineering with AI: Six Predictions | The Pragmatic Engineer by Gergely Orosz (Substack) | 2025 | From the engineering newsletter with the largest practitioner readership (~600,000), Orosz synthesizes how Claude Code, Cursor, and GitHub Copilot are restructuring team workflows, covering agentic ticket execution, role shifts, and the engineering leadership challenges of governing AI toolchains. |
| b25 | The Brownfield Problem: Why Most AI Development Advice Ignores Your Actual Codebase | jjmasse.com (personal engineering blog) | 2026-03 | Identifies the 'brownfield tax' — AI comprehension degrades as legacy file size increases — and documents cross-session forgetting and output stochasticity as brownfield-specific failure modes, with a 19% net slowdown finding for experienced open-source contributors using AI on their own mature repos. |
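Two of the sources above (b15, b23) converge on the same control-plane primitive: an append-only audit trail of every tool call an agent makes, attributable to a session. A minimal sketch of that pattern, assuming an illustrative JSONL log and record shape rather than any framework's actual schema:

```python
import json
import time
import uuid
from typing import Any, Callable

def audited(tool: Callable[..., Any], session_id: str,
            log_path: str = "audit_log.jsonl") -> Callable[..., Any]:
    """Wrap a tool so every invocation appends an audit record."""
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        record = {
            "id": str(uuid.uuid4()),          # unique per invocation
            "session": session_id,            # ties the call to an agent run
            "tool": tool.__name__,
            "args": repr(args),
            "kwargs": repr(kwargs),
            "started": time.time(),
        }
        try:
            result = tool(*args, **kwargs)
            record["outcome"] = "ok"
            return result
        except Exception as exc:
            record["outcome"] = f"error: {exc}"
            raise
        finally:
            record["finished"] = time.time()
            with open(log_path, "a") as f:    # append-only trail
                f.write(json.dumps(record) + "\n")
    return wrapper

# Hypothetical usage: expose only audited tools to the agent harness.
# safe_ls = audited(list_directory, session_id="session-123")
```

Separation-of-duties and SOC 2 concerns (b15) then reduce to questions the log can answer: which identity invoked which tool, in which session, with what outcome.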