Research · Blogs & Independent Thinkers

Research sweep · deep · 2025 – 2026

Agentic Engineering And Enterprise Architecture Discipline

Agentic engineering after Andrej Karpathy's vibe coding meme, April 2025-April 2026: how AI coding agents are changing enterprise software engineering across security, testability, reliability, maintainability, availability, resilience, observability, operability, cost, recovery, and engineering governance.

Synthesised 2026-04-30

Narrative

The strongest through-line in independent coverage is that the market has moved beyond the “vibe coding” meme toward a more serious discipline centered on harnesses, context, tests, and operational controls. Simon Willison and Addy Osmani are the clearest on the terminology shift: vibe coding means accepting generated code without reviewing it, while the professional mode requires specs, diff review, test suites, and explicit accountability. Thoughtworks’ Birgitta Böckeler adds the operational layer, arguing that context engineering and harness engineering are now the practical center of gravity for agentic work. Cloudflare’s April 2026 posts show the same pattern in enterprise infrastructure terms: durable execution, sandboxed code, identity-aware auth, and long-running sessions are becoming the substrate for production agents.

The second major theme is that agentic engineering increases the importance of classic engineering disciplines rather than replacing them. Security-focused work from Thoughtworks and OpenAI highlights new attack surfaces, prompt and context poisoning, and the need for monitoring and least privilege. Analyses published on LessWrong, including METR's research update, push back on inflated benchmark claims, showing that agents often score better under algorithmic evaluation than they do on real code quality, maintainability, and usability. Across the sample, the credible claim is not that agents remove software engineering constraints, but that they make those constraints more visible and less optional.


Sources

| ID | Title | Outlet | Date | Significance |
|---|---|---|---|---|
| b1 | Not all AI-assisted programming is vibe coding (but vibe coding rocks) | Simon Willison's Weblog | 2025-03 | Defines vibe coding narrowly and argues for separating reckless prompt-only coding from disciplined AI-assisted engineering. |
| b2 | Two publishers and three authors fail to understand what “vibe coding” means | Simon Willison's Weblog | 2025-05 | Shows the term immediately being stretched beyond Karpathy’s original meaning, clarifying the vocabulary problem the lane is tracking. |
| b3 | Vibe engineering | Simon Willison's Weblog | 2025-10 | Introduces a disciplined middle ground between meme-driven vibe coding and production-grade engineering. |
| b4 | Claude Code for web—a new asynchronous coding agent from Anthropic | Simon Willison's Weblog | 2025-10 | Treats asynchronous coding agents as a distinct operational form factor, not just a better autocomplete. |
| b5 | Claude Code Can Debug Low-level Cryptography | Simon Willison's Weblog | 2025-11 | Provides a serious security-adjacent example where agents are useful as debugging assistants without being trusted to write final code. |
| b6 | mistralai/mistral-vibe | Simon Willison's Weblog | 2025-12 | Notes the emerging terminal-agent pattern and the consolidation of coding agents into a recognizable tooling category. |
| b7 | GLM-5: From Vibe Coding to Agentic Engineering | Simon Willison's Weblog | 2026-02 | Captures the shift in naming from vibe coding toward agentic engineering as the professional framing becomes clearer. |
| b8 | Linear walkthroughs | Simon Willison's Weblog | 2026-02 | Shows agents being used for codebase comprehension and recovery, not just generation. |
| b9 | Introducing Showboat and Rodney, so agents can demo what they’ve built | Simon Willison's Weblog | 2026-02 | Highlights the need for proof artifacts and manual verification when agents produce software. |
| b10 | Ladybird adopts Rust, with help from AI | Simon Willison's Weblog | 2026-02 | A strong case study for human-directed, high-rigor agent use on critical code with extensive tests. |
| b11 | Agentic Engineering | AddyOsmani.com | 2026-02 | Explicitly distinguishes vibe coding from production-grade agentic work and argues for specs, review, and testing. |
| b12 | Stop Using /init for AGENTS.md | AddyOsmani.com | 2026-02 | Argues that useful agent instructions must encode non-discoverable project knowledge, not boilerplate. |
| b13 | The Factory Model: How Coding Agents Changed Software Engineering | AddyOsmani.com | 2026-02 | Frames coding agents as a change in software production model while insisting engineering constraints still matter. |
| b14 | Scaffolding | AddyOsmani.com | 2026 | Makes the case that types, linting, tests, CI, and conventions are the trellis that keeps agent output on track. |
| b15 | Harness engineering for coding agent users | martinfowler.com / Thoughtworks | 2026-04 | One of the clearest pieces on feedforward controls, feedback sensors, behavior harnesses, and harnessability. |
| b16 | Context Engineering for Coding Agents | martinfowler.com / Thoughtworks | 2026-02 | Explains how context curation, rules, skills, and specs become core engineering inputs for coding agents. |
| b17 | Autonomous coding agents: A Codex example | martinfowler.com / Thoughtworks | 2025-06 | Separates supervised from autonomous coding agents and describes their operating model in practical terms. |
| b18 | Coding Assistants Threaten the Software Supply Chain | martinfowler.com / Thoughtworks | 2025-05 | A strong security-focused analysis of new attack surfaces introduced by agent loops, MCP, and rules files. |
| b19 | Building your own CLI Coding Agent with Pydantic-AI | martinfowler.com / Thoughtworks | 2025-08 | Shows why teams may need custom agents tuned to their testing, documentation, and file-system standards. |
| b20 | Exploring Generative AI | martinfowler.com / Thoughtworks | 2025-07 | A useful hub page for a run of practical memos on how AI is changing software delivery practice. |
| b21 | AI Agent Benchmarks Are Broken | LessWrong | 2025-07 | Argues that benchmark design can overstate agent capability by large margins, which matters for enterprise claims. |
| b22 | METR Research Update: Algorithmic vs. Holistic Evaluation | LessWrong | 2025-08 | Shows that agents can look good under algorithmic scoring while failing on real-world code quality and usability. |
| b23 | OpenAI: How we monitor internal coding agents for misalignment | LessWrong | 2026-03 | Surfaces concrete monitoring practices and misalignment failure modes from real internal coding-agent deployments. |
| b24 | Dynamic, identity-aware, and secure Sandbox auth | Cloudflare Blog | 2026-04 | Explains sandboxed execution and identity-aware auth as core infrastructure for untrusted agent workloads. |
| b25 | Project Think: building the next generation of AI agents on Cloudflare | Cloudflare Blog | 2026-04 | Describes durable execution, sub-agents, persistent sessions, and sandboxed code as the substrate for long-running agents. |
