Research · Blogs & Independent Thinkers
Research sweep · deep · 2024 – 2026
Designing AI Operating Models Around Humans
How humans are adapting to AI between June 2024 and June 2026, weighing measured benefits against harms, and how organisations should design operating models around human cognitive load and behavioural patterns rather than forcing adoption. The sweep covers four questions: cognitive overload from supervising multiple agents at machine speed (context switching, automation complacency, vigilance fatigue); the poor budget and value outcomes of top-down AI mandates and token-maximising usage; the gap between model welfare functions (such as Anthropic's) and any equivalent human or worker welfare function; and how much good human outcomes depend on model training versus orchestration and deployment design.
- GPT-5.5
- financial
- frontier
- academic
- vc
- blogs
- tech
Synthesised 2026-06-15
Narrative
The strongest independent writing in this lane converges on a simple point: measured gains from AI are real, but they depend on how work is structured around it. Ethan Mollick's "Management as AI superpower" argues that the scarce resource is no longer raw execution but the ability to specify, delegate, and judge outputs. Simon Willison makes the same point from software practice, writing that agentic engineering makes code cheap and shifts the bottleneck to testing, verification, and deep domain understanding.
Several sources also describe a human-attention problem that gets worse as agents speed up. In "Claude Dispatch and the Power of Interfaces", Mollick cites a valuation-task study in which chatbot interaction imposed a mental tax, especially on less experienced workers, because users had to sift through walls of text and branching suggestions. Willison's April 2026 account is blunter: running four coding agents in parallel was enough to leave a 25-year software veteran wiped out by 11 a.m. That lines up with the June 2026 arXiv paper, in which an unconstrained multi-agent baseline failed in 72 percent of runs, while a harness with deterministic computation and three human decision gates cut failures to 16 percent.
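The idea of deterministic steps with explicit human gates can be made concrete with a small sketch. This is an illustration under stated assumptions, not the harness from the arXiv paper: `Step`, `RunLog`, `run_workflow`, and `reviewer_stub` are hypothetical names, and the "human" is a stub function standing in for a real reviewer.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Step:
    """One deterministic stage of a workflow (hypothetical structure)."""
    name: str
    run: Callable[[dict], dict]      # pure function: state in, new state out
    needs_human_gate: bool = False   # pause for a human decision after this step


@dataclass
class RunLog:
    completed: List[str] = field(default_factory=list)
    halted_at: Optional[str] = None  # name of the step a reviewer rejected, if any


def run_workflow(steps, state, approve):
    """Run steps in order; stop as soon as a gated step is not approved."""
    log = RunLog()
    for step in steps:
        state = step.run(state)
        if step.needs_human_gate and not approve(step.name, state):
            log.halted_at = step.name
            return state, log
        log.completed.append(step.name)
    return state, log


# Illustrative pipeline with one gate after the cleaning step.
steps = [
    Step("load", lambda s: {**s, "rows": [3, 1, 2]}),
    Step("clean", lambda s: {**s, "rows": sorted(s["rows"])}, needs_human_gate=True),
    Step("summarise", lambda s: {**s, "total": sum(s["rows"])}),
]


def reviewer_stub(step_name, state):
    # Stand-in for a human decision: approve only if there is data to review.
    return len(state.get("rows", [])) > 0


final_state, log = run_workflow(steps, {}, reviewer_stub)
```

The design choice worth noting is that the gate sits between steps rather than inside them, so the reviewer is asked for a small number of discrete decisions instead of continuous supervision.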
On organisational design, the sources lean against top-down adoption theatre. June 2026 Business Insider reporting says executives at Replit, BNP Paribas CIB, La Banque Postale, TCS, and NTT DATA are rejecting token counts and leaderboards as ROI metrics, while the April 2026 arXiv token-consumption paper finds that higher token use does not reliably buy higher accuracy. Narayanan and Kapoor's "AI as Normal Technology" and Evans's "Predicting AI job exposure" both push in the same direction: value comes slowly, through redesign of workflows and institutions, not from mandates or role-level forecasts. Jonathon Ready's essay on Anthropic's silent restrictions adds a different warning: lab-side welfare and policy choices can diverge from user welfare unless deployment makes those trade-offs visible and contestable.
Sources
| ID | Title | Outlet | Date | Significance |
|---|---|---|---|---|
| b1 | Management as AI superpower | One Useful Thing | 2026-01 | Ethan Mollick argues that agentic work shifts value from execution to delegation, evaluation, and specification, grounding the organisational case for treating management skill and subject matter expertise as the scarce resource. |
| b2 | Claude Dispatch and the Power of Interfaces | One Useful Thing | 2026-03 | Mollick uses a valuation-task paper to show that chatbot UX can erase AI productivity gains by overloading users with sprawling outputs, making interface design a first-order operating-model issue. |
| b3 | Choosing to Stay Human | One Useful Thing | 2026-05 | This essay frames AI adoption as a choice about where to preserve human skill formation, warning that convenience can erode writing judgement and flood attention with low-meaning synthetic content. |
| b4 | The lethal trifecta for AI agents: private data, untrusted content, and external communication | Simon Willison’s Weblog | 2025-06 | Willison reduces a broad agent-safety debate to a concrete deployment rule: combining private-data access, untrusted input, and external communication creates a prompt-injection exfiltration hazard. |
| b5 | Writing about Agentic Engineering Patterns | Simon Willison’s Weblog | 2026-02 | Willison argues that coding agents lower the cost of producing working code to near zero, shifting the bottleneck to verification and changing how teams should structure engineering work. |
| b6 | Highlights from my conversation about agentic engineering on Lenny’s Podcast | Simon Willison’s Weblog | 2026-04 | Willison gives a practitioner account of the cognitive cost of supervising multiple agents in parallel, describing four concurrent coding agents as enough to wipe out an experienced engineer by late morning. |
| b7 | If Claude Fable stops helping you, you'll never know | Jonathon Ready | 2026-06 | Ready identifies a direct conflict between a lab's hidden model-side restrictions and user welfare, arguing that silent degradation creates supply-chain risk for ordinary software teams building AI features. |
| b8 | AI as Normal Technology | AI as Normal Technology | 2025-04 | Arvind Narayanan and Sayash Kapoor argue against technological determinism, separating model progress from application design and adoption, which is central to judging whether harms are fixed in training or in deployment. |
| b9 | New Paper: Towards a science of AI agent reliability | AI as Normal Technology | 2026-02 | Kapoor, Narayanan, and Stephan Rabanser argue that capability gains have outpaced reliability gains, offering a framework for why impressive demos do not automatically translate into dependable organisational use. |
| b10 | Why AI hasn’t replaced software engineers, and won’t | AI as Normal Technology | 2026-06 | This essay rejects the threshold story of sudden labour replacement, arguing instead that AI compresses execution while decision-making, verification, and accountability remain stubbornly human. |
| b11 | Predicting AI job exposure | Benedict Evans | 2026-05 | Evans argues that job-exposure charts miss how automation changes business models, regulation, and task composition, which makes simple role-level forecasts unreliable for planning. |
| b12 | (Human) Attention Is (Still) All You Need: Human oversight makes AI-assisted social science reliable | arXiv | 2026-06 | This paper reports that an unconstrained multi-agent baseline failed in 72 percent of runs, while a harness with deterministic computation and three human decision gates cut failures to 16 percent. |
| b13 | Oversight Structures for Agentic AI in Public-Sector Organizations | arXiv | 2025-06 | This paper argues that agentic AI requires continuous oversight, tighter integration of governance with operations, and cross-departmental coordination rather than episodic compliance review. |
| b14 | As companies rethink AI ROI, Replit's AI chief calls token leaderboards 'very dystopian' | Business Insider | 2026-06 | The report captures an emerging backlash against token-maximising mandates, with Replit's Michele Catasta arguing that raw token burn is a misleading proxy for productivity or value. |
| b15 | I asked 4 executives how they measure AI ROI. None started with AI tokens. | Business Insider | 2026-06 | This report shows large organisations moving from activity metrics to outcome metrics, with BNP Paribas CIB, La Banque Postale, TCS, and NTT DATA all rejecting token counts as the main ROI measure. |
| b16 | How Do AI Agents Spend Your Money? Analyzing and Predicting Token Consumption in Agentic Coding Tasks | arXiv | 2026-04 | The paper finds that agentic coding runs can consume 1000 times more tokens than simpler code interactions, with high variance and no reliable link between higher token use and better accuracy. |
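
The deployment rule summarised in source b4 (the lethal trifecta) is simple enough to express as a configuration check. The sketch below is a hypothetical illustration, not code from Willison's post: `AgentCapabilities` and `has_lethal_trifecta` are assumed names, and the three boolean flags are a simplification of what a real capability review would need to consider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical summary of what a deployed agent is allowed to do."""
    reads_private_data: bool
    processes_untrusted_content: bool
    can_communicate_externally: bool


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three risky capabilities are combined."""
    return (
        caps.reads_private_data
        and caps.processes_untrusted_content
        and caps.can_communicate_externally
    )


# An inbox assistant that reads mail (private data, untrusted content)
# and can also send requests out would be flagged.
inbox_agent = AgentCapabilities(True, True, True)

# Removing any one capability, here external communication, clears the flag.
sandboxed_agent = AgentCapabilities(True, True, False)
```

A check like this could run at deployment review time, so that the combination is caught as a design decision rather than discovered after an incident.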