AI on Deterministic Rails

How AI and traditional deterministic software formed a symbiotic stack between January 2025 and June 2026. The brief covers the enterprise "PoC-opalypse" and the shift from token consumption to durable agentic adoption patterns; AI using software-encoded workflows as guardrails for variance and error control rather than replacing them; the frontier moving from raw model capability to model orchestration and harness design (Claude Code, OpenCode, Pi); right-sizing with smaller and open-weight models (Llama, Qwen, DeepSeek, Mistral) for cheap routine automation and private inference; and the token-pricing economics behind enterprise sticker shock over agentic spend versus delivered value.

Claude Opus 4.8
financial
frontier
academic
vc
blogs
tech

Synthesised 2026-06-07

Narrative

The dominant finding across McKinsey, Gartner, Bain, and a16z research for 2025–2026 is structural: AI adoption is near-universal, but durable enterprise value remains concentrated in a small minority. McKinsey's November 2025 State of AI survey (1,993 respondents) found 88% of organisations using AI in at least one function, yet only 39% reporting EBIT impact and just 5.5% qualifying as high performers with more than 5% of EBIT attributable to AI. Gartner's successive figures provide the clearest quantitative frame for the so-called PoC-opalypse: a forecast of 30% GenAI PoC abandonment by end of 2025, revised in its 2026 analysis to at least 50% actually abandoned, and a further prediction that over 40% of agentic AI projects will be cancelled by end of 2027. S&P Global adds a sharp data point: the share of enterprises scrapping most AI initiatives jumped from 17% in 2024 to 42% in 2025.

The token-cost crisis arrived sharply in early 2026. Per-token prices fell roughly 98% from late 2022 levels, yet enterprise AI bills rose an estimated 320% over the same period, and the average annual enterprise AI budget grew from $1.2 million in 2024 to $7 million in 2026. The mechanism is the agentic multiplier: Gartner's March 2026 analysis found that agentic models require 5–30 times more tokens per task than standard chatbots, because each step in a multi-agent loop resends the full context window. Uber exhausting its entire 2026 AI coding budget by April became the emblematic case; the FinOps Foundation reported that companies were calling in April to say they were already 3x over their full-year token budgets. Goldman Sachs projects that global token consumption will multiply 24x by 2030. Bain's parallel finding - that the top 5% of users consume more tokens than the other 95% combined - locates the cost problem precisely in the highest-value, hardest-to-throttle engineers.
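The multiplier follows from simple arithmetic: when every step resends the accumulated context, billed input tokens grow roughly quadratically with the number of steps. The sketch below is illustrative only. The function name and the figures (a 2,000-token prompt, 500 tokens appended per step, 8 steps) are assumptions chosen for the example, not numbers from the cited sources, and output tokens are ignored.

```python
def agent_input_tokens(base_prompt: int, tokens_per_step: int, steps: int) -> int:
    """Total input tokens billed when each step resends the whole context.

    All parameters are hypothetical inputs for illustration.
    """
    total = 0
    context = base_prompt
    for _ in range(steps):
        total += context            # the full context is billed again at every step
        context += tokens_per_step  # tool output / model reply carried into the next step
    return total


# Single-turn chat baseline: the prompt is sent once.
chat_tokens = 2_000
agent_tokens = agent_input_tokens(base_prompt=2_000, tokens_per_step=500, steps=8)

print(agent_tokens)                # 30000
print(agent_tokens / chat_tokens)  # 15.0
```

Under these assumed inputs the eight-step loop bills 15 times the input tokens of a single call, which sits inside the 5–30x range quoted above. Doubling the step count more than doubles the total, which is why long-horizon tasks dominate spend.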

The open-weight model story changed the cost calculus for private inference. DeepSeek's V3 release in December 2024 and R1 in January 2025 demonstrated frontier-comparable reasoning, with V3 reportedly trained for approximately $6 million in compute and inference costs reported at 12.5x cheaper than Claude 3.5 Sonnet. By H1 2026, DeepSeek V4 had reduced cost-to-serve by over 10x versus V3.2 through hybrid sparse attention, and the broader open-weight stack (Qwen 3, Llama 4, Mistral) had closed to within 5% of closed-frontier models on coding and reasoning benchmarks. Sovereign-cloud deployment patterns consolidated around vLLM and air-gapped clusters running quantised DeepSeek V4 or Qwen 3.6; switching to self-hosted open models reportedly offers 70–90% cost savings for organisations processing millions of tokens daily.

On the investment side, CB Insights documented $66.6 billion in AI funding in Q1 2025 alone - nearly two-thirds of all 2024 AI investment - with agentic AI platforms and multi-agent orchestration recording CB Insights Mosaic health scores averaging 705+, among the highest across all industries. Sequoia's 2026 AI Ascent named Claude Code and long-horizon agentic coding as the first genuinely discontinuous capability shift, distinct from the incremental gains of 2023–2025. Successive a16z CIO surveys tracked the transition from model interchangeability to harness and workflow lock-in: as one enterprise leader told a16z researchers, all prompts have been tuned for a specific provider, and switching now risks breaking downstream agent dependencies. The symbiotic-stack thesis - AI layered onto deterministic workflow infrastructure rather than replacing it - found its clearest expression in a16z's adoption analysis, which identified verifiable outputs and well-defined standard operating procedures as the primary selection criteria for successful deployments.
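The pattern of verifiable outputs checked against a defined procedure can be sketched as a model proposal wrapped in deterministic validation. This is a minimal illustration, not a method described in the cited research: the refund scenario, `RefundProposal`, `validate_refund`, `run_on_rails`, and the stubbed `fake_model` are all hypothetical names invented for the example, and the stub stands in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class RefundProposal:
    order_id: str
    amount_cents: int


def validate_refund(p: RefundProposal, order_total_cents: int,
                    policy_cap_cents: int) -> List[str]:
    """Deterministic SOP checks; an empty list means the proposal passes."""
    errors = []
    if p.amount_cents <= 0:
        errors.append("amount must be positive")
    if p.amount_cents > order_total_cents:
        errors.append("amount exceeds order total")
    if p.amount_cents > policy_cap_cents:
        errors.append("amount exceeds policy cap")
    return errors


def run_on_rails(propose: Callable[[List[str]], RefundProposal],
                 validate: Callable[[RefundProposal], List[str]],
                 max_attempts: int = 2) -> Optional[RefundProposal]:
    """Accept a proposal only if the deterministic checks pass.

    Validation errors are fed back to the proposer; after max_attempts
    failures the function returns None so the caller can escalate.
    """
    feedback: List[str] = []
    for _ in range(max_attempts):
        proposal = propose(feedback)
        feedback = validate(proposal)
        if not feedback:
            return proposal
    return None


def fake_model(feedback: List[str]) -> RefundProposal:
    # Stand-in for a model call: overshoots first, corrects after feedback.
    amount = 12_000 if not feedback else 4_500
    return RefundProposal("A-1001", amount)


result = run_on_rails(fake_model, lambda p: validate_refund(p, 5_000, 10_000))
print(result)  # RefundProposal(order_id='A-1001', amount_cents=4500)
```

The design choice is that the variable component only proposes, while acceptance, retry limits, and escalation stay in ordinary code that can be tested and audited.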

Sources

ID	Title	Outlet	Date	Significance
v1	How 100 Enterprise CIOs Are Building and Buying Gen AI in 2025	Andreessen Horowitz (a16z)	2026-02	Third annual a16z CIO survey documents the shift from model interchangeability to agentic workflow lock-in, with 81% of enterprises now using three or more model families and innovation-budget share collapsing from 25% to 7% of LLM spend.
v2	Leaders, Gainers and Unexpected Winners in the Enterprise AI Arms Race	Andreessen Horowitz (a16z)	2026-02	Quantifies Anthropic's 25-percentage-point enterprise penetration gain since May 2025, names Claude Code and software development as the primary vector, and notes that 65% of enterprises prefer incumbent solutions for trust and procurement simplicity.
v3	Where Enterprises Are Actually Adopting AI	Andreessen Horowitz (a16z)	2026-04	Frames the adoption thesis around verifiability and defined standard operating procedures, explaining why customer support and software development lead adoption while industries lacking verifiable outputs lag.
v4	The State of AI in 2025: Agents, Innovation, and Transformation	McKinsey Global Institute / QuantumBlack	2025-11	Canonical 2025 enterprise survey (1,993 respondents, 105 countries) finding that 88% use AI but only 39% report EBIT impact; 23% are scaling agentic systems; high performers are 3x more likely to have fundamentally redesigned workflows.
v5	Gartner Predicts 30% of Generative AI Projects Will Be Abandoned After Proof of Concept by End of 2025	Gartner	2024-07	Foundational PoC-abandonment forecast, citing poor data quality, inadequate risk controls, escalating costs, and unclear business value as the four driving causes; later updated to 50% abandonment by Gartner's own revised analysis.
v6	Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027	Gartner	2025-06	Extends the PoC-failure thesis to the agentic era, warns of widespread 'agent washing' by vendors, and estimates only around 130 of the thousands of self-described agentic AI vendors possess genuine agentic capabilities.
v7	Why Half of GenAI Projects Fail: Avoid These 5 Common Mistakes	Gartner	2026-04	Updated Gartner analysis reporting that at least 50% of GenAI projects were abandoned after PoC by end of 2025, with poor use-case selection and lack of business-value metrics consistently topping the failure list.
v8	State of the Art of Agentic AI Transformation - Technology Report 2025	Bain & Company	2025-09	Introduces a four-level agentic maturity framework (information retrieval to multi-agent constellations), documents that tech-forward enterprises achieved 10–25% EBITDA gains in 2023–2024 by scaling Level 1–2 tools, and warns that cross-system orchestration (Level 3–4) will require fit-for-purpose builds and human-in-the-loop governance.
v9	The Future of Opex in the Agent Economy	Bain & Company	2026-05	Bain's empirical framing of token-cost concentration - top 5% of users consuming more tokens than the other 95% combined - and the forecast that token costs could displace 20–30% of headcount opex, without a clear migration glide-path.
v10	AI in 2026: A Tale of Two AIs	Sequoia Capital	2025-12	Sequoia's annual AI outlook naming adoption fatigue on DIY implementations as a tailwind for packaged AI startups, flagging data-centre delays as a supply constraint, and noting that only coding and ChatGPT have established themselves as undisputed killer applications.
v11	Sequoia AI Ascent 2026: The Future of AI	Sequoia Capital (via AI Opportunities newsletter)	2026-05	Sequoia partner Sonya Huang's 2026 AI Ascent framing: 2022–2024 was chat, 2024–2025 was reasoning models, 2026 is agents; characterises Claude Code and long-horizon agentic coding as the first genuinely discontinuous capability shift, not just an incremental improvement.
v12	Services Are the New Software - Sequoia's Julien Bek on AI-Native Services	Fortune / Sequoia Capital	2026-04	Sequoia partner Julien Bek articulates the margin-compression thesis for AI-service businesses (gross margins around 70% vs. 90% for pure SaaS) and documents real-time enterprise token-rationing behaviour after Anthropic capped Claude Code at peak hours.
v13	State of AI 2025 Report	CB Insights	2026-02	Full-year 2025 overview recording $200B-plus in AI venture funding, AI agent acquisitions comprising 10% of all AI M&A by value, and Salesforce as the most acquisitive buyer with 10 AI deals - reflecting incumbent capture of the agentic stack.
v14	State of AI Q1 2025 Report	CB Insights	2025-09	Documents AI funding surging 51% to $66.6B in Q1 2025, with the three largest acquisitions going to agentic AI companies; Mosaic scores for multi-agent systems and orchestration platforms averaging 705+ - among the highest health scores across all industries.
v15	The AI Agent Market Map	CB Insights	2026-03	Maps 400+ private agentic AI companies, noting the landscape tripled from roughly 300 to thousands since March 2025; identifies software development as the most revenue-active agent category and flags that reasoning-model inference costs are already driving pricing pressure in that segment.
v16	AI Token Economics for CFOs	Deloitte	2026-04	Practitioner-level CFO guidance documenting the structural shift from per-seat to consumption pricing, citing AT&T's 8-billion-tokens-per-day workload and subsequent 90% cost reduction through multi-agent architecture as a real-world cost-control case.
v17	Agentic AI Enterprise Token Cost	EY	2026-06	EY's Total Cost of Agents framework argues that token costs are only the visible component of a broader operating stack including infrastructure, governance, and engineering overhead - and that current API pricing may be structurally subsidised.
v18	Token Prices Fell 98%. Enterprise AI Bills Tripled. Now the Industry Wants a Standards Body to Explain Why.	The Next Web	2026-06	Reports that GPT-4-equivalent token costs fell 98% while enterprise AI bills rose an estimated 320%, with average enterprise AI budgets growing from $1.2M in 2024 to $7M in 2026; documents Uber exhausting its full 2026 AI coding budget by April and the Linux Foundation's Tokenomics Foundation launch.
v19	Uber, Microsoft, and Others Burning Through AI Budgets. Now What?	SmarterX	2026-06	Synthesises WSJ and Axios reporting on enterprise token-budget crises; cites Goldman Sachs projection that global token consumption will multiply 24x by 2030 and documents Uber's full-year AI coding budget exhausted in three months as the defining enterprise anecdote.
v20	DeepSeek's Release of an Open-Weight Frontier AI Model	International Institute for Strategic Studies (IISS)	2025-04	Authoritative independent analysis of DeepSeek V3/R1's economic significance: V3 reportedly trained for approximately $6M, with inference costs 12.5x cheaper than Claude 3.5 Sonnet and 15x cheaper than GPT-4o, reframing enterprise model economics.
v21	Open-Weight Models H1 2026: DeepSeek, Qwen, Llama Recap	Digital Applied	2026-05	Technical retrospective documenting that open-weight inference costs dropped roughly an order of magnitude versus H2 2025; details DeepSeek V4's architectural reset achieving 27% of V3.2's single-token inference FLOPs and sovereign-cloud deployment patterns consolidating around vLLM and air-gapped clusters.
v22	DeepSeek's New Models Offer Big Inference Cost Savings	The Register	2026-04	Technical coverage of DeepSeek V4's hybrid attention mechanism (CSA/HCA), reporting cost-to-serve drops over 10x versus V3.2 with roughly 10x less memory - the most significant open-weight serving efficiency improvement of 2026.
v23	Battery Ventures State of Enterprise Tech Spending: Q4 2025	Battery Ventures	2025-12	Survey of 100 CXOs representing $35B+ in annual tech spend, finding 33% already run agentic AI in production, 42% scaling across functions, and enterprises identifying an average of 88 Gen AI use cases with production deployments growing nearly 4x year over year.
v24	CB Insights: AI Agents Are Transforming Enterprise Operations and Driving Infrastructure Demand	CB Insights (via Crowdfund Insider)	2026-02	CB Insights survey of 59 executives finding that 80% consider AI agent adoption a strategic priority but 40% cannot track or are unaware of actual ROI - the clearest quantification of the measurement gap between agentic enthusiasm and documented value.
v25	Agentic AI Market Funding Trends 2026	New Market Pitch	2026-05	Tracks every disclosed agentic AI equity round from January 2024 to May 2026; full-year 2025 funding nearly doubled to $2.9B across 50 deals versus $1.5B in 2024, with January-May 2026 already at $1.1B - 2x the comparable 2025 period.