Research · Summary
Research sweep · deep · 1995–2026
Compounding Waves — How Each Tech Era Built the Substrate, and the Skills, for the Next
The compounding economic logic of three successive technology waves from January 1995 to May 2026: internet disintermediation of distribution, software-defined platforms and cloud infrastructure, and the current AI/agentic systems wave. Examines the technical, economic and human-skills dependencies that make each wave a precondition for the next, the new categories of work each wave created, and whether the relationship is best understood as cumulative compounding or as externalised costs harvested by later layers.
- financial
- academic
- blogs
- vc
Synthesised 2026-05-11
The Compounding Logic of Three Technology Waves, 1995–2026
Overview
Three technology waves have run in sequence since January 1995: internet disintermediation of distribution (1995–2005), software-defined platforms and cloud infrastructure (2005–2015), and the AI/agentic systems wave now consuming roughly half of all venture capital. Each wave produced artefacts that became load-bearing inputs for the next. Payment rails, web standards and the open-publishing norm of wave one made hyperscale cloud economically viable. CUDA, PyTorch, Kubernetes, distributed storage and the API economy of wave two made frontier model training and inference tractable. The current wave depends on both prior substrates simultaneously, and on a training corpus assembled over thirty years by people who had no idea they were building it.
The defining shift of the past eighteen months is that the externality is being priced. Common Crawl, Wikipedia, GitHub, Stack Overflow and Reddit were treated as free inputs through 2022. By late 2024, Longpre and colleagues found that robots.txt and Terms of Service restrictions on AI crawling had risen more than 500% in under a year, with 28%+ of the most critical C4 sources fully restricted. OpenAI signed a $60M licensing deal with Reddit in May 2024, Reddit then sued Anthropic and Perplexity in 2025, and the New York Times added Perplexity to its litigation roster in December 2025. The wave-one commons is closing, and wave three is the last generation that will train on it cheaply. Sources: arXiv / NeurIPS 2024 Datasets and Benchmarks Track (2024) (↗); SiliconANGLE (2024) (↗); CNBC (2025) (↗); MediaNama (2025) (↗)
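The restrictions Longpre and colleagues measured are expressed mechanically, in robots.txt directives that single out AI crawlers while leaving general crawling open. A minimal sketch using Python's standard-library parser shows the pattern; the GPTBot token is OpenAI's documented crawler name, and the robots.txt body here is an illustrative example, not taken from any specific site.

```python
import urllib.robotparser

# Illustrative robots.txt of the kind the Longpre audit counts:
# an AI crawler is blocked wholesale while other agents stay allowed.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("GPTBot", "https://example.com/article"))       # -> False
print(rp.can_fetch("Mozilla/5.0", "https://example.com/article"))  # -> True
```

The same two-rule structure, replicated across thousands of domains, is what moved the most critical C4 sources past the 28% restriction threshold.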
The scale of capital deployment is historic. Hyperscaler AI capex tracks toward $650–700B in 2026, roughly double 2025, against a Bain estimate that the world needs $2T in new annual AI revenue by 2030 to fund the compute build. Sequoia's David Cahn raised his revenue-gap question from $200B in September 2023 to $600B in June 2024 without finding the answer. Alphabet's 2026 capex now outpaces its trailing twelve-month free cash flow, and Man Group documents how hyperscalers began moving liabilities into SPVs, private credit and data-centre REITs through 2025. Sources: CoStar (citing Bloomberg and company filings) (2026) (↗); Bain & Company (2025) (↗); Sequoia Capital (2023) (↗); Sequoia Capital (2024) (↗); Man Group (2025) (↗)
The productivity evidence has not caught up. McKinsey's 2025 State of AI finds 88% of organisations using AI in at least one function but only 6% qualifying as high performers with measurable EBIT impact. An NBER survey of 6,000 executives in early 2026 reports average AI usage of 1.5 hours per week. Acemoglu's TFP estimate of 0.5–0.6% over a decade sits an order of magnitude below Goldman Sachs's 7% global GDP figure. The wave is real, the spending is real, the productivity is not yet legible in the macro data. Sources: McKinsey Global Institute (2025) (↗); Fortune (2026) (↗); Fortune (2024) (↗); Goldman Sachs (2023) (↗)
Key Findings
The dependency chain is concrete, not metaphorical. Ben Thompson's clearest articulation traces it through a single corporate graph: Amazon funds Anthropic because AWS generated both the capital and the inference infrastructure, and AWS exists because the wave-one internet created the demand for elastic compute and the payment rails to monetise it. The same chain runs through any frontier coding assistant: HTTP and Git from wave one, GitHub and AWS from wave two, CUDA-trained transformers in wave three, all consuming Stack Overflow answers written by humans between 2008 and 2022. Sources: Stratechery (Ben Thompson) (2026) (↗); Stratechery (Ben Thompson) (2023) (↗); Stratechery (Ben Thompson) (2015) (↗)
Scaling laws made the wave-one open web load-bearing, not optional. Kaplan et al. (2020) and the Chinchilla paper (2022) established that data must grow proportionally with model size for compute-optimal training. This converted Common Crawl from a research convenience into the binding input constraint. A FAccT 2024 critical analysis of Common Crawl confirms it remains the single largest source for generative AI training data. Sources: arXiv (OpenAI) (2020) (↗); arXiv (DeepMind) (2022) (↗); ACM FAccT 2024 (2024) (↗)
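The Chinchilla result reduces to a rule of thumb that makes the data constraint concrete: training compute is roughly C ≈ 6ND FLOPs for N parameters and D tokens, and the compute-optimal point sits near D ≈ 20N. A short sketch under those approximations (the constants are the commonly cited rules of thumb, not exact values from the paper):

```python
def chinchilla_optimal(compute_flops: float) -> tuple[float, float]:
    """Approximate compute-optimal model size and token count.

    Assumes C ~ 6*N*D FLOPs and D ~ 20*N at the optimum,
    so N = sqrt(C / 120) and D = 20 * N.
    """
    n_params = (compute_flops / 120) ** 0.5
    n_tokens = 20 * n_params
    return n_params, n_tokens

# Chinchilla's own budget (~5.76e23 FLOPs) recovers roughly
# 70B parameters and 1.4T tokens, the paper's actual configuration.
n, d = chinchilla_optimal(5.76e23)
```

Because D grows linearly with N, every order-of-magnitude increase in model scale demands an order of magnitude more tokens, which is what converted the open web from a convenience into the binding input.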
The data commons is being enclosed at a measurable rate. Longpre's audit of 14,000 domains found AI-crawling restrictions on the most critical C4 sources jumped above 28% within ten months. Wikimedia reported in 2025 that 65% of its most expensive bandwidth now serves AI crawlers. A subsequent arXiv study (October 2025) shows the political composition of training data shifting as moderate news sites withdraw first, leaving hyperpartisan material disproportionately accessible. Sources: arXiv / NeurIPS 2024 Datasets and Benchmarks Track (2024) (↗); arXiv (2025) (↗); arXiv (2025) (↗); ACM Internet Measurement Conference 2025 (2025) (↗)
Value capture has concentrated in the substrate layer for three consecutive waves. In wave one, the constraint was distribution and Amazon and Google captured it. In wave two, the constraint was compute abstraction and AWS, Azure and GCP captured it. In wave three, the constraint is hyperscale compute and proprietary training data, and the same hyperscalers are capturing it again. The 'Hype, Sustainability' arXiv paper quantifies the loop: three major cloud providers contributed two-thirds of the $27B raised by AI startups in 2023, and up to 80–90% of early-stage AI startup capital flows back to those cloud providers as compute spend. Sources: arXiv (2024) (↗); Internet Policy Review (2024) (↗); Big Data & Society (2024) (↗)
Labour effects are real but bounded, and concentrated at career entry. Brookings' March 2026 synthesis finds no aggregate displacement through 2024–2025 but a 16% employment decline for workers aged 22–25 in AI-exposed occupations. The Harvard Business School displacement-complementarity study finds a 24% reduction in automatable skills per firm per quarter post-ChatGPT against a 15% increase in augmentation-exposed roles. The 'Crashing Waves vs Rising Tides' preprint shows capability gains arriving broadly across task durations rather than as abrupt occupational displacement. Sources: Brookings Institution (2026) (↗); Harvard Business School Working Paper (2024) (↗); arXiv (2026) (↗)
Each wave invented its own intermediary skill class, and each diffused into general practice within a decade. Webmaster and e-commerce merchandiser in wave one; cloud engineer, SRE and data scientist in wave two; ML engineer, prompt engineer, AI evaluation engineer and what McKinsey calls 'business–AI translators' in wave three. Gartner projects 80% of engineering workers will require upskilling by 2027. The transition cost falls on workers; the productivity returns accrue to capital. Sources: McKinsey Global Institute (2025) (↗); McKinsey Global Institute (2025) (↗); Great Leadership (Substack) (2026) (↗)
The productivity paradox has a J-Curve precedent but no firm timeline. Brynjolfsson's framework predicts large invisible complementary investments before productivity appears in aggregate data, validated by the twenty-year lag after Edison's Pearl Street Station and the decade-long lag after PC adoption. The current absence of macro signal is consistent with both 'eventually large' and 'never appeared' readings. METR's July 2025 finding that AI tools made experienced open-source developers 19% slower on familiar repositories complicates any clean diffusion story. Sources: American Enterprise Institute (2025) (↗); arXiv / METR (2025) (↗); International Center for Law and Economics (2026) (↗)
Not every contemporaneous wave became substrate. Mobile, IoT, VR and blockchain ran in parallel to cloud and AI but did not compound into the next layer. The distinguishing feature is whether the technology lowered the fixed cost of the next wave's general-purpose operations or served a narrower use case. Cloud lowered the cost of running anything; blockchain lowered the cost of a specific class of trust operations. Substrate-becoming requires general-purpose cost reduction, not application novelty. Sources: Stratechery (Ben Thompson) (2015) (↗); Stratechery (Ben Thompson) (2020) (↗)
The next binding constraint is energy and physical infrastructure, not algorithms. CB Insights Q3 2025 data records nuclear energy investment tracking toward $5B annually, driven entirely by hyperscaler power demand. Bain projects 200 gigawatts of compute demand by 2030 and an $800B capital shortfall even if enterprises redirect all on-premise IT and AI savings into cloud. Sources: CB Insights (2025) (↗); Bain & Company (2025) (↗); Bain & Company (2025) (↗)
Evidence & Data
The capex numbers are the cleanest indicator of capital concentration. CoStar puts 2026 hyperscaler AI capex at $680B. Bloomberg Intelligence projects the AI accelerator market alone exceeds $600B by 2033. CB Insights records AI at 48% of total venture funding in 2025, $226B of $469B, with the six largest rounds (OpenAI $41B, Anthropic $32.5B, Scale $14.8B, xAI $12.8B, Databricks $5B and Aligned $5B) accounting for 49% of all AI capital. Sources: CoStar (citing Bloomberg and company filings) (2026) (↗); Bloomberg Intelligence (2026) (↗); CB Insights (2026) (↗); CB Insights (2026) (↗)
The capability numbers from METR's benchmarking programme give the velocity dimension. AI task-completion time horizons have doubled every seven months since 2019. As of early 2025, Claude 3.7 Sonnet achieves 50% success on tasks that take human experts roughly 50 minutes. RE-Bench and HCAST extend this measurement framework into agentic R&D capabilities. Sources: arXiv / METR (2025) (↗); arXiv / METR (2025) (↗); arXiv / METR (2024) (↗)
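A seven-month doubling time is a simple exponential, so the time-horizon metric can be extrapolated directly. A sketch of that arithmetic, purely illustrative and valid only if the observed trend continues unchanged:

```python
def projected_horizon_minutes(base_minutes: float, months_ahead: float,
                              doubling_months: float = 7.0) -> float:
    """Extrapolate METR's 50%-success task horizon assuming the
    observed constant doubling time (~7 months since 2019) holds."""
    return base_minutes * 2 ** (months_ahead / doubling_months)

# From the ~50-minute horizon of early 2025, two more years of the
# same trend would imply roughly 540 minutes, about nine hours.
horizon = projected_horizon_minutes(50, 24)
```

The exercise shows why the doubling time, not the current horizon, is the consequential number: the gap between hour-scale and day-scale autonomy is only a few doublings wide.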
The cost-curve numbers from Willison's annual reviews complement the capability data. Inference costs for GPT-4-class capability fell roughly 100x between 2022 and 2025, with 18 separate organisations reaching that capability tier. This is the same pattern Andreessen described for application hosting between 2000 and 2011, from $150,000 to $1,500 per month, that made wave-two SaaS economics work. Sources: Simon Willison's Newsletter (Substack) (2026) (↗); Simon Willison's Weblog (2024) (↗); Andreessen Horowitz (a16z) (2011) (↗)
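A 100x fall over roughly three years implies a steep compound annual decline. A quick sketch of the implied rate, illustrative arithmetic on the cited figures rather than a number from Willison's reviews:

```python
def compound_annual_decline(total_factor: float, years: float) -> float:
    """Constant annual price-decline rate implied by a total fall
    of `total_factor`x over `years` years."""
    return 1 - (1 / total_factor) ** (1 / years)

# A ~100x fall in GPT-4-class inference cost over 2022-2025 implies
# prices dropping roughly 78% per year, compounded.
rate = compound_annual_decline(100, 3)
```

For comparison, Andreessen's hosting-cost figures ($150,000 to $1,500 per month over eleven years) work out to a gentler but sustained ~34% annual decline over a much longer window.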
The productivity gap is the most contested set of numbers. Goldman Sachs estimates 7% global GDP uplift and 1.5 percentage points of annual US labour productivity. Acemoglu, in the IMF and ICLE syntheses, estimates 0.5–0.6% TFP over a decade. McKinsey finds only 6% of AI-using organisations show measurable EBIT impact. The NBER 6,000-executive survey reports 1.5 hours of weekly AI usage on average. Sources: Goldman Sachs (2023) (↗); Fortune (2024) (↗); McKinsey Global Institute (2025) (↗); Fortune (2026) (↗); IMF Finance & Development (2026) (↗)
The data-enclosure numbers (Longpre's 500%+ rise in robots.txt restrictions, 28%+ C4 source restriction, Wikimedia's 65% AI-crawler share of expensive bandwidth) are the empirical backbone of the externalised-harvest argument. They show the externality being priced in real time. Sources: arXiv / NeurIPS 2024 Datasets and Benchmarks Track (2024) (↗); arXiv (2025) (↗)
Signals & Tensions
Compounding dividend versus externalised harvest. Both frames have strong empirical support and the evidence does not cleanly distinguish them. Stratechery and Willison weight the dividend frame, partly because they benefit from the tools. The Augmented Mind Substack and the political-economy literature in Big Data & Society weight the harvest frame, citing the same hyperscaler concentration patterns. The honest reading is that wave three is genuinely compounding on prior infrastructure and extracting uncompensated value from prior content, simultaneously. Sources: Augmented Mind (Substack) (2025) (↗); Internet Policy Review (2024) (↗); Simon Willison's Newsletter (Substack) (2026) (↗)
Agentic AI supply versus demand. Gartner downgraded agentic AI from its top strategic trend in 2024 to a market-correction warning in October 2025, while simultaneously projecting agentic AI will intermediate $15T in B2B purchases by 2028 and drive $450B+ in enterprise application revenue by 2035. The tension is unresolved in Gartner's own publications. Sources: Gartner (2024) (↗); Gartner (2025) (↗); Gartner / Digital Commerce 360 (2025) (↗); Gartner (2025) (↗)
Benchmarks versus field productivity. METR's time-horizon doubling shows accelerating capability. METR's same programme found AI tools made experienced developers 19% slower on familiar codebases. The gap between benchmark performance and observed enterprise productivity is the central unresolved measurement question. Sources: arXiv / METR (2025) (↗); arXiv / METR (2025) (↗)
Synthetic data as substitute is empirically uncertain. The 2025 arXiv paper on scaling laws of synthetic data establishes early bounds, but no current frontier model relies primarily on synthetic corpora. LessWrong contributors converge on a 2026–2032 window for general-purpose internet training data exhaustion. Whether synthetic, licensed and RLHF corpora can substitute at frontier scale is the live empirical question. Sources: arXiv (2025) (↗); LessWrong (2025) (↗); LessWrong (2025) (↗)
The two-tier licensing system. OpenAI and Google sign $60M+ deals with Reddit; Perplexity and Anthropic face litigation for accessing the same public corpus. The Troutman Pepper Locke analysis of the Reddit suit frames this as a structural divide between well-capitalised incumbents and smaller entrants. Whether this concentrates frontier capability or merely raises everyone's cost is unresolved. Sources: SiliconANGLE (2024) (↗); Law360 / Troutman Pepper Locke (2025) (↗); CNBC (2025) (↗)
Bank-sector labour signals diverge. Bloomberg Intelligence projects 200,000 investment banking jobs lost over three to five years. Goldman's own CIO simultaneously argues for workforce amplification, and Goldman deployed an internal AI assistant in January 2025. The financial sector is running both experiments in parallel and the empirical signal is not yet clean. Sources: CNBC (2025) (↗); Brookings Institution (2026) (↗)
Open Questions
Does the productivity J-Curve trough have a defined floor? Brynjolfsson's framework predicts eventual emergence in aggregate data but offers no committed timeline. Acemoglu's 0.5% TFP estimate implies the curve may be flatter than in waves one and two. Neither side has a falsifiable test. Sources: American Enterprise Institute (2025) (↗); Causal Inference (Substack) (2025) (↗)
Who funds the next data substrate? Wikimedia, Stack Exchange and similar public commons are now bearing infrastructure costs from AI crawlers without a revenue model. The Generative AI and the Digital Commons paper raises the question without answering it. Sources: arXiv (2025) (↗); ACM Internet Measurement Conference 2025 (2025) (↗)
Can synthetic and licensed data sustain frontier scaling? The empirical record is too short to know whether models trained primarily on synthetic corpora reach the same capability frontier as those trained on open-web data. Sources: arXiv (2025) (↗)
Will the new role categories absorb displacement at wave-one and wave-two rates? Prompt engineer, AI evaluation engineer and model auditor are real categories but their absolute employment numbers remain small compared to webmaster or cloud engineer at equivalent wave stages. The Brookings synthesis is explicit that the labour-economics literature is in its first inning. Sources: Brookings Institution (2026) (↗); arXiv (2024) (↗)
Is the off-balance-sheet financing structure stable? Man Group's documentation of SPVs, private credit and data-centre REITs absorbing hyperscaler risk implies institutions that did not price GPU cycles are now carrying them. A correction event would test whether the structure holds. Sources: Man Group (2025) (↗); Defiance ETFs (Substack) (2025) (↗)
Does Gans–Goldfarb O-ring automation generalise? The model implies automating easy tasks concentrates human effort on bottleneck tasks rather than displacing labour. If it generalises, the simple displacement narratives in Frey–Osborne and downstream work are structurally wrong. Sources: Tom Bewick (Substack) (2026) (↗)
Sources
Financial Press
| ID | Title | Outlet | Date | Significance |
|---|---|---|---|---|
| f1 | AI Accelerator Market Looks Set to Exceed $600 Billion by 2033, Driven by Hyperscale Spending and ASIC Adoption | Bloomberg Intelligence | 2026-01 | Provides authoritative market sizing for the AI infrastructure layer, projecting $3.5 trillion in hyperscaler capital expenditure through 2030 and tracing the shift from GPU to custom ASIC architectures as the compounding substrate of wave three. |
| f2 | Global AI Data Center Dominance Shifts Away From Big Tech | Bloomberg | 2025-12 | Documents the structural financialisation of AI infrastructure — over $178 billion in US data-centre credit deals in 2025 alone — and the entry of inexperienced operators into the buildout, raising systemic risk questions central to the 'who pays for the next substrate' question. |
| f3 | Where enterprise data is headed in 2026 | Bloomberg Professional Services | 2025-12 | Based on Bloomberg's own Enterprise Tech and Data Summit, this piece marks the transition from AI experimentation to enterprise-wide adoption in financial institutions — a key indicator of where wave three value is beginning to accrue. |
| f4 | Gen AI: Too Much Spend, Too Little Benefit? | Goldman Sachs Global Investment Research | 2024-06 | The most-cited financial-press intervention on the AI productivity paradox, assembling both the bull case (6.1 percent GDP uplift) and the Acemoglu counter-case (0.9 percent), and naming the conditions under which the trillion-dollar infrastructure bet could fail to generate returns. |
| f5 | Generative AI Could Raise Global GDP by 7% | Goldman Sachs | 2023-04 | The foundational Goldman Sachs productivity forecast — 1.5 percentage-point annual productivity uplift, $7 trillion GDP gain — that anchored the financial-press bull case for wave three and remains the benchmark against which sceptical evidence is measured. |
| f6 | Generative AI: Hype, or Truly Transformative? | Goldman Sachs Global Investment Research | 2023-07 | Goldman's early framing of AI as 'Software 3.0', explicitly comparing the wave to the PC-internet productivity boom of 1996–2005 and warning that investor timetables for returns typically exceed reality — directly relevant to the compounding-wave thesis. |
| f7 | Hyperscalers' $680 Billion AI Capital Expenditure Investment Raises the Stakes | CoStar (citing Bloomberg and company filings) | 2026-02 | Detailed breakdown of each hyperscaler's capital programme and free-cash-flow gap for 2026, showing that capex is outpacing internal cash generation and forcing debt issuance — the clearest available data point on who is financing the current substrate layer. |
| f8 | The Magnificent Capex: AI Infrastructure Spending and Who Actually Benefits | Ferguson Wellman Capital Management | 2026-05 | Frames the AI capex cycle in three layers — builders, enablers, adopters — and highlights that Amazon, Alphabet, Microsoft and Meta will collectively spend $650–700 billion on capex in 2026 alone, nearly double 2025, with significant cost-inflation embedded in that figure. |
| f9 | 2025: The State of Generative AI in the Enterprise | Menlo Ventures | 2025-12 | Provides the most granular demand-side data available: enterprise AI spend grew from $1.7 billion to $37 billion since 2023, coding became the first genuine 'killer use case', and 50 percent of developers now use AI tools daily — directly mapping value-creation to the wave-three application layer. |
| f10 | AI Investment and Deal Trends: Global Report H1 2025 | Ropes & Gray | 2025-08 | Documents the M&A and investment landscape for the first half of 2025, including Microsoft, Alphabet, Amazon and Meta committing a combined $320 billion to AI infrastructure — and Morgan Stanley's framing of agentic AI as 'the next frontier' for enterprise value capture. |
| f11 | AI Can Lift Global Growth | IMF Finance & Development | 2026-03 | IMF analysis showing that hyperscaler valuations have been rewarded by equity markets 'unseen since the dot-com era', and that GDP data simultaneously overstates AI's immediate capital contribution while understating its productivity spillovers — directly relevant to the intangible-capital measurement problem. |
| f12 | Artificial Intelligence and Productivity in Europe | IMF Working Paper | 2025 | Cross-country econometric analysis comparing Acemoglu's conservative productivity estimates with McKinsey and Goldman Sachs projections, with specific attention to why higher-wage, financial-services-heavy economies are disproportionately exposed — a critical input for the wave-three value-capture debate. |
| f13 | The Next Phase of AI: Technology, Infrastructure, and Policy in 2025–2026 | American Action Forum | 2026-04 | Documents the regulatory and policy dimension of the AI infrastructure buildout, including the projection that agentic AI will represent 10–15 percent of IT spending by 2026 — connecting the technical dependency chain to emerging enterprise spending patterns. |
| f14 | Thousands of CEOs Admit AI Had No Impact on Employment or Productivity | Fortune | 2026-04 | Synthesises an NBER survey of 6,000 executives alongside Apollo chief economist Torsten Slok's 'AI is everywhere except in the macroeconomic data' observation, and cites a Financial Times analysis showing 374 S&P 500 companies claiming positive AI adoption without aggregate productivity evidence. |
| f15 | AI's Economic Potential: Goldman Sachs Responds to Daron Acemoglu | American Enterprise Institute (summarising Goldman Sachs research) | 2024-06 | The clearest head-to-head presentation of the Acemoglu (0.5 percent TFP, 1 percent GDP) versus Goldman Sachs (9 percent productivity, 6.1 percent GDP) forecasts, illuminating the empirical assumptions — task share, automation cost savings, labour reallocation — that separate the two positions. |
| f16 | Markets Have Overestimated AI-Driven Productivity Gains, Says MIT Economist | Fortune | 2024-08 | Daron Acemoglu writing directly in the financial press, arguing that Total Factor Productivity growth from the full AI suite is likely 0.5–0.6 percent over ten years — the anchor citation for sceptical financial commentary on whether wave three can justify its infrastructure spend. |
| f17 | Productivity Paradox | American Enterprise Institute | 2025-05 | Applies Brynjolfsson, Rock and Syverson's 'Productivity J-Curve' framework to current AI adoption data, including a Wall Street Journal story on IBM's Arvind Krishna — AI reduced headcount in HR but freed capital for new hires in engineering and sales, illustrating the intra-wave role-creation dynamic. |
| f18 | Research on AI and the Labor Market Is Still in the First Inning | Brookings Institution | 2026-03 | Comprehensive literature review from Brookings, synthesising Brynjolfsson, Chandar and Chen's ADP payroll finding (16 percent employment decline for workers aged 22–25 in AI-exposed occupations) against studies showing no aggregate displacement — directly relevant to the intra-wave skills and role-creation question. |
| f19 | AI, Productivity, and Labor Markets: A Review of the Empirical Evidence | International Center for Law and Economics | 2026-02 | Systematic review of the empirical literature through 2025, finding no economy-wide displacement but concentrated entry-level effects and task reallocation — the most current synthesis of what the labour-economics evidence actually shows at the start of wave three. |
| f20 | Goldman Sachs Rolls Out an AI Assistant for Its Employees | CNBC | 2025-01 | Reports Bloomberg Intelligence's estimate that global investment banks may shed 200,000 jobs in three to five years, alongside Goldman CIO Marco Argenti's 'amplified workforce' counter-framing — a direct illustration of the divergence between corporate AI rhetoric and the financial sector's own research on displacement. |
| f21 | OpenAI Agrees to Deal with Reddit to Scrape Its Content for AI Training | SiliconANGLE | 2024-05 | Documents the $60 million OpenAI-Reddit licensing deal — the first major commercial enclosure of a wave-one open-web data source — establishing the precedent that wave-three models must now pay for what they previously harvested for free. |
| f22 | Reddit Accuses Perplexity of Stealing User Posts, Expanding Data Rights Battle with AI Industry | CNBC | 2025-10 | Primary news report on Reddit's October 2025 lawsuit against Perplexity, documenting the 'arms race for quality human content' and the two-tier licensing system that now gives well-capitalised incumbents structural advantages in accessing wave-one training data. |
| f23 | New York Times Sues Perplexity AI for Copyright Infringement | MediaNama | 2025-12 | Reports the NYT's December 2025 suit against Perplexity, extending the copyright-and-training-data legal front opened by the original NYT v. OpenAI action and illustrating how wave-one content producers are actively closing the externality that wave-three depends on. |
| f24 | The Future of Gen AI Training Amid Reddit Data Scraping Suit | Law360 / Troutman Pepper Locke | 2025-12 | Legal analysis noting that the Wall Street Journal reported OpenAI's 2025 losses could reach $74 billion, setting up the tension between escalating content-licensing costs and AI labs' existing financial losses — a concrete framing of the 'externalised harvest' problem. |
| f25 | The AI Bubble: Hidden Risks and Opportunities | Man Group | 2025-11 | Institutional investor analysis arguing that GPU economic life of approximately one year creates a fundamental mismatch between short-duration assets and long-duration debt financing — and that risk is migrating from tech balance sheets into utilities, insurers, pension funds and private credit, raising systemic questions about who ultimately pays for the AI substrate. |
Academic & arXiv
| ID | Title | Outlet | Date | Significance |
|---|---|---|---|---|
| a1 | Scaling Laws for Neural Language Models | arXiv (OpenAI) | 2020-01 | Established power-law relationships between model performance and compute, data, and parameters, providing the empirical foundation for the hyperscale training regime that defines the AI wave's infrastructure dependency. |
| a2 | Training Compute-Optimal Large Language Models (Chinchilla) | arXiv (DeepMind) | 2022-03 | Revised compute-optimal training ratios, demonstrating that frontier models require data to scale proportionally with parameters, making data supply a binding constraint co-equal with compute. |
| a3 | Consent in Crisis: The Rapid Decline of the AI Data Commons | arXiv / NeurIPS 2024 Datasets and Benchmarks Track | 2024-07 | First large-scale longitudinal audit of 14,000 web domains showing that in a single year (2023–2024) robots.txt and Terms of Service restrictions rose 500%+, directly measuring the closure of the open-web externality that wave-three AI consumed for free. |
| a4 | RE-Bench: Evaluating Frontier AI R&D Capabilities of Language Model Agents Against Human Experts | arXiv / METR | 2024-11 | METR's benchmark comparing AI agents with 61 human experts on ML research engineering tasks, providing the primary empirical evidence on how far agentic systems have advanced toward automating the software-engineering labour that underpins the cloud and AI waves. |
| a5 | HCAST: Human-Calibrated Autonomy Software Tasks | arXiv / METR | 2025-03 | METR's 189-task benchmark for measuring autonomous AI capabilities across ML, cybersecurity, software engineering, and general reasoning, used as the primary instrument for tracking the 7-month doubling time of AI task-completion horizons. |
| a6 | Measuring AI Ability to Complete Long Tasks | arXiv / METR | 2025-03 | Introduces the 50%-task-completion time horizon metric, showing frontier AI models doubled their effective task length every seven months since 2019 and extrapolating this to agent-level software autonomy within a decade. |
| a7 | Crashing Waves vs. Rising Tides: Preliminary Findings on AI and Labor Markets | arXiv | 2026-04 | Empirically distinguishes whether AI capability gains arrive in abrupt bursts for specific tasks ('crashing waves') or as broad parallel shifts across task duration ('rising tides'), with direct implications for which occupational categories face displacement and when. |
| a8 | Augmenting or Automating Labor? The Effect of AI on Employment and Wages | arXiv | 2025-03 | Distinguishes automation AI from augmentation AI using US labour-market data (2015–2022), finding displacement effects outweigh productivity gains for low-skilled occupations and that automation exposure negatively affects new-work creation. |
| a9 | Complement or Substitute? How AI Increases the Demand for Human Skills | arXiv | 2024-12 | Uses 65,000+ job-posting websites (2018–2023) to show AI produces both substitution and complementarity effects on skill demand, with spillover effects reaching workers not directly interfacing with AI systems. |
| a10 | Artificial Intelligence, Automation and Work | NBER / SSRN (MIT Working Paper) | 2018-01 | Acemoglu and Restrepo's foundational task-based framework showing automation displaces labour from tasks machines can perform, establishing the theoretical scaffold for all subsequent empirical AI labour-market research. |
| a11 | A Task-Based Approach to Inequality | Oxford Open Economics | 2024 | Acemoglu and Restrepo's synthesis of task-displacement theory applied to AI, arguing that automation reduces labour share and may depress wages unless counterbalanced by creation of new labour-intensive tasks — the key 'this time is different' test. |
| a12 | Job Transformation, Specialization, and the Labor Market Effects of AI | Working paper (NBER-affiliated) | 2024 | Formal model projecting LLM-induced automation onto heterogeneous workers, finding wages drop up to 35% in the most exposed occupations while rising roughly 4% at moderate exposure, with AI raising returns to social and non-routine manual skills. |
| a13 | Automation and Augmentation: Artificial Intelligence, Robots, and Work | Annual Review of Sociology | 2024 | Comprehensive literature review confirming displacement effects from task automation persist while noting that automation efficacy does not increase monotonically, and that policy intervention is required to prevent widening inequality. |
| a14 | AI and the Future of Work: A Literature Review | arXiv | 2024-08 | Synthesises the labour-economics consensus, noting Acemoglu's estimate of only 0.71% TFP gain from AI over ten years contra Goldman Sachs' 7% GDP uplift, illustrating the wide empirical disagreement on net job creation vs displacement. |
| a15 | Platform Competition in Two-Sided Markets | Journal of the European Economic Association | 2003 | Rochet and Tirole's foundational two-sided-market model explaining how internet platforms court buyers and sellers simultaneously, providing the theoretical basis for understanding how wave-one disintermediation created structural preconditions for wave-two platform economics. |
| a16 | Platform Power in AI: The Evolution of Cloud Infrastructures in the Political Economy of Artificial Intelligence | Internet Policy Review | 2024 | Empirical analysis of AWS, Azure, and Google Cloud trajectories from 2017 to 2021, tracing how hyperscalers operationalise infrastructural power through the cloud-to-AI dependency chain. |
| a17 | Big AI: Cloud Infrastructure Dependence and the Industrialisation of Artificial Intelligence | Big Data & Society | 2024 | Documents how Amazon, Microsoft, and Google use cloud credits, APIs, and technical support to enrol AI startups into their infrastructure stacks, making hyperscaler lock-in the primary mechanism of value capture from the AI wave. |
| a18 | Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI | arXiv | 2024-09 | Documents the circular capital structure where three hyperscalers contributed two-thirds of the $27 billion raised by AI startups in 2023, and applies Jevons' Paradox to explain why efficiency gains from scaling increase rather than reduce overall resource consumption. |
| a19 | Scaling Laws of Synthetic Data for Language Models | arXiv | 2025-03 | Examines whether synthetic data can substitute for organic web-scraped corpora as natural language supply approaches saturation, testing whether scaling laws hold when training tokens are model-generated rather than human-produced. |
| a20 | Web Crawler Restrictions, AI Training Datasets and Political Biases | arXiv | 2025-10 | Shows that increased robots.txt restrictions by moderate news sources push AI training corpora toward hyperpartisan content, linking data-access restrictions to downstream bias in the political composition of training sets. |
| a21 | Somesite I Used to Crawl: Awareness, Agency and Efficacy in Protecting Content Creators from AI Crawlers | ACM Internet Measurement Conference 2025 | 2025 | Active and passive measurement study of AI crawler behaviour across popular sites, finding that 50–70% of website traffic is now automated and that crawlers do not reliably respect robots.txt directives. |
| a22 | Generative AI and the Future of the Digital Commons | arXiv | 2025-08 | Frames the foreclosure of open web data through an Ostrom commons lens, documenting that 65% of Wikimedia's most expensive traffic now originates from AI crawlers and analysing governance frameworks for distinguishing search, archival, and training crawlers. |
| a23 | A Critical Analysis of the Largest Source for Generative AI Training Data: Common Crawl | ACM FAccT 2024 | 2024-06 | Provides a structural critique of Common Crawl as AI training infrastructure, examining governance, funding ties to AI labs, and the copyright and accountability gaps embedded in its data pipeline. |
| a24 | Displacement or Complementarity? The Labor Market Effects of Generative AI | Harvard Business School Working Paper | 2024 | Finds a 24% decrease in generative AI-exposed skills per firm per quarter in highly automatable jobs post-ChatGPT, against a 15% increase in augmentation-exposed jobs, providing the most granular job-posting evidence on the dual displacement-complementarity dynamic. |
| a25 | Artificial Intelligence, Domain AI Readiness, and Firm Productivity | arXiv | 2025-08 | Examines why many firms fail to realise AI productivity returns despite heavy investment, finding that domain AI readiness — quality of external academic and data infrastructure — is a stronger predictor than internal technical capability alone. |
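The crawler-restriction papers above (a20–a22) all hinge on one mechanism: a robots.txt file that singles out AI training crawlers while leaving other agents alone. A minimal sketch of what compliant behaviour looks like, using Python's standard-library `urllib.robotparser` and a hypothetical robots.txt of the kind these papers document (a21's finding is precisely that real crawlers often ignore such directives):

```python
# Sketch of the directive-checking mechanism measured in a20/a21: a
# compliant crawler consults robots.txt before fetching. Directives are
# parsed from a local string here; real crawlers fetch the file over HTTP.
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt of the post-2023 pattern: AI training crawlers
# blocked site-wide, everything else still allowed.
robots_txt = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# An AI-training crawler is blocked site-wide...
print(parser.can_fetch("GPTBot", "https://example.com/article"))      # False
# ...while a generic browser-style agent is still permitted.
print(parser.can_fetch("Mozilla/5.0", "https://example.com/article")) # True
```

The asymmetry this encodes (search and browsing allowed, training disallowed) is what a20 links to downstream corpus bias: compliant training crawlers lose exactly the sources that opt out.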
Blogs & Independent Thinkers
| ID | Title | Outlet | Date | Significance |
|---|---|---|---|---|
| b1 | Aggregation Theory | Stratechery (Ben Thompson) | 2015-07 | Foundational framing of how internet-era zero marginal distribution costs restructured industry power from supply control to demand aggregation, directly explaining the economic logic of wave one's disintermediation. |
| b2 | Enterprise Philosophy and the First Wave of AI | Stratechery (Ben Thompson) | 2024-08 | Traces technology waves from mainframe digitisation through SaaS to AI, examining how enterprise adoption patterns and job displacement repeat structurally across each transition, with Salesforce as the hinge between waves one and two. |
| b3 | Agents Over Bubbles | Stratechery (Ben Thompson) | 2026-03 | Argues that agentic AI is not a bubble because each agent multiplies compute demand rather than substituting for it, and that the economic imperative to deploy agents will drive both workforce restructuring and hyperscaler capex compounding. |
| b4 | AI Integration and Modularization | Stratechery (Ben Thompson) | 2024-06 | Applies Christensen and Coase integration logic to the AI stack, showing why value accumulates at integrated layers rather than modular ones and explaining why Google's chip-to-model vertical stack differs structurally from AWS's marketplace approach. |
| b5 | AI and the Big Five | Stratechery (Ben Thompson) | 2023-01 | Maps which incumbent hyperscalers are positioned to capture AI value and which are threatened, drawing the explicit parallel between cloud infrastructure incumbency and AI layer incumbency. |
| b6 | AI Promise and Chip Precariousness | Stratechery (Ben Thompson) | 2025-04 | Traces the semiconductor dependency chain from Silicon Valley's founding through TSMC to current AI compute, arguing that wave three's binding constraint is geopolitical control of chip fabrication rather than software. |
| b7 | The End of Aggregation Theory and AI Economics (homepage synthesis) | Stratechery (Ben Thompson) | 2026-03 | Synthesises the claim that AI reintroduces marginal costs and ends the zero-marginal-cost era that Aggregation Theory described, marking the structural boundary between wave two and wave three economics. |
| b8 | Stuff we figured out about AI in 2023 | Simon Willison's Weblog | 2023-12 | First-hand practitioner account of the LLM breakthrough year, documenting the open-web training corpus as the substrate of wave-three capability and flagging the epistemic uncertainty around what models actually learn. |
| b9 | Things we learned about LLMs in 2024 | Simon Willison's Weblog | 2024-12 | Annual review documenting the 100x inference price drop, the proliferation of GPT-4-class models to 18 labs, and the transition toward agentic patterns, providing empirical evidence for the compounding cost-reduction dynamic of wave three. |
| b10 | 2025: The year in LLMs | Simon Willison's Newsletter (Substack) | 2026-01 | Introduces the Jevons paradox framing for AI knowledge work — cheaper cognition generates more demand for cognitive tasks rather than less — directly engaging the compounding-versus-displacement debate. |
| b11 | What if LLMs are mostly crystallized intelligence? | LessWrong | 2025 | Argues that frontier model capability growth is bottlenecked by domain-specific data quality and volume, with general-purpose internet data estimated to run dry by 2026–2032, making wave-one open-web content a finite and depletable input. |
| b12 | The next wave of model improvements will be due to data quality | LessWrong | 2025-06 | Identifies real-world deployment feedback (from Operator and Codex usage signals) as the next load-bearing training data source, framing the shift from static open-web corpora to dynamic synthetic-and-interaction data as a structural transition. |
| b13 | The New AI Infrastructure Stack | Medium (Devansh / Machine Learning Made Simple) | 2025-06 | Characterises the ASIC–CXL–Optical I/O triad as an interlocking dependency chain where adopting one layer forces adoption of the next, providing a concrete hardware-level illustration of wave-internal compounding. |
| b14 | A Simple Explainer of Acemoglu's Simple Macroeconomics of AI | Causal Inference (Substack) | 2025-04 | Unpacks Acemoglu's NBER 2024 model projecting TFP gains of 0.55–0.71% over ten years from AI under baseline assumptions, grounding the productivity-paradox debate in a tractable framework and cross-referencing Autor's task model. |
| b15 | The Future of Employment in the Age of Artificial Intelligence | Substack (José Luis Chávez Calva) | 2025-04 | Synthesises Acemoglu–Restrepo displacement, productivity, and new-task-creation effects against empirical vacancy data showing kinks in software employment post-2022, testing whether wave three is following wave-two job-creation patterns. |
| b16 | The Transition Is The Crisis: A DEEP Dive on AI, Jobs and The Future Of Work Over the Next 5 Years | Great Leadership (Substack) | 2026-02 | Applies the Acemoglu–Restrepo task-based model to current AI layoff data, documenting that routine cognitive task displacement is already net negative while creative and strategic roles show net job growth, testing the 'this time is different' thesis directly. |
| b17 | Daron Acemoglu on AI and Jobs | Center for Humane Technology (Substack) | 2024-05 | Acemoglu argues that automation since the 1980s has created two structural inequality tiers — capital vs labour, and task-commanding vs task-displaced workers — framing the AI wave as an amplifier of an already-running dynamic. |
| b18 | Acemoglu and Johnson on the Past and Future of Work | The One Percent Rule (Substack) | 2024-12 | Reviews 'Power and Progress' and its core argument that technological benefit distribution depends on institutional power structures, not on the technology itself, providing the historical context for the externalised-harvest framing. |
| b19 | Beyond AI Apocalypse as '47% of Jobs at Risk' | Tom Bewick (Substack) | 2026-02 | Critiques Frey–Osborne and engages Gans–Goldfarb 2026 O-ring automation model, arguing that automating easy tasks concentrates human effort on harder bottleneck tasks, shifting the skill premium rather than eliminating it. |
| b20 | The AI Disintermediation Panic is Unfounded | Playing FTSE (Substack) | 2026-01 | Argues the market is mispricing incumbents by conflating 'AI creates new competition' with 'AI destroys incumbent value overnight,' drawing a direct parallel to earlier waves where incumbents adapted rather than collapsed. |
| b21 | The Death of AI Extraction: Architecting Your Sovereign Exit | Augmented Mind (Substack) | 2025-05 | Frames the AI platform stack as a repeat of the wave-two platform dependency playbook — cheap access, capture dependency, raise prices — arguing that AI labs are converting wave-one open-web content into a proprietary subscription layer. |
| b22 | The 'vast uncertainty' of AI and jobs — David Autor | The Next Wave Futures (WordPress / Andrew Curry) | 2024-02 | Synthesises Autor's 2024 NBER paper showing 60% of US employment is in categories invented post-1940, and documents his explicit uncertainty about whether wave three will follow the same new-task-creation pattern. |
| b23 | AI 2027: What Superintelligence Looks Like | LessWrong | 2025-04 | Detailed scenario analysis of synthetic training data loops and agent-generated research, examining whether wave three can bootstrap its own training data supply and break free of wave-one corpus dependency. |
| b24 | Is the Internet Different? (critique of Aggregation Theory) | Stratechery (Ben Thompson) | 2020-10 | Documents the academic and legal pushback on Aggregation Theory, including Tim Wu's critique, providing a rigorous counterpoint that grounds the internet-disintermediation claim in contested rather than settled economics. |
| b25 | Inside the AI Buildout Wave: How Infrastructure Is Becoming the New Battleground | Defiance ETFs (Substack) | 2025-11 | Documents the Magnificent Seven's $21.1 trillion share of a $60 trillion S&P 500 market cap as evidence of how hyperscaler concentration compounds across technology waves, and identifies power, chip fabrication, and cooling as the next binding physical constraints. |
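Several entries above (b9's 100x inference price drop, b10's Jevons framing, and a18 in the academic list) lean on the same claim: cheaper cognition raises rather than lowers total spend. The underlying arithmetic is a constant-elasticity demand curve; the elasticity value and scale factor below are illustrative assumptions, not figures from any of the sources:

```python
# Illustrative arithmetic for the Jevons-paradox framing in b10/a18:
# under constant-elasticity demand, quantity = k * price**(-elasticity),
# so total spend = price * quantity = k * price**(1 - elasticity).
# If elasticity > 1, a price collapse *raises* total spend.

def total_spend(price: float, elasticity: float, k: float = 1.0) -> float:
    """Total spend at a given price under constant-elasticity demand."""
    return k * price ** (1.0 - elasticity)

base = total_spend(price=1.00, elasticity=1.5)
after_drop = total_spend(price=0.01, elasticity=1.5)  # the ~100x drop in b9

# With an assumed elasticity of 1.5, a 100x price cut implies roughly
# 10x more total spend on inference, not 100x less.
print(after_drop / base)  # ≈ 10
```

Whether inference demand is actually that elastic is exactly the open question separating the compounding thesis from the correction scenario in v19.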
VC & Analyst Reports
| ID | Title | Outlet | Date | Significance |
|---|---|---|---|---|
| v1 | Why Software Is Eating the World | Andreessen Horowitz (a16z) | 2011-08 | The foundational VC thesis articulating the internet-to-software wave, tracing how falling infrastructure costs (from $150,000/month in 2000 to $1,500/month by 2011) made the cloud era possible, and explicitly predicting that software would disintermediate established industries — a direct precursor to the 'AI is eating software' framing that followed. |
| v2 | AI's $200B Question (Follow the GPUs) | Sequoia Capital | 2023-09 | David Cahn's first quantitative framing of the AI infrastructure revenue gap — requiring $200B in annual end-user revenue to justify then-current GPU capex — introduced the 'follow the GPUs' investment thesis and raised the first structural question about whether the AI wave would compound value or incinerate capital. |
| v3 | AI's $600B Question | Sequoia Capital | 2024-06 | Cahn's updated analysis tripled the required revenue figure to $600B annually as Nvidia's run-rate revenue surged, exposing a structural $500B gap between AI infrastructure investment and demonstrated end-user returns — the most-cited quantitative challenge to the compounding-value thesis. |
| v4 | AI in 2024: From Big Bang to Primordial Soup | Sequoia Capital | 2024-01 | Sequoia's annual AI outlook named the post-ChatGPT frenzy a 'primordial soup' phase, arguing AI requires deeper exploration than the substitution pattern of the SaaS wave and explicitly contrasting AI as a 'revolution' against cloud SaaS as an 'evolution' from on-premise software. |
| v5 | AI in 2025: Building Blocks Firmly in Place | Sequoia Capital | 2024-12 | Sequoia's 2025 outlook maps the consolidation of frontier model competition to five 'finalists' (Microsoft/OpenAI, Amazon/Anthropic, Google, Meta, xAI), frames compute scaling as the next binding constraint, and places the AI wave in relation to prior infrastructure build-outs. |
| v6 | Marc Andreessen made a dire software prediction 15 years ago. Now it's happening in a way nobody imagined | Fortune / Morgan Stanley | 2026-02 | A retrospective audit of Andreessen's 2011 thesis confirming that software did eat retail, media and telecoms as predicted, but that AI is now eating the software layer itself — with Morgan Stanley quantifying unstructured data (over 80% of enterprise information) as the new automation frontier displacing SaaS headcount. |
| v7 | Bain Technology Report 2025: $2 Trillion in New Revenue Needed to Fund AI's Scaling Trend | Bain & Company | 2025-09 | Bain's sixth annual technology report quantifies the AI funding gap as $2T in required annual revenue by 2030 against a global AI compute demand reaching 200 gigawatts, and introduces a four-level agentic maturity framework (information retrieval through multi-agent constellations) as the structural map for the current wave. |
| v8 | AI's Trillion-Dollar Opportunity (Bain Technology Report 2024) | Bain & Company | 2024-01 | Bain's 2024 report places hyperscalers as the dominant first movers in the AI wave, projecting data centre scale growing from 100 megawatts to gigawatts, and frames the cloud infrastructure wave as the direct load-bearing prerequisite for frontier model deployment. |
| v9 | State of the Art of Agentic AI Transformation (Bain Technology Report 2025) | Bain & Company | 2025-09 | Bain's chapter-level analysis of agentic AI maturity identifies compounding errors in multi-step tasks, lack of communication standards, and data silos as the current binding constraints on wave three — directly relevant to the question of what stops compounding from continuing. |
| v10 | McKinsey: Where AI Will Create Value — and Where It Won't | McKinsey Global Institute | 2025-04 | McKinsey's three-wave model (productivity gains, differentiation, transaction-cost reduction) maps AI's compounding economic logic and argues small early advantages in data quality and relevance will compound into 'disproportionate demand' concentration — directly engaging the 'compounding vs harvest' debate. |
| v11 | The Economic Potential of Generative AI: The Next Productivity Frontier | McKinsey Global Institute | 2023-06 | McKinsey's headline market-sizing report places generative AI at $2.6–$4.4 trillion in annual value across 63 use cases, with customer operations, marketing and sales, software engineering, and R&D as the leading categories — providing the quantitative floor for wave-three economic claims. |
| v12 | The State of AI in 2025: Agents, Innovation, and Transformation | McKinsey Global Institute | 2025-11 | McKinsey's annual survey finds 88% of organisations using AI in at least one function (up from 78% in 2024) but only 6% qualifying as high performers with over 5% EBIT impact, while identifying demand for data engineers, ML engineers, prompt engineers and AI ethics specialists as the emerging role categories of wave three. |
| v13 | CB Insights State of AI 2025 Report | CB Insights | 2026-02 | Full-year 2025 AI venture data: over $200B in AI funding with LLM developers (OpenAI, Anthropic, xAI) capturing 41% of investment, AI M&A running at 1.5x 2024 levels, and the frontier model race consolidating into clear 'haves' and 'have-nots' — the clearest quantitative map of wave-three capital concentration. |
| v14 | CB Insights State of Venture 2024 Report | CB Insights | 2025-01 | Documents the milestone year when AI represented 37% of venture funding and 17% of deals — both all-time highs — with all top five venture deals going to AI infrastructure players, marking the shift from software-wave investment patterns to AI-wave capital concentration. |
| v15 | CB Insights State of Venture 2025 Report | CB Insights | 2026-02 | CB Insights' full-year 2025 report records total venture funding at $469B (the highest since 2022), AI accounting for 48% of all funding, and the top six rounds (totalling $111B) all going to AI companies — quantifying the winner-take-most dynamic the compounding thesis predicts. |
| v16 | CB Insights State of Venture Q3'25 Report | CB Insights | 2025-12 | Captures AI exceeding 50% of total venture funding for the first time and notes energy (nuclear, fusion) attracting record investment as hyperscaler power demand becomes the binding constraint — directly evidencing the 'next substrate' question. |
| v17 | Gartner Predicts 40% of Enterprise Apps Will Feature Task-Specific AI Agents by 2026 | Gartner | 2025-08 | Gartner's adoption-curve forecast for agentic AI — from less than 5% of enterprise apps in 2025 to 40% by 2026 and potentially 30% of enterprise application software revenue ($450B+) by 2035 — provides the clearest quantitative adoption-curve framing for wave-three diffusion. |
| v18 | Gartner: AI Agents Will Intermediate More Than $15 Trillion in B2B Purchases by 2028 | Gartner / Digital Commerce 360 | 2025-11 | Gartner's most expansive market-sizing claim — 90% of B2B purchases intermediated by AI agents by 2028, channelling over $15 trillion — frames the disintermediation of wave-one distribution patterns through wave-three agentic systems, closing the loop on the three-wave dependency chain. |
| v19 | Gartner Says Agentic AI Supply Exceeds Demand, Market Correction Looms | Gartner | 2025-10 | Gartner's counter-cyclical warning that agentic AI supply already exceeds demand and a market correction is likely provides sceptical ballast to the compounding-value thesis, drawing historical parallels to dot-com, energy and telecoms corrections. |
| v20 | Gartner Forecasts Supply Chain Management Software with Agentic AI Will Grow to $53 Billion by 2030 | Gartner | 2026-04 | The most recent Gartner vertical-specific forecast, tracking agentic AI in supply chain from under $2B in 2025 to $53B by 2030 (60% enterprise adoption), also identifies data-readiness and workforce AI-literacy as the binding constraints on deployment speed. |
| v21 | Gartner Top Strategic Technology Trends for 2025: Agentic AI | Gartner | 2024-10 | Gartner's technology radar placement of agentic AI as the top 2025 strategic trend, framing it as a 'goal-driven digital workforce' and projecting that by 2028 at least 15% of day-to-day work decisions will be made autonomously — the clearest Gartner framing of wave-three as a labour-market event. |
| v22 | CB Insights State of Venture Q1'25 Report | CB Insights | 2025-05 | Documents the Q1 2025 shift in AI dealmaking from infrastructure-dominated investment toward vertical application-layer platforms, with 63% of organisations placing significant importance on AI agents — marking the transition from wave-three infrastructure build-out to application-layer value capture. |
| v23 | The Crunchbase Unicorn Board: Rising Investors Behind the New Unicorn Class | Crunchbase | 2026-03 | Documents 187 new unicorns in 2025 (up 61% year-on-year), with AI-native companies accounting for 25% of the total and Sequoia and a16z dominating early-stage backing — providing the most recent empirical evidence on the rate of new-category creation in wave three. |
| v24 | Bain Technology Report 2025 (Full PDF): AI Leaders Are Extending Their Edge | Bain & Company | 2025-09 | Bain's full sixth annual report, documenting that AI leaders achieved 10–25% EBITDA gains in 2023–24 while laggards fell further behind, and examining how tech giants are competing at every layer — infrastructure, models, platforms, applications — to capture disproportionate value. |
| v25 | Foundation Capital: The AI Hype — $600B Question or $4.6T Opportunity? | Foundation Capital | 2024-11 | Directly rebuts Sequoia's Cahn thesis by expanding the addressable market to include labour costs ($2.3T in sales, marketing, software engineering and HR) plus outsourced IT and business-process services ($2.3T per Gartner), arguing the AI wave is addressing a $4.6T opportunity rather than competing for the existing software market. |
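The Sequoia revenue-gap figures in v2 and v3 rest on a back-of-envelope chain Cahn spells out in both posts: double GPU revenue to approximate total data-centre cost (energy, buildings, networking roughly match the GPU outlay), then double again to impose a 50% gross margin for the companies selling the resulting compute. A sketch of that chain, with run-rate inputs paraphrased from the two posts and best treated as approximations:

```python
# Sketch of the 'follow the GPUs' arithmetic behind Sequoia's $200B (v2)
# and $600B (v3) questions. Both multipliers are Cahn's stated assumptions:
# GPUs as roughly half of data-centre TCO, and a 50% end-user gross margin.

def implied_revenue_requirement(nvidia_run_rate_b: float) -> float:
    """End-user revenue (in $B/year) needed to justify GPU spend."""
    datacenter_cost = nvidia_run_rate_b * 2  # GPUs ~ half of total build cost
    return datacenter_cost * 2               # 50% margin for compute sellers

print(implied_revenue_requirement(50))   # → 200  (the 2023 '$200B question')
print(implied_revenue_requirement(150))  # → 600  (the 2024 '$600B question')
```

Foundation Capital's rebuttal (v25) does not dispute this arithmetic; it disputes the denominator, arguing the addressable market is labour spend rather than existing software revenue.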