AI in Weather and Climate Prediction

AI in weather and climate prediction across the machine-learning era from 2015 to June 2026, with historical context from mid-twentieth-century numerical weather prediction and Lorenz's chaos theory. Topics covered: the shift from physics-based NWP and statistical post-processing (MOS) to data-driven models (GraphCast, GenCast, Pangu-Weather, FourCastNet, Aurora, NeuralGCM, ECMWF AIFS); how forecasters at ECMWF, NOAA, and the Met Office have operationalised them; how their accuracy compares with the IFS; and the predictability limits imposed by chaos, as illustrated by the Lorenz attractor and the butterfly effect.

Claude Opus 4.8

Synthesised 2026-06-26

Narrative

The dominant Substack voice for technically fluent independent commentary on AI weather prediction is Karolina Stanisławska's AI Weather Hub, which bridges the gap between ML engineers and practising meteorologists. Her January 2025 piece on GenCast explains the model's conditional diffusion architecture with unusual clarity, showing how the ensemble emerges from random noise seeding rather than from perturbed initial conditions as in classical NWP, and situating the butterfly effect squarely within that design choice. Phil Siarri's Philaverse Substack (July 2025) and the CLAI Ventures Substack (January 2025) offer more investor-oriented but technically specific surveys of the operational landscape, noting that ECMWF, the UK Met Office, NOAA, and several Asian meteorological agencies are at different stages of evaluation or deployment. The Open-Meteo Substack by Patrick Zippenfenig provides practitioner commentary on integrating GraphCast into live APIs, including the mismatch between ERA5 training data and GFS initialisation that produces systematic inconsistencies.

Two structural tensions dominate the independent commentary. First, every major ML weather model currently depends on classical NWP for its input state: data assimilation, which combines observations with a short-range model forecast to produce a coherent atmospheric analysis, remains a physics-based step upstream of even the most capable data-driven forecast. Several independent writers make this point explicitly, and ECMWF's own AIFS blog series supports it: the series tracks accuracy against the IFS and openly catalogues blurriness in surface fields and spurious negative precipitation values, which were not corrected until the August 2025 AIFS v1.1.0 release. The ECMWF AIFS accuracy-versus-activity blog post from December 2024 provides one of the few candid operational scorecards, showing that forecast activity does not drift with lead time for AIFS, unlike some third-party models, but that stratospheric skill remains limited. Second, commentators diverge on whether ML models have altered the predictability horizon or merely approached it more cheaply. Research reviewed in multiple independent pieces suggests AI models may actually attenuate small-perturbation growth, meaning they understate the butterfly effect rather than overcoming it; the FuXi ensemble study in npj Climate and Atmospheric Science (2025) is the most direct evidence for this.
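The small-perturbation growth at issue here can be illustrated with the classic Lorenz-63 system. The following is a minimal sketch, not taken from any of the cited sources: it assumes the standard textbook parameters (sigma = 10, rho = 28, beta = 8/3), a fourth-order Runge-Kutta integrator, and an arbitrary starting point and perturbation size chosen only for illustration. It integrates two trajectories that differ by one part in a hundred million and records how far apart they drift.

```python
import math

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Time derivative of the Lorenz-63 system (standard textbook parameters)."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = lorenz63(state)
    k2 = lorenz63(shifted(state, k1, dt / 2))
    k3 = lorenz63(shifted(state, k2, dt / 2))
    k4 = lorenz63(shifted(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def separation_history(perturbation=1e-8, dt=0.01, steps=4000):
    """Distance between a control run and a slightly perturbed run at each step.

    The starting point (1, 1, 1) and the perturbation size are illustrative
    assumptions, not values from any operational system.
    """
    control = (1.0, 1.0, 1.0)
    perturbed = (1.0 + perturbation, 1.0, 1.0)
    history = []
    for _ in range(steps):
        control = rk4_step(control, dt)
        perturbed = rk4_step(perturbed, dt)
        history.append(math.dist(control, perturbed))
    return history

if __name__ == "__main__":
    history = separation_history()
    for step in (0, 999, 1999, 2999, 3999):
        print(f"t = {(step + 1) * 0.01:5.1f}  separation = {history[step]:.3e}")
```

In this toy setting the separation grows roughly exponentially until it saturates at the size of the attractor. A forecast model that attenuates perturbation growth would show a flatter curve than the system it emulates, which is the concern the FuXi study raises for ML ensembles.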

The physical-consistency problem receives sustained independent attention. Massimo Bonavita's 2024 Geophysical Research Letters paper, widely cited in practitioner blogs, documents that ML forecast energy spectra differ from those of reanalysis and NWP models, producing overly smooth forecasts. The 2025 Journal of Advances in Modeling Earth Systems paper by Sha and colleagues demonstrates that adding global mass and energy conservation schemes to FuXi reduces forecast error and corrects systematic biases such as excess light rain. ECMWF's own September 2025 AIFS update paper formalises this finding for AIFS, introducing output bounding layers. These are not peripheral technical details: they are the crux of whether ML models can be trusted for extreme events, where physical plausibility matters most and where the training distribution is thinnest.
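Why an overly smooth forecast can still score well on RMSE is easiest to see in a toy example. The sketch below is an illustration under stated assumptions, not a reproduction of any cited analysis: the one-dimensional "truth" field, its wavenumbers and amplitudes, and the quarter-wavelength displacement are all invented for the demonstration, and "activity" is simplified to the standard deviation of a field about its own mean rather than the anomaly-based definition used operationally.

```python
import math

N = 360
GRID = [2 * math.pi * i / N for i in range(N)]

def field(small_scale_amplitude, small_scale_phase):
    """Large-scale wave plus an optional small-scale wave (illustrative only)."""
    return [
        math.sin(x) + small_scale_amplitude * math.sin(20 * x + small_scale_phase)
        for x in GRID
    ]

def rmse(forecast, truth):
    """Root-mean-square error between two fields on the same grid."""
    return math.sqrt(sum((f - t) ** 2 for f, t in zip(forecast, truth)) / len(truth))

def activity(values):
    """Simplified activity: standard deviation of the field about its mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

truth = field(0.5, 0.0)
sharp = field(0.5, math.pi / 2)  # realistic small-scale detail, but misplaced
smooth = field(0.0, 0.0)         # small-scale detail removed entirely

if __name__ == "__main__":
    for name, forecast in (("sharp", sharp), ("smooth", smooth)):
        print(
            f"{name:6s} rmse = {rmse(forecast, truth):.3f}  "
            f"activity = {activity(forecast):.3f}  "
            f"(truth activity = {activity(truth):.3f})"
        )
```

In this construction the smooth forecast wins on RMSE because the sharp forecast is penalised twice, once for missing each small-scale feature and once for placing it elsewhere, yet only the sharp forecast retains the variability of the truth. This is the sense in which RMSE alone does not penalise blurriness, and why activity and spectral diagnostics are reported alongside it.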

The outlook in independent commentary is more cautious than in the vendor announcements. The articsledge.com analysis (May 2026) notes that AI models remain downstream of classical assimilation and that climate-change-driven distribution shift poses a structural challenge: models trained on 1979 to 2017 ERA5 data are being asked to forecast an atmosphere that is systematically warmer in ways that have no close historical analogue. The GeoAI Unpacked Substack notes that many national weather services run legacy Fortran code, raising the practical question of whether ML models will complement or displace existing infrastructure. The Science Advances paper from April 2026 on physics-based models outperforming AI for record-breaking extremes is the sharpest independent counter-evidence to the headline accuracy claims, and it sits uneasily alongside the vendor benchmarks.

Sources

ID	Title	Outlet	Date	Significance
b1	[AI Weather Hub | Karolina Stanisławska | Substack](https://aiweatherhub.substack.com/)	AI Weather Hub (Substack)		
b2	GenCast – AI meets ensemble forecasting	AI Weather Hub (Substack)	2025-01	Explains GenCast's conditional diffusion ensemble architecture and its relationship to the butterfly effect, cross-referencing the Nature paper with accessible independent analysis.
b3	Why AI Weather? – AI Weather Hub	AI Weather Hub (Substack)	2024-10	Sets out the independent case for why ML weather forecasting is a distinct paradigm shift from statistical post-processing, contextualising it against the ChatGPT moment in AI.
b4	AI and the future of weather forecasting – Phil Siarri	Philaverse (Substack)	2025-07	Named-author survey of operational deployment status at ECMWF, Met Office, NOAA, and Asian agencies, with model-by-model architecture summaries and citation of primary literature.
b5	A Breakthrough Year for AI in Weather Forecasting: Insights and Opportunities	CLAI Ventures (Substack)	2025-01	Investor-practitioner analysis covering NeuralGCM and GenCast with specific benchmark figures, and noting the resolution gap between GenCast at 0.25° and ENS at 0.1°.
b6	Exploring GraphCast – Open-Meteo	Open-Meteo (Substack)	2024-04	Practitioner API developer commentary on the ERA5-training versus GFS-initialisation mismatch in production GraphCast, and on NOAA's retraining effort to resolve it.
b7	AI for weather forecasting – GeoAI Unpacked #2	GeoAI Unpacked (Substack)	2024-10	Independent analysis highlighting the blurriness problem in AIFS versus IFS outputs, legacy Fortran infrastructure at national weather services, and the tension between accuracy metrics and visual realism.
b8	So, which weather forecast is the best? – ActuallyWeather	ActuallyWeather (Substack)	2025-12	Independent real-world verification of ECMWF, GFS, HRRR, and GraphCast using location-specific skill scores, providing evidence outside vendor-reported benchmarks.
b9	AI Weather Forecasting 2026: Models, Accuracy & Results	ArticlEdge	2026-05	Comprehensive independent synthesis citing primary benchmarks and noting the distribution-shift risk as AI models trained on 1979–2017 ERA5 are deployed into a systematically warmer 2026 atmosphere.
b10	AIFS: a new ECMWF forecasting system	ECMWF Newsletter	2024-01	Primary institutional source documenting ECMWF's architectural choice of GNNs for AIFS, its ERA5 training regime, and its explicit comparison to GraphCast and Pangu-Weather.
b11	Accuracy versus activity – ECMWF AIFS Blog	ECMWF	2024-12	Operational scorecard from ECMWF comparing AIFS, GraphCast, Pangu-Weather, and Aurora on RMSE and forecast activity metrics, including Aurora's instability beyond day 7.
b12	Anemoi: a new framework for weather forecasting based on machine learning	ECMWF	2024-10	Documents ECMWF's open-source Anemoi framework enabling national meteorological services to build regional ML models on the same architecture as AIFS.
b13	GraphCast: AI model for faster and more accurate global weather forecasting	Google DeepMind Blog	2023-11	Primary vendor announcement for GraphCast, containing the headline claim that it outperforms HRES on 90% of 1,380 verification targets, and the case study of Hurricane Lee's landfall predicted nine days in advance.
b14	GenCast predicts weather and the risks of extreme conditions with state-of-the-art accuracy	Google DeepMind Blog	2024-12	Primary vendor source for GenCast's diffusion-based ensemble design and the claim that it outperforms ECMWF ENS on 97.2% of evaluated targets, providing the vendor-side data that independent commentary cross-references.
b15	Probabilistic weather forecasting with machine learning (GenCast)	Nature	2024-12	Peer-reviewed primary source for GenCast, confirming that prior ML ensemble methods suffered from blurring and that GenCast is the first to outperform ENS at 0.25° resolution.
b16	On Some Limitations of Current Machine Learning Weather Prediction Models	Geophysical Research Letters	2024-06	Massimo Bonavita's widely cited analysis documenting that ML forecast energy spectra differ from NWP and reanalysis, producing blurriness that standard RMSE metrics do not penalise.
b17	Improving AI Weather Prediction Models Using Global Mass and Energy Conservation Schemes	Journal of Advances in Modeling Earth Systems	2025-11	Demonstrates that adding conservation-law constraints to FuXi reduces forecast error and corrects the drizzle bias, directly addressing the physical-consistency critique.
b18	An update to ECMWF's machine-learned weather forecast model AIFS	arXiv / ECMWF	2025-09	Documents the August 2025 AIFS v1.1.0 update incorporating output bounding layers to prevent physically implausible outputs such as negative precipitation.
b19	Evaluation of five global AI models for predicting weather in Eastern Asia and Western Pacific	npj Climate and Atmospheric Science	2024-09	Independent homogeneous comparison of five ML models under identical ERA5 initial conditions, finding FengWu leads for typhoon track prediction and that a multi-model ensemble rivals the best single model.
b20	A fast physics-based perturbation generator of machine learning weather model for efficient ensemble forecasts of tropical cyclone track	npj Climate and Atmospheric Science	2025-03	Finds that FuXi attenuates small-perturbation growth compared to IFS, suggesting AI weather models may understate the butterfly effect rather than overcome it.
b21	An Observations-focused assessment of Global AI Weather Prediction Models During the South Asian Monsoon	arXiv	2025-09	Evaluates seven AI models against 458 weather stations and satellite data, finding AIFS leads on most metrics but all models show substantially higher errors against ground observations than against reanalysis.
b22	Validating Deep Learning Weather Forecast Models on Recent High-Impact Extreme Events	Artificial Intelligence for the Earth Systems (AMS)	2025-01	Case studies on the 2021 Pacific Northwest heatwave and 2021 winter storm find that ML models match HRES locally but underperform it when aggregated, and that they lack the variables needed for humid heatwave health-risk assessment.
b23	Physics-based models outperform AI weather forecasts of record-breaking extremes	Science Advances	2026-04	The most direct counter-evidence to headline ML accuracy claims, showing physics-based NWP retains an advantage for truly unprecedented extreme events outside training distribution.
b24	AI-Driven Weather Forecasts to Accelerate Climate Change Attribution of Heatwaves	Earth's Future (AGU)	2025-08	Demonstrates a new application of AI weather models for near-real-time attribution of heatwaves to anthropogenic climate change, showing NeuralGCM's hybrid physics advantage for SST-dependent events.
b25	Weather forecasting in a changing climate: the rise of AI and Machine learning?	ScienceDirect (journal article)	2026-05	Most recent practitioner review, documenting AIFS superiority for tropical cyclone track prediction and the open Anemoi framework, with a frank assessment of remaining resolution and coupling gaps.