AI Regulation and the Regulated Enterprise - Trajectory to 2030

The trajectory of AI regulation across the EU AI Act, the UK's pro-innovation and contextual approach, and the financial-services regulatory regime (FCA, PRA, Bank of England) from January 2023 to May 2026, including the FCA Mills Review, GPAI obligations, model-risk and accountability rules, and what they demand of technology leadership in regulated firms

Gemini 2.5 Pro
frontier
academic
vc

Synthesised 2026-05-19

Narrative

From early 2023 to mid-2026, frontier AI labs shipped new models at a rapid pace while external evaluation and safety work expanded alongside them. OpenAI continued to iterate on its GPT series, releasing GPT-4 in 2023 with enhanced capabilities and introducing GPT-4o in May 2024, followed by a series of GPT-5.x models in 2025 and 2026, including GPT-5.4, positioned as a frontier model for professional work, and GPT-5.5 Instant, aimed at improved conversational use. Anthropic also progressed its Claude family, launching the Claude 3 models (Haiku, Sonnet, Opus) in March 2024 with multimodal input and expanded context windows, and then Claude 4 (Opus 4, Sonnet 4) in May 2025, which brought significant improvements in coding, reasoning, and autonomous task execution.

Google DeepMind focused on its Gemini series, releasing Gemini Nano and Pro in December 2023 and Gemini Ultra 1.0 in February 2024. By late 2025 and early 2026, Gemini 3 had emerged with advanced multimodal understanding and reasoning, and Gemini Deep Think was reported to solve professional research problems in mathematics, physics, and computer science. Meta AI contributed to the open-source landscape with the Llama series: Llama 3 was first released in April 2024, and the July 2024 paper describing the full family, including a 405B-parameter model with a 128K-token context window, covered multilinguality, coding, reasoning, and tool usage. Llama 4 followed in April 2025, featuring a sparse Mixture-of-Experts architecture and extended context windows.

Mistral AI rapidly expanded its offerings: Mistral Large 24.11 and Codestral 25.01, announced in late 2024 and early 2025, focused on reasoning, coding, and long context. Mistral 3 followed in December 2025, comprising a sparse mixture-of-experts model and smaller dense models, with an emphasis on open-source and efficient AI. xAI, founded by Elon Musk in July 2023, launched its Grok series; Grok 3, trained on the Colossus supercomputer, was released in February 2025, and Grok 4.20 Reasoning and Non-Reasoning variants appeared in March 2026. The Grok models are notable for real-time access to X/Twitter data and an irreverent tone.

External evaluations from METR (Model Evaluation & Threat Research) provided independent assessments of these frontier models. METR published a report on GPT-4 and Claude in March 2023 and went on to evaluate subsequent releases, including GPT-4o, Claude 3.5 Sonnet, GPT-4.5, Claude 3.7, GPT-5, and GPT-5.1-Codex-Max, through 2024 and 2025. In January 2026, METR released Time Horizon 1.1, an updated methodology for measuring autonomous AI capabilities, which indicated that the rate of capability progress has accelerated since 2023. The Frontier Model Forum, established in July 2023 by Anthropic, Google, Microsoft, and OpenAI, also published research on AI safety frameworks and risk management.

Sources

ID	Title	Outlet	Date	Significance
t1	GPT-4 and Claude	METR	2023-03	This early METR report provides a baseline evaluation of the autonomous capabilities of foundational models like GPT-4 and Claude, highlighting their performance in early 2023.
t2	LLaMA: Open and Efficient Foundation Language Models	Meta Research	2023-02-24	This paper introduces Meta's LLaMA models, demonstrating that state-of-the-art performance can be achieved with publicly available datasets and releasing models to the research community.
t3	Introducing the Frontier Model Forum	Frontier Model Forum	2023-07-26	This announcement marks the formation of the Frontier Model Forum by Anthropic, Google, Microsoft, and OpenAI, signalling a collaborative effort towards AI safety and responsible development.
t4	The Llama 3 Herd of Models	AI at Meta	2024-07-23	This paper details the Llama 3 family of models, including a 405B parameter model with a 128K token context window, demonstrating Meta's advancements in multilinguality, coding, reasoning, and tool usage.
t5	GPT-4o	METR	2024-08-07	METR's evaluation of GPT-4o provides an independent assessment of OpenAI's model capabilities, contributing to the understanding of its performance and risks.
t6	o1-preview	METR	2024-09-12	This METR evaluation of OpenAI's o1-preview model offers insights into the performance of a specific model variant, particularly useful for tracking incremental advancements.
t7	Claude 3.5 Sonnet (original)	METR	2024-10-30	METR's evaluation of Claude 3.5 Sonnet provides an external benchmark for Anthropic's model, detailing its capabilities and performance at the time of release.
t8	Announcing Mistral AI's Mistral Large 24.11 and Codestral 25.01 models on Vertex AI	Google Cloud Blog	2025-01-14
t9	Claude 3.5 Sonnet and o1	METR	2025-01-31	METR's evaluation of Claude 3.5 Sonnet and OpenAI's o1 provides a comparative assessment of these models, offering insights into their relative strengths and weaknesses.
t10	GPT-4.5	METR	2025-02-27	This METR report on GPT-4.5 provides an independent evaluation of OpenAI's model, detailing its performance and contributing to the understanding of its capabilities.
t11	Claude 3.7	METR	2025-04-04	METR's evaluation of Claude 3.7 offers an external assessment of Anthropic's model, providing data on its autonomous capabilities and potential risks.
t12	OpenAI o3 and o4-mini	METR	2025-04-16	This METR report evaluates OpenAI's o3 and o4-mini models, providing insights into their performance and suitability for various tasks.
t13	Anthropic Claude 4: Evolution of a Large Language Model	IntuitionLabs	2025-05-22
t14	GPT-5	METR	2025-08-07	METR's evaluation of GPT-5 provides a critical, independent assessment of OpenAI's next-generation model, focusing on its autonomous capabilities and potential risks.
t15	GPT-5.1-Codex-Max	METR	2025-11-19	This METR report specifically evaluates GPT-5.1-Codex-Max, offering insights into its coding capabilities and potential for catastrophic risks like AI self-improvement.
t16	Gemini 3 for Technical Documentation: Industry Disruption Predictions and Adoption Roadmap 2025-11-20	Sparkco	2025-11-20	This report highlights Gemini 3's advancements in multimodal alignment and context window, positioning it as a competitor to GPT-5 in image-text integration and technical documentation.
t17	Introducing Mistral 3	Mistral AI	2025-12-02	Mistral AI's announcement of Mistral 3, including a sparse mixture-of-experts model and smaller dense models, signifies their commitment to open-source, high-performing, and efficient AI.
t18	The state of open source AI models in 2025	Red Hat Developer	2026-01-07
t19	What's in Grok? (Independent Grok 5 Paper)	LifeArchitect.ai	2026-01-21	This independent report provides a quantitative analysis of xAI's Grok models, detailing their rapid evolution from a 33B parameter prototype to frontier models with trillions of parameters, despite xAI's secrecy.
t20	Time Horizon 1.1	METR	2026-01-29	METR's release of Time Horizon 1.1 updates their methodology for measuring AI autonomous capabilities, indicating an increased rate of progress in AI capabilities since 2023.
t21	Gemini Deep Think: Redefining the Future of Scientific Research	Google DeepMind	2026-02-11	This announcement details Gemini Deep Think's ability to solve professional research problems in mathematics, physics, and computer science, demonstrating advanced reasoning capabilities.
t22	Anthropic's Transparency Hub	Anthropic	2026-02-20	Anthropic's Transparency Hub provides detailed information on Claude models, including Opus 4.7, highlighting their multimodal capabilities, knowledge cut-off dates, and development resources.
t23	Evaluating AI Providers' Frontier AI Safety Frameworks	arXiv	2026-03-26	This arXiv paper assesses the frontier AI safety frameworks of twelve AI companies, revealing that many aspects are missing or under-specified, limiting their effectiveness as accountability mechanisms.
t24	Grok AI: The Complete Guide to Elon Musk's Chatbot (2026)	LifeArchitect.ai	2026-03-29	This guide provides a comprehensive overview of xAI's Grok, detailing its unique characteristics like real-time access to X/Twitter data, irreverent tone, and advanced features such as DeepSearch and Big Brain Mode.
t25	Latest news	Mistral AI	2026-04-29