2026-04-14 Ars Technica

UK Government Mythos AI Tests Cut Cybersecurity Hype, Identify Real Threats

The UK Cabinet Office’s Emerging Technology Cybersecurity Division (ETCD), working with the National Cyber Security Centre (NCSC), has released results from the first full‑scale evaluation of Mythos AI, a large language model purpose‑built to separate genuine cyber threats from industry hype. The system ingests raw threat intelligence from sources such as MISP, AlienVault Open Threat Exchange (OTX), and classified government feeds, then applies a hybrid symbolic‑neural reasoning pipeline to score each report’s credibility and potential impact.
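The article does not describe how the pipeline works internally, so the following is only a minimal sketch of what "scoring each report's credibility and potential impact" could look like when a rule-based (symbolic) score is blended with a model-produced one. Every name, field, rule, and weight here is hypothetical and chosen for illustration; none of it comes from ETCD or the NCSC.

```python
from dataclasses import dataclass


@dataclass
class ThreatReport:
    """Hypothetical report record; all fields are illustrative assumptions."""
    title: str
    corroborating_sources: int  # independent feeds repeating the claim
    has_cve_id: bool            # report cites a concrete CVE identifier
    model_credibility: float    # stand-in for a neural model's output, 0..1
    impact: float               # estimated impact if the claim is true, 0..1


def rule_credibility(report):
    """Symbolic part: hand-written rules, capped at 1.0."""
    score = 0.2 * min(report.corroborating_sources, 3)
    if report.has_cve_id:
        score += 0.4
    return min(score, 1.0)


def credibility(report, rule_weight=0.5):
    """Blend rule-based and model-based credibility (the weight is arbitrary)."""
    return (rule_weight * rule_credibility(report)
            + (1 - rule_weight) * report.model_credibility)


def rank(reports):
    """Sort by credibility times impact, highest risk first."""
    return sorted(reports, key=lambda r: credibility(r) * r.impact, reverse=True)


reports = [
    ThreatReport("Quantum attack breaks TLS next month", 0, False, 0.1, 0.9),
    ThreatReport("RCE exploited in the wild", 3, True, 0.9, 0.8),
]
ranked = rank(reports)
print([r.title for r in ranked])
```

In this toy ranking, a high-impact but uncorroborated claim falls below a well-corroborated one, which is the behavior a hype filter is meant to produce.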

In a controlled benchmark known as the Multistep Infiltration Challenge (MIC), Mythos AI was required to emulate a realistic APT scenario that progressed through three phases before exfiltrating a simulated data payload: initial reconnaissance through open‑source intelligence (OSINT) scraping, exploitation of a known vulnerability (CVE‑2023‑42793, a remote‑code‑execution flaw in a popular VPN gateway), and lateral movement via Pass‑the‑Hash credential reuse. The model completed the entire chain in 4.7 minutes of simulated time, achieving a success rate of 98.2 % across 1,000 randomized trial runs. NCSC analysts verified that Mythos correctly identified the vulnerability’s critical CVSS base score (9.8) and recommended mitigation steps, including patching to version 5.6.1 and enforcing IPsec tunnel isolation.
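For readers unfamiliar with how a figure like 9.8 arises, the sketch below computes a CVSS v3.1 base score from a vector string using the formula and metric weights published in the CVSS v3.1 specification. The article does not say which vector the benchmark used; the vector in the usage line is an assumption, picked because it is one combination that yields 9.8. The function names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification.
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "C": CIA, "I": CIA, "A": CIA,
}
# Privileges Required is weighted differently when scope changes.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(vector):
    """Compute the base score for a vector that starts with 'CVSS:3.1/'."""
    parts = dict(item.split(":") for item in vector.split("/")[1:])
    changed = parts["S"] == "C"
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[parts["PR"]]

    # Impact sub-score, built from confidentiality, integrity, availability.
    iss = 1 - ((1 - WEIGHTS["C"][parts["C"]])
               * (1 - WEIGHTS["I"][parts["I"]])
               * (1 - WEIGHTS["A"][parts["A"]]))
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = (8.22 * WEIGHTS["AV"][parts["AV"]]
                      * WEIGHTS["AC"][parts["AC"]] * pr
                      * WEIGHTS["UI"][parts["UI"]])

    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


# Assumed vector: network-reachable, no privileges or user interaction needed.
print(base_score("CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H"))
```

A score of 9.0 or above falls in the "Critical" severity band, which is why a 9.8 typically triggers an immediate patching recommendation.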

Beyond technical exploit detection, Mythos was tasked with a “hype‑filter” test set comprising speculative threat reports, unverified claims about quantum‑computing attacks, and misattributed ransomware incidents. By cross‑referencing each claim against a curated database of 1.2 million historical threat reports, the AI flagged 94 % of the unfounded reports as noise while retaining 91 % of genuine zero‑day alerts. This capability addresses a long‑standing pain point for analysts who waste an average of 12 hours per week chasing noise.
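The two percentages measure different things, and what an analyst's queue looks like afterward depends on how much hype there was to begin with. The worked example below makes that concrete. The report counts are hypothetical, since the article gives only rates, and the function name is illustrative.

```python
def filter_metrics(hype_total, hype_flagged, genuine_total, genuine_kept):
    """Summarize a hype filter's performance from raw counts."""
    hype_passed = hype_total - hype_flagged  # noise that slipped through
    return {
        # Share of unfounded reports correctly flagged as noise.
        "hype_flag_rate": hype_flagged / hype_total,
        # Share of genuine alerts that survived the filter.
        "genuine_retention": genuine_kept / genuine_total,
        # Share of the surviving queue that is genuine.
        "residual_precision": genuine_kept / (genuine_kept + hype_passed),
    }


# Hypothetical test set: 500 unfounded reports and 100 genuine alerts,
# with counts chosen to match the 94 % and 91 % rates quoted above.
metrics = filter_metrics(hype_total=500, hype_flagged=470,
                         genuine_total=100, genuine_kept=91)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumed counts, roughly three in four reports that reach an analyst would be genuine, compared with one in six before filtering. A different ratio of hype to genuine alerts would change that figure even with the same two rates.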

The trial results are already influencing policy discussions around the UK’s Cyber Security Strategy. ETCD director Sarah Whitmore said the model will be integrated into the NCSC’s automated threat‑intelligence pipeline by Q3 2026, supplementing human analysts with real‑time scoring and contextual risk ranking. Early adopters in the private sector, including BT Group and the UK’s largest retail banks, have expressed interest in licensing the underlying API, signaling a broader shift toward AI‑augmented threat intelligence across critical infrastructure.

Source: Ars Technica