Artificial Intelligence

Researchers say AI just broke every benchmark for autonomous cyber capability

Washington, D.C. — Two independent research reports have raised new alarms over the rapid acceleration of artificial intelligence in cybersecurity, finding that leading frontier models are now outperforming established expectations for autonomous cyber task performance by a wide margin.

The studies, released by the UK’s AI Security Institute (AISI) and cybersecurity firm Palo Alto Networks, suggest that AI systems are completing complex cyber operations at a speed and reliability level that significantly exceeds previously observed trends.

Frontier AI Models Outpace Cyber Capability Forecasts

Researchers evaluated two of the most advanced AI systems currently in testing: Anthropic’s Claude Mythos Preview and OpenAI’s GPT-5.5. Both models reportedly surpassed existing performance growth curves that researchers had been tracking since late 2024.

The AISI had previously estimated that frontier AI systems were doubling their ability to reliably complete long-horizon cybersecurity tasks roughly every five months. The latest results indicate that the pace has quickened, with some estimates now suggesting a doubling time closer to four months.
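To put the two doubling times in perspective, the short sketch below converts each into an implied growth factor over twelve months. It assumes smooth exponential growth with a constant doubling time, which is a simplification the reports themselves do not guarantee; the function name `growth_factor` is illustrative and not taken from either study.

```python
def growth_factor(months: float, doubling_time_months: float) -> float:
    """Multiplicative growth over `months`, assuming a constant doubling time."""
    if doubling_time_months <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (months / doubling_time_months)


# Compare the earlier five-month estimate with the newer four-month estimate.
for doubling_time in (5.0, 4.0):
    factor = growth_factor(12.0, doubling_time)
    print(f"doubling every {doubling_time:.0f} months -> about x{factor:.1f} per year")
```

Under that assumption, a five-month doubling time implies a bit more than a fivefold increase per year, while a four-month doubling time implies an eightfold increase.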

Officials cautioned that while the trend is clear, it remains uncertain whether this represents a sustained acceleration or a temporary spike in capability.

Advanced Cyber Ranges Reveal Unprecedented Performance

To approximate real-world performance, AISI researchers used controlled “cyber ranges” that simulate multi-stage attacks on enterprise systems.

In these tests, Claude Mythos Preview became the first model to successfully complete both of the institute’s most challenging scenarios. It managed to solve a 32-step simulated breach scenario titled “The Last Ones” in six out of ten attempts and completed the previously unsolved “Cooling Tower” simulation in three out of ten attempts.

GPT-5.5 also demonstrated strong performance, successfully completing “The Last Ones” in three out of ten trials, marking a significant leap in autonomous problem-solving capabilities in cybersecurity environments.
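Because each scenario was attempted only ten times, the reported success rates carry wide statistical uncertainty. As a rough illustration, the sketch below computes a 95% Wilson score interval for each result. It treats the attempts as independent trials with a fixed success probability, which is an assumption made here and not a method described by the AISI; the function name `wilson_interval` is likewise illustrative.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if trials <= 0 or not 0 <= successes <= trials:
        raise ValueError("need trials > 0 and 0 <= successes <= trials")
    p = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    center = (p + z2 / (2.0 * trials)) / denom
    half_width = z * math.sqrt(p * (1.0 - p) / trials + z2 / (4.0 * trials**2)) / denom
    return center - half_width, center + half_width


# Success counts as reported in the article (successes, attempts).
results = {
    "Claude Mythos Preview, The Last Ones": (6, 10),
    "Claude Mythos Preview, Cooling Tower": (3, 10),
    "GPT-5.5, The Last Ones": (3, 10),
}

for label, (successes, trials) in results.items():
    low, high = wilson_interval(successes, trials)
    print(f"{label}: {successes}/{trials}, 95% interval {low:.2f} to {high:.2f}")
```

Under these assumptions, six successes in ten attempts is consistent with an underlying success rate anywhere from roughly 31% to 83%, and three in ten with roughly 11% to 60%, so the gap between the two models on “The Last Ones” should be read with caution.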

Industry Testing Shows Rising Vulnerability Discovery Rates

Independent assessments by Palo Alto Networks echoed the findings, reporting that advanced AI models are now capable of identifying and mapping vulnerabilities across software systems at unprecedented scale.

The company said its testing program, conducted through partnerships involving Anthropic’s Project Glasswing and OpenAI’s Trusted Access for Cyber initiative, revealed that AI systems are increasingly able to translate vulnerabilities into actionable exploit pathways in near real time.

In one case, AI-assisted scanning across more than 130 products led to the discovery of 26 CVEs, significantly exceeding typical monthly discovery rates.

Security Experts Urge Rapid Defensive Adaptation

Cybersecurity researchers and industry leaders are warning that the accelerating pace of AI-driven capability growth could significantly reshape both offensive and defensive cyber operations.

Palo Alto Networks emphasized that organizations must prioritize proactive vulnerability patching, reduction of attack surfaces, and faster incident response systems capable of operating in near real time.

Experts also highlighted the growing importance of integrating AI into defensive security operations to match the speed of potential AI-enabled attacks.

Uncertainty Over Long-Term Trajectory

Despite the striking results, the AI Security Institute stressed that its findings are based on a limited set of models and controlled environments. Researchers noted that benchmark uncertainty increases at the highest difficulty levels, making precise forecasting challenging.

Still, independent analysis from multiple research groups, including METR, suggests a consistent pattern: AI systems are rapidly improving their ability to execute complex software and cybersecurity tasks at a rate that continues to accelerate.

Growing Implications for Global Cybersecurity

The findings arrive at a time when governments and private companies are already grappling with the implications of increasingly autonomous AI systems. Security analysts warn that faster and more capable AI could dramatically shorten the timeline for cyberattacks, forcing organizations to respond within minutes rather than hours or days.

Researchers say future evaluation frameworks will need to evolve quickly to keep pace with emerging capabilities, including more realistic cyber defense scenarios and live-system testing environments.


Copyright © 2023 Cyber Reports Cyber Security News All Rights Reserved Website by Top Search SEO