The UK's AI Security Institute has published an initial evaluation of Anthropic’s new Mythos Preview model, revealing that while it excels at individual cybersecurity tasks, its real strength lies in chaining these tasks into complex attack sequences.
Mythos can now complete over 85% of the group’s Apprentice-level Capture the Flag challenges, a significant step up from earlier models such as GPT-3.5 Turbo and from contemporaries such as GPT-5.4.
However, it is in 'The Last Ones' test that Mythos truly shines. The test simulates a 32-step data extraction attack on a corporate network, a task that would take a human around 20 hours to complete. This highlights the increasing complexity of AI-driven cyber threats and raises questions about our preparedness.
With Mythos set for a limited release to critical industry partners, the race is on to see if this model can indeed outmanoeuvre both humans and other AI systems in the digital battlegrounds of tomorrow.







