Anthropic said Tuesday that it is sharing a preview version of its upcoming AI model as part of a new cybersecurity initiative with a coalition of tech companies to find and fix vulnerabilities in critical software infrastructure.
The Project Glasswing initiative includes tech stalwarts like Amazon, Apple, Broadcom, Cisco, CrowdStrike, the Linux Foundation, Microsoft, and Palo Alto Networks. Anthropic said the partners will use the model for defensive security work and distribute their findings within the industry at large. The company is also extending access to roughly 40 additional organizations that build or maintain critical software infrastructure.
Fears have been growing that bad actors could use powerful AI models to develop more sophisticated cyberattacks. “The work of defending the world’s cyber infrastructure might take years; frontier AI capabilities are likely to advance substantially over just the next few months,” Anthropic said in a blog post. “For cyber defenders to come out ahead, we need to act now.”
Anthropic is committing up to $100 million in model usage credits to the security research and $4 million in direct donations to open-source security organizations.
The company says it discovered the strong security applications of “Claude Mythos Preview” while training the model for coding and reasoning skills. It says users will eventually get access to other models in the Mythos class.
The Mythos model has already identified thousands of zero-day vulnerabilities over recent weeks, many of them critical, Anthropic said in the blog post. The model found a 27-year-old bug in OpenBSD, an operating system known for its security. It also found a 16-year-old vulnerability in a widely used piece of video software that automated testing tools had failed to catch.
Anthropic researchers say they set the model to work finding and exploiting weaknesses in a set of one thousand open-source software repositories. They scored the severity of the resulting crashes on a scale from one to five, with tier 1 representing basic crashes and tier 5 representing complete control-flow hijacks. In the same test, Mythos Preview’s predecessors, Sonnet 4.6 and Opus 4.6, each produced between 150 and 175 tier 1 crashes and 100 tier 2 crashes, but only a single tier 3 crash. Mythos Preview achieved 595 crashes at tiers 1 and 2, a handful of crashes at tiers 3 and 4, and full control-flow hijacks on 10 separate, fully patched targets (tier 5). Mythos, Anthropic said, was not specifically trained to execute any of these exploits; the ability emerged as a consequence of general improvements in coding, reasoning, and acting autonomously.
The company said it has been in ongoing discussions with U.S. government officials about the model’s offensive and defensive cyber capabilities. Anthropic framed the initiative as urgent, arguing that similar AI capabilities will soon become available to bad actors.
Anthropic was involved in a spat with the Pentagon last month over its opposition to defense contract terms that would have allowed the government to use its technology for domestic surveillance and in autonomous weapons. That feud set in motion a dissolution of their working relationship that is still playing out.
