Threat actors are abusing AI tools in increasingly sophisticated ways, including exploit development and attack orchestration.
Google today published new research tracking how adversaries use AI in their cyber operations. Since large language model (LLM) tools became widely available, threat actors have leveraged the technology in a wide range of ways, such as crafting phishing lures, coding malware, and conducting reconnaissance. As Google details, they are also using AI for vulnerability research and exploit development.
This research arrives as defenders contemplate how Anthropic’s Claude Mythos model (and by extension Project Glasswing) will reshape the security ecosystem for years to come; Anthropic claims Mythos is capable of finding critical zero-day vulnerabilities using natural language instructions.
While this report doesn’t claim threat actors are using anything like Mythos, Google’s Threat Intelligence Group (GTIG) covers some of the cutting-edge ways attackers are using AI today.
No Mythos Needed: Exploit Developed With AI
For example, GTIG said it identified a threat actor using a zero-day exploit the company believes was developed with AI, possibly the first of its kind. According to the report, the exploit is “implemented in a Python script that enables the user to bypass two-factor authentication (2FA) on a popular open-source, web-based system administration tool.” Exploitation requires valid user credentials.
The threat actor appeared to be planning to use the vulnerability at massive scale, so GTIG disclosed the bug to the affected vendor in the hopes of disrupting the threat activity.
“Although we do not believe Gemini was used, based on the structure and content of these exploits, we have high confidence that the actor likely leveraged an AI model to support the discovery and weaponization of this vulnerability,” the report read. “For example, the script contains an abundance of educational docstrings, including a hallucinated CVSS score, and uses a structured, textbook Pythonic format highly characteristic of LLMs training data (e.g., detailed help menus and the clean _C ANSI color class).”
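For readers unfamiliar with these stylometric tells, the harmless sketch below (entirely hypothetical, and not drawn from the actual exploit) illustrates the kind of scaffolding GTIG describes: verbose educational docstrings, an invented CVSS score, a tidy `_C` ANSI color class, and a detailed argparse help menu.

```python
#!/usr/bin/env python3
"""Placeholder utility illustrating LLM-style code scaffolding.

Severity: CVSS 9.8 (Critical)  # an invented score, mimicking the hallucinated
# CVSS rating GTIG flagged; this script performs no security-relevant actions
"""
import argparse


class _C:
    """ANSI color codes for terminal output, in the 'clean _C class' style."""
    RED = "\033[91m"
    GREEN = "\033[92m"
    RESET = "\033[0m"


def main() -> None:
    """Parse command-line arguments and print a colorized status banner."""
    parser = argparse.ArgumentParser(
        description="Harmless demo of textbook-Pythonic LLM output.",
        epilog="Example: python demo.py --target example.local",
    )
    parser.add_argument("--target", required=True,
                        help="Hostname label to echo back (no network activity)")
    args = parser.parse_args()
    print(f"{_C.GREEN}[+]{_C.RESET} Target set to {args.target}")


if __name__ == "__main__":
    main()
```

None of these traits is conclusive on its own; GTIG’s point is that their combination reads more like a model’s training data than like typical hand-rolled attacker tooling.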
Threat actors associated with China and North Korea have shown particular interest in using LLMs for vulnerability research. For example, GTIG has observed suspected Chinese actor UNC2814 prompting Gemini to take on the role of a network security researcher conducting vulnerability research into embedded devices, such as TP-Link router firmware. The actor told the AI it was “auditing it for pre-authentication remote code execution (RCE) vulnerabilities.”
North Korean actor Silent Chollima, also known as APT45, has been observed “sending thousands of repetitive prompts that recursively analyze different CVEs and validate PoC exploits.” This, Google said, produces more robust exploit capabilities than the model would offer otherwise. Threat actors have similarly trained models on a specialized vulnerability repository known as “wooyun-legacy,” which contains more than 85,000 real-world vulnerability cases collected by the Chinese bug bounty platform WooYun between 2010 and 2016.
Threat actors are also experimenting with agentic tools like OpenClaw and OneClaw to assist in vulnerability research.
AI-Powered Attack Orchestration
But one of the most striking use cases in the report involves AI-orchestrated attacks, illustrated by a malware family known as “PromptSpy.” This Android backdoor, first documented by ESET, abuses Gemini by prompting it to ensure the malicious app remains in the “recent apps” list.
GTIG’s analysis found that the backdoor uses AI for other purposes as well, primarily “centered around navigating the Android user interface and autonomously interpreting real-time user activity for follow-on actions.” For example, it can capture biometric data in order to replay authentication gestures and regain access to a compromised device.
Moreover, threat actors are using agentic workflows to “operationalize autonomous frameworks to execute multi-stage security tasks.” A China-nexus actor deployed agentic tools in an attack against a Japanese technology firm and an East Asian cybersecurity platform, according to the report. Agentic tools like Hextrike and Strix were used to maintain persistence across the attack surface and to automate vulnerability discovery and validation.
“This combination of autonomous reconnaissance and automated verification suggests a transition toward AI-driven frameworks that can scale discovery activities with minimal human oversight,” GTIG said.
Though still slight and limited to a handful of cases, the move from heavily human-driven operations to campaigns where AI takes more control is noteworthy. It mirrors the progression of AI on the defender side, where some organizations are shifting from human-in-the-loop thinking toward human-on-the-loop models, in which agents are the primary AI orchestrators making moment-to-moment decisions and humans intervene only when necessary.
John Hultquist, chief analyst of Google Threat Intelligence Group, tells Dark Reading that threat actors are already taking advantage of AI to enhance the speed and scale of their operations, and defenders must evolve lest they find themselves “facing machine time threats at human speed.”
“If defenders do not incorporate AI into their defenses, they will eventually find themselves dealing with a deluge of alerts and incidents, and they will struggle to keep up with an adversary that can operate faster than their patch cycle and quickly move laterally across their networks,” Hultquist says.
