Researchers from Anthropic say they uncovered what appears to be the first case where an artificial intelligence system was used to run a cyberattack campaign with a high degree of automation. The company reports it disrupted the operation, which its investigators link to the Chinese government.
According to Anthropic, the AI took on tasks that would normally require skilled human operators, allowing the attackers to scale their efforts. “While we predicted these capabilities would continue to evolve, what has stood out to us is how quickly they have done so at scale,” they wrote in their report.
The campaign focused on firms and agencies in sectors such as technology, finance, chemicals, and government. Anthropic says roughly thirty global targets were probed and that attackers found limited success in a few instances. The company detected the activity in September, moved to stop it, and informed affected organisations.
Anthropic warned that so‑called AI “agents” — systems that can access tools and take actions beyond simple chat can be repurposed by hostile groups. “Agents are valuable for everyday work and productivity but in the wrong hands, they can substantially increase the viability of large-scale cyberattacks,” the researchers concluded. “These attacks are likely to only grow in their effectiveness.”
The report describes how the attackers used “jailbreaking” methods to trick Anthropic’s Claude model into ignoring safety limits, posing as staff of a legitimate cybersecurity firm to bypass protections. That tactic highlights a broader problem in current models. “This points to a big challenge with AI models, and it’s not limited to Claude, which is that the models have to be able to distinguish between what’s actually going on with the ethics of a situation and the kinds of role-play scenarios that hackers and others may want to cook up,” said John Scott-Railton, senior researcher at Citizen Lab.
Industry observers have raised similar alarms. Microsoft has warned that state-backed actors are turning to AI to make campaigns faster and less labor-intensive, and members of OpenAI’s safety board have said they are watching for new systems that could boost attackers’ capabilities.
Experts note that automating attacks lowers the bar for smaller groups and lone actors, who might now run much larger operations. “The speed and automation provided by the AI is what is a bit scary,” Arellano said, “Instead of a human with well‑honed skills attempting to hack into hardened systems, the AI is speeding those processes and more consistently getting past obstacles.”
At the same time, defenders are turning to AI to detect and block such threats, underscoring that automation will shape both offense and defence in cyber security going forward.
Leave a comment