Ptechhub

Leading AI models are more vulnerable to malicious prompts than vendors claim

By CIO Dive
May 27, 2026

Dive Brief:

  • Major AI developers’ model-safety claims rest on incorrect assumptions about how hackers behave, Cisco researchers said in a report published on Wednesday.
  • AI vendors assume that their models are safe from hijacking if they can fend off a single malicious prompt at a time, but hackers are increasingly using multistage prompts to evade model defenses, Cisco said, and most models aren’t prepared for those kinds of attacks.
  • The new report illustrates a mostly underappreciated danger lurking inside AI models, one that could expose businesses using these tools to a wide range of disruptions and harm.

Dive Insight:

Cisco’s evaluation of 15 leading AI models from OpenAI, Anthropic, Google, Amazon and xAI “found that single-turn attack success rate (ASR) is not a reliable proxy for what happens when an attacker can adapt across turns,” researchers Nicholas Conley and Amy Chang wrote.

Their tests revealed that AI models were much more susceptible to multi-turn malicious prompts: success rates ranged from 8% to 88% for multi-turn attacks, compared with 2% to 65% for single-turn prompts.

“Every model we tested exhibited non-trivial multi-turn ASR,” Conley and Chang wrote.
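To make the metric concrete, here is a minimal sketch of how an attack success rate could be tallied in each regime. The trial data is invented for illustration, and the rule that a multi-turn conversation counts as a success if any turn elicits a harmful response is an assumption, not a description of Cisco's methodology.

```python
from typing import List


def attack_success_rate(outcomes: List[bool]) -> float:
    """Return the percentage of attack attempts that succeeded."""
    if not outcomes:
        raise ValueError("need at least one attempt")
    return 100.0 * sum(outcomes) / len(outcomes)


def conversation_succeeded(turn_outcomes: List[bool]) -> bool:
    # Assumption: a multi-turn attack succeeds if ANY turn gets through.
    return any(turn_outcomes)


# Hypothetical results: one boolean per single-turn prompt.
single_turn = [False, False, True, False]

# Hypothetical results: one list of per-turn outcomes per conversation.
multi_turn = [
    [False, False, True],
    [False, False, False],
    [False, True],
    [False, True, True],
]

single_asr = attack_success_rate(single_turn)
multi_asr = attack_success_rate([conversation_succeeded(c) for c in multi_turn])

print(f"single-turn ASR: {single_asr:.1f}%")  # 25.0%
print(f"multi-turn ASR:  {multi_asr:.1f}%")   # 75.0%
```

In this toy data the same set of scenarios yields a much higher rate once the attacker is allowed to adapt across turns, which is the kind of divergence the researchers describe.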

The two researchers previously collaborated on a November 2025 report that found open-weight AI models were between two and 10 times as vulnerable to multi-turn attacks as they were to single-turn attacks.

“The pattern we documented in open models holds in closed ones,” they wrote in their new study. “No frontier closed model in this cohort can be characterized as safe under iterative attack. This is a claim about the current state of the closed-model frontier, not about any single vendor.”

One of the study’s most significant findings was a correlation between AI companies’ priorities and their models’ safety. Conley and Chang found that AI developers that publicly emphasized their models’ increasing power produced models with the biggest gap between vulnerability to single-turn attacks and vulnerability to multi-turn attacks. Developers whose public statements emphasized model safety had smaller disparities, suggesting a more concerted effort to minimize risks.

The researchers tested five strategies: role-playing, misdirecting models, information decomposition, reframing model refusals and incremental escalation. An xAI model, Grok 4.1 Fast Non-Reasoning, performed the worst, with researchers succeeding in 88% of their multi-turn attacks. (They succeeded in 34% of single-turn attacks against the model.)

The best-performing model, Amazon’s Nova 2 Lite, failed to withstand only 8% of multistage attacks, although the researchers said that the figure “still represents meaningful residual risk.”

Conley and Chang noted that Grok 4.1 performed significantly better with reasoning enabled, suggesting that AI vendors should “document the safety-relevant effects” of configuration decisions like reasoning status.

OpenAI, Anthropic, Google, Amazon and xAI did not immediately respond to requests for comment.

Vendors need to rethink how they evaluate AI model safety, the researchers said, and businesses need more information about potential gaps between models’ single-turn and multi-turn attack resilience.

“For business decisions made on the basis of published single-turn scores, this presents security and governance risk,” Conley and Chang wrote. “A model with 2.74% single-turn ASR is not the same product as a model that holds the line at 24.68% multi-turn ASR. Without paired-regime data, the two are indistinguishable on most public evaluations, and the end user never sees the gap.”
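As a rough illustration of what paired-regime data adds, the sketch below computes the gap between single-turn and multi-turn ASR for figures quoted in this article. The helper name `asr_gap` is hypothetical, and treating the 2.74% and 24.68% figures as belonging to a single model is an assumption made for the example.

```python
def asr_gap(single_pct: float, multi_pct: float) -> dict:
    """Summarize the gap between single-turn and multi-turn ASR.

    Returns the difference in percentage points and the multi/single ratio.
    """
    if single_pct <= 0:
        raise ValueError("single-turn ASR must be positive to compute a ratio")
    return {
        "points": multi_pct - single_pct,
        "ratio": multi_pct / single_pct,
    }


# Figures reported for Grok 4.1 Fast Non-Reasoning in the article.
grok = asr_gap(34.0, 88.0)

# Figures from the researchers' quote; pairing them is an assumption.
quoted = asr_gap(2.74, 24.68)

for name, gap in [("Grok 4.1 Fast Non-Reasoning", grok), ("quoted example", quoted)]:
    print(f"{name}: +{gap['points']:.2f} points, {gap['ratio']:.2f}x")
```

Under those assumptions, a model that looks strong on a single-turn score alone can still be roughly nine times as likely to yield under iterative attack, which a published single-turn number would not reveal.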



Copyright © 2025 | Powered By Porpholio
