Mend.io Vulnerability Database
MAI-2025-0011
Published: May 16, 2026
Updated: May 16, 2026
Large Language Models (LLMs) are susceptible to jailbreak attacks that exploit their inherent persuasive capabilities. The novel attack framework, CL-GSO, systematically decomposes jailbreak strategies into four distinct components: Role, Content Support, Context, and Communication Skills. This decomposition results in a significantly expanded strategy space compared to previous methodologies. Such an expanded space facilitates the creation of prompts that successfully bypass safety protocols, achieving a success rate of over 90% on models previously deemed resistant, such as Claude-3.5. The vulnerability is rooted in the LLM's reasoning and response generation mechanisms, which can be manipulated through strategically crafted prompts utilizing these four components.

Mitigation steps:

**For AI Developers:**
* Develop robust safety mechanisms resistant to diverse attack strategies, focusing on both content and underlying persuasive intent.
* Employ sophisticated filtering techniques that analyze prompts for underlying persuasive intent, beyond keyword-based or content-based filtering.

**For Model Trainers/Fine-tuners:**
* Expand training data for safety alignment to include a wider variety of adversarial prompts and attack strategies, such as those from the CL-GSO framework.
* Conduct regular security audits of LLMs using diverse and advanced adversarial testing methods to identify and address vulnerabilities.
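As one way to act on the audit recommendation, the results of an adversarial test run can be broken down by the four components named above to see which ones correlate with safety bypasses. The following is a minimal sketch, not part of the advisory or of CL-GSO itself: the `AuditResult` record, the `bypass_rate_by_component` helper, and all component labels in the sample data are hypothetical placeholders, and the `bypassed` flag is assumed to come from a separate review step.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four strategy components named in the advisory.
COMPONENTS = ("role", "content_support", "context", "communication_skill")


@dataclass(frozen=True)
class AuditResult:
    """One adversarial test case and its outcome (hypothetical record layout)."""
    role: str
    content_support: str
    context: str
    communication_skill: str
    bypassed: bool  # True if reviewers judged that the safety policy was bypassed


def bypass_rate_by_component(results):
    """Return {component: {label: fraction of test cases that bypassed safety}}."""
    counts = {c: defaultdict(lambda: [0, 0]) for c in COMPONENTS}
    for r in results:
        for c in COMPONENTS:
            tally = counts[c][getattr(r, c)]
            tally[0] += int(r.bypassed)  # number of bypasses for this label
            tally[1] += 1                # number of test cases for this label
    return {
        c: {label: hits / total for label, (hits, total) in labels.items()}
        for c, labels in counts.items()
    }


# Illustrative data only; the labels are placeholders, not CL-GSO categories.
sample = [
    AuditResult("expert", "citation", "research", "formal", True),
    AuditResult("expert", "none", "research", "casual", False),
    AuditResult("peer", "citation", "fiction", "formal", True),
    AuditResult("peer", "none", "fiction", "casual", False),
]
rates = bypass_rate_by_component(sample)
```

A label with a high bypass rate, such as the placeholder `citation` label in this sample, would point to where additional safety training data or filtering is most needed.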
Related Resources (1)
CVSS v4
Base Score: 6.9
Attack Vector: NETWORK
Attack Complexity: LOW
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: LOW
Vulnerable System Integrity: LOW
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: LOW
Subsequent System Availability: NONE
CVSS v3
Base Score: 7.2
Attack Vector: NETWORK
Attack Complexity: LOW
Privileges Required: NONE
User Interaction: NONE
Scope: CHANGED
Confidentiality: LOW
Integrity: LOW
Availability: NONE
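The CVSS v3 base score can be reproduced from the metric values listed above. The sketch below assumes the CVSS v3.1 base-score equations and metric weights (the page only says "v3"); the function and variable names are illustrative and are not part of any Mend.io tooling.

```python
import math

# Metric weights from the CVSS v3.1 specification (assumed version).
AV = {"NETWORK": 0.85, "ADJACENT": 0.62, "LOCAL": 0.55, "PHYSICAL": 0.2}
AC = {"LOW": 0.77, "HIGH": 0.44}
UI = {"NONE": 0.85, "REQUIRED": 0.62}
CIA = {"HIGH": 0.56, "LOW": 0.22, "NONE": 0.0}
# Privileges Required weights depend on whether Scope is CHANGED.
PR = {
    False: {"NONE": 0.85, "LOW": 0.62, "HIGH": 0.27},
    True: {"NONE": 0.85, "LOW": 0.68, "HIGH": 0.5},
}


def roundup(x):
    """Round up to one decimal place, as defined in the v3.1 specification."""
    i = int(round(x * 100000))
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, scope_changed, c, i, a):
    """Compute the CVSS v3.1 base score from metric names."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[scope_changed][pr] * UI[ui]
    if impact <= 0:
        return 0.0
    if scope_changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


# Metrics as listed in this advisory; Scope is CHANGED.
score = base_score("NETWORK", "LOW", "NONE", "NONE", True, "LOW", "LOW", "NONE")
print(score)  # 7.2
```

Under these assumptions the result agrees with the published base score of 7.2.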
AIVSS
Base Score: 4.6