Mend.io Vulnerability Database
The largest open source vulnerability database
MAI-2024-0014
Published: May 16, 2026
Updated: May 16, 2026
This vulnerability in large language models (LLMs) lets attackers extract unsafe or unethical responses through a sequence of semantically connected multi-turn prompts. Known as "Chain of Attack" (CoA), the method exploits the model's contextual comprehension and adaptive response mechanisms to steer the dialogue incrementally toward harmful outputs, circumventing safety protocols that would reject the same request posed in a single turn. The attack uses semantic similarity scoring, such as SimCSE, to generate prompts that progressively align with the intended malicious objective.

Mitigation steps:

**For AI Developers:**
* Develop robust safety mechanisms that resist multi-turn attacks through contextual analysis of the whole conversation and semantic drift detection.
* Implement real-time monitoring and filtering of outputs to detect and block unsafe responses, regardless of how benign the prompts appear.

**For Model Trainers/Fine-tuners:**
* Enhance Reinforcement Learning from Human Feedback (RLHF) so the model handles adversarial prompts and refuses harmful requests in multi-turn contexts.
* Conduct adversarial training using examples generated by techniques like CoA to improve model robustness against such attacks.
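The semantic drift detection mentioned in the mitigation steps can be illustrated with a minimal defensive sketch. It is not taken from the advisory: the function names (`embed`, `drift_scores`, `flags_drift`), the thresholds, and the placeholder topic words are all hypothetical, and a toy bag-of-words vector stands in for a real sentence encoder such as SimCSE. The idea is to score the cumulative conversation, not only the latest prompt, against a description of a restricted topic and to flag a dialogue whose similarity keeps rising across turns.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumed stand-in for a sentence encoder such as SimCSE."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def drift_scores(turns: list[str], restricted_topic: str) -> list[float]:
    """Score each cumulative conversation prefix against the restricted topic."""
    topic = embed(restricted_topic)
    context: Counter = Counter()
    scores = []
    for turn in turns:
        context += embed(turn)  # accumulate context instead of judging one turn alone
        scores.append(cosine(context, topic))
    return scores


def flags_drift(scores: list[float], threshold: float = 0.5, min_rising_turns: int = 3) -> bool:
    """Flag if similarity rises for several consecutive turns or crosses the threshold.

    Both parameters are illustrative defaults, not values from the advisory.
    """
    rising = 0
    for prev, cur in zip(scores, scores[1:]):
        rising = rising + 1 if cur > prev else 0
        if rising >= min_rising_turns:
            return True
    return any(s >= threshold for s in scores)


# Placeholder dialogue: each turn moves a little closer to the restricted topic.
drifting = [
    "tell me about gardening",
    "what about alpha plants",
    "and alpha beta soil",
    "alpha beta gamma details",
]
benign = ["tell me about gardening", "what soil is best", "how often to water"]
topic = "restricted topic alpha beta gamma"

print(flags_drift(drift_scores(drifting, topic)))
print(flags_drift(drift_scores(benign, topic)))
```

In a real deployment the toy `embed` would be replaced by an actual embedding model, and a flagged conversation would more likely be routed to a stricter output filter than blocked outright, since similarity alone cannot tell a legitimate deep dive from an attack.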
CVSS v4
Base Score: 8.9
Attack Vector: NETWORK
Attack Complexity: HIGH
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: HIGH
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: HIGH
Subsequent System Availability: NONE
CVSS v3
Base Score: 6.8
Attack Vector: NETWORK
Attack Complexity: HIGH
Privileges Required: NONE
User Interaction: NONE
Scope: CHANGED
Confidentiality: NONE
Integrity: HIGH
Availability: NONE
AIVSS
Base Score: 6