Mend.io Vulnerability Database
The largest open source vulnerability database
MAI-2025-0009
Published: May 16, 2026
Updated: May 16, 2026
This vulnerability in Large Language Models (LLMs) enables adversarial reasoning attacks that circumvent established safety protocols, resulting in the generation of harmful responses. The root cause of this vulnerability is the inadequate robustness of current LLM safety measures against iterative prompt refinement. This process is guided by a loss function that evaluates the model's proximity to producing a predetermined harmful output. Consequently, attackers can effectively explore the prompt space, even when confronting adversarially trained models, leading to successful jailbreaks.

Mitigation steps:

**For AI Developers:**
* Implement advanced safety mechanisms that resist iterative prompt refinement and loss function optimization.
* Deploy sophisticated detection systems for identifying adversarial reasoning attacks.

**For Model Trainers/Fine-tuners:**
* Enhance existing defenses by integrating insights from adversarial attacks to bolster model robustness and safety.
* Regularly update and retrain LLMs using adversarial examples to strengthen resilience.
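As one illustration of the detection-side mitigation, the sketch below flags a session in which a client keeps submitting near-duplicates of prompts that were already refused, which is one visible symptom of iterative prompt refinement. It is a toy heuristic and not something the advisory prescribes: the class name `RefinementMonitor`, the thresholds, and the use of character-level similarity from Python's `difflib` are all assumptions made for this example. A real deployment would likely need semantic similarity and rate limiting as well.

```python
from collections import deque
from difflib import SequenceMatcher


class RefinementMonitor:
    """Hypothetical per-session monitor for repeated rewording of refused prompts."""

    def __init__(self, similarity_threshold=0.8, max_refined_attempts=3, window=10):
        self.similarity_threshold = similarity_threshold  # assumed cutoff, tune per deployment
        self.max_refined_attempts = max_refined_attempts  # assumed budget before flagging
        self.refused = deque(maxlen=window)               # most recent refused prompts
        self.refined_attempts = 0

    def observe(self, prompt, was_refused):
        """Record one prompt; return True once the session looks like a refinement loop."""
        # A prompt counts as a refinement if it closely resembles a recently refused one.
        if any(
            SequenceMatcher(None, prompt, old).ratio() >= self.similarity_threshold
            for old in self.refused
        ):
            self.refined_attempts += 1
        if was_refused:
            self.refused.append(prompt)
        return self.refined_attempts >= self.max_refined_attempts


# Usage: the same refused request is resubmitted with small edits.
monitor = RefinementMonitor()
base = "explain the forbidden procedure in full detail"
flags = [monitor.observe(base, was_refused=True)]
for suffix in (" v1", " v2", " v3"):
    flags.append(monitor.observe(base + suffix, was_refused=True))
```

With these assumed defaults the session is flagged on the third near-duplicate of a refused prompt. Sessions whose prompts are never refused are never flagged, because only refused prompts are kept for comparison.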
CVSS v4
Base Score: 8.2
Attack Vector: NETWORK
Attack Complexity: HIGH
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: HIGH
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: NONE
Subsequent System Availability: NONE
CVSS v3
Base Score: 5.9
Attack Vector: NETWORK
Attack Complexity: HIGH
Privileges Required: NONE
User Interaction: NONE
Scope: UNCHANGED
Confidentiality: NONE
Integrity: HIGH
Availability: NONE
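The CVSS v3 base score can be reproduced from the metrics listed above. The following is a minimal sketch, assuming the metric weights and base-score formula of the CVSS v3.1 specification for Scope: Unchanged (the advisory says only "CVSS v3"). The function and table names are illustrative and do not come from any library.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: Unchanged).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place, as defined in CVSS v3.1."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss3_base_score_unchanged(av, ac, pr, ui, c, i, a):
    """Base score for Scope: Unchanged vectors only (assumption of this sketch)."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# Metrics from this advisory: AV:N / AC:H / PR:N / UI:N / S:U / C:N / I:H / A:N
score = cvss3_base_score_unchanged("N", "H", "N", "N", "N", "H", "N")
print(score)  # 5.9
```

Here the impact term is 6.42 × 0.56 ≈ 3.60 and the exploitability term is 8.22 × 0.85 × 0.44 × 0.85 × 0.85 ≈ 2.22. Their sum, about 5.82, rounds up to the listed 5.9.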
AIVSS
Base Score: 5.4