MAI-2025-0019 | Mend Vulnerability Database

MAI-2025-0019

Published: May 16, 2026

Updated: June 17, 2026

This vulnerability affects several open-source Large Language Models (LLMs), which are susceptible to adversarial attacks based on exponentiated gradient descent. An attacker can craft adversarial prompts that manipulate a model into generating harmful or unintended outputs, circumventing the safety alignment mechanisms designed to prevent them. The attack optimizes a continuous, relaxed one-hot encoding of the input tokens. This approach inherently satisfies the constraints on the encoding, eliminating the need for the projection techniques that were common in prior methods.

Mitigation steps:

**For AI Developers:**

* Restrict access to model weights to mitigate white-box attack risks.
* Implement input sanitization and filtering to reduce the impact of potential attacks.

**For Model Trainers/Fine-tuners:**

* Explore improved regularization techniques and stronger safety training to enhance model robustness.
* Conduct further research into optimization-based attacks to identify more effective mitigation strategies.
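To make the optimization idea in the description concrete, the following is a minimal sketch of a generic exponentiated gradient descent update on a relaxed one-hot vector. It is not the attack from the referenced paper: the linear toy loss, the `costs` values, the step size, and all function names are assumptions made for illustration, and no language model is involved. The point it shows is that the multiplicative update followed by normalization keeps the vector non-negative and summing to 1, so no separate projection step is needed.

```python
import math


def egd_step(x, grad, lr):
    """One exponentiated gradient descent step on the probability simplex.

    x    : current relaxed one-hot vector (non-negative, sums to 1)
    grad : gradient of the loss with respect to x
    lr   : step size (assumed hyperparameter)
    """
    # Multiplicative update: x_i * exp(-lr * grad_i).
    # Shifting the exponents by their maximum avoids overflow and
    # cancels out in the normalization below.
    exponents = [-lr * g for g in grad]
    shift = max(exponents)
    unnormalized = [xi * math.exp(e - shift) for xi, e in zip(x, exponents)]
    total = sum(unnormalized)
    # Renormalizing keeps the vector on the simplex without a separate projection.
    return [u / total for u in unnormalized]


def toy_gradient(x, costs):
    """Gradient of the hypothetical linear loss L(x) = sum(costs[i] * x[i])."""
    return list(costs)


def optimize_token(costs, steps=200, lr=0.5):
    """Relax one token position over len(costs) candidates and minimize the toy loss."""
    n = len(costs)
    x = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(steps):
        x = egd_step(x, toy_gradient(x, costs), lr)
    return x


def discretize(x):
    """Map the relaxed vector back to a single token index (argmax)."""
    return max(range(len(x)), key=lambda i: x[i])
```

Calling `optimize_token([0.9, 0.1, 0.5, 0.7])` concentrates the mass on the lowest-cost entry, and `discretize` then recovers its index. In a white-box setting the gradient would come from the model's loss rather than a fixed cost vector, which is why the mitigation advice above starts with restricting access to model weights.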

Related Resources (1)

https://arxiv.org/abs/2505.09820

CVSS v4

Base Score: 6.3

Attack Vector: NETWORK
Attack Complexity: HIGH
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: LOW
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: LOW
Subsequent System Availability: NONE

CVSS v3

Base Score:

Attack Vector: NETWORK
Attack Complexity: HIGH
Privileges Required: NONE
User Interaction: NONE
Scope: CHANGED
Confidentiality: NONE
Integrity: LOW
Availability: NONE

AIVSS

Base Score: