MAI-2024-0047 | Mend Vulnerability Database

Published: May 20, 2026

Updated: May 20, 2026

Large Language Models (LLMs) are susceptible to optimization-based jailbreaking attacks that exploit index gradients during the iterative generation of adversarial suffixes. The vulnerability arises from the inefficient exploration of the token space in existing methods, such as Greedy Coordinate Gradient (GCG), which indiscriminately samples tokens for replacement without considering gradient values. This approach results in redundant computations and a sluggish optimization process, thereby compromising the model's ability to maintain secure outputs.

Mitigation steps:

**For AI Developers:**

* Develop and implement robust safety mechanisms that are less susceptible to gradient-based attacks, potentially involving techniques beyond simple gradient-based filtering.
* Conduct periodic security audits and red-teaming exercises to identify and address potential vulnerabilities in deployed LLMs.

**For Model Trainers/Fine-tuners:**

* Prioritize token replacement based on gradient values, focusing on tokens with positive gradients to reduce computational overhead.
* Implement strategies to simultaneously update multiple tokens in each iteration, accelerating the optimization process.
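The last two points above (selecting positions by gradient sign and updating several positions per iteration) can be illustrated on a toy problem. The sketch below is not the method from the referenced paper and involves no language model: it assumes a hypothetical separable loss stored in a random score table, where the gradient with respect to the one-hot entry of a position's current token is simply that token's score. All names (`toy_loss`, `current_token_gradients`, `gradient_guided_step`) are illustrative assumptions.

```python
import random

def toy_loss(scores, suffix):
    """Separable stand-in for a model loss: sum of per-position token scores."""
    return sum(scores[pos][tok] for pos, tok in enumerate(suffix))

def current_token_gradients(scores, suffix):
    """Gradient of toy_loss w.r.t. the one-hot entry of each current token.

    The toy loss is linear in the one-hot encoding, so this is just the
    score of the token currently sitting at each position (an assumption
    of this sketch; a real model loss is not linear).
    """
    return [scores[pos][tok] for pos, tok in enumerate(suffix)]

def gradient_guided_step(scores, suffix):
    """Update every position whose current token has a positive gradient."""
    grads = current_token_gradients(scores, suffix)
    updated = list(suffix)
    for pos, grad in enumerate(grads):
        if grad > 0:
            # Swap in the token with the lowest (most negative) gradient here.
            updated[pos] = min(range(len(scores[pos])),
                               key=lambda tok: scores[pos][tok])
    return updated

# Hypothetical toy setup: 8 positions, vocabulary of 6 token indices.
rng = random.Random(0)
vocab_size, suffix_len = 6, 8
scores = [[rng.uniform(-1.0, 1.0) for _ in range(vocab_size)]
          for _ in range(suffix_len)]
suffix = [rng.randrange(vocab_size) for _ in range(suffix_len)]

stepped = gradient_guided_step(scores, suffix)
print(toy_loss(scores, suffix), "->", toy_loss(scores, stepped))
```

On this linear toy loss a single step can never increase the loss, because each replaced token scores no higher than the one it replaces. With a real, non-linear model loss that guarantee does not hold, so candidate updates would need to be re-evaluated before being accepted.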

Related Resources (1)

https://arxiv.org/abs/2412.08615

CVSS v4

Base Score: 6.3

* Attack Vector: NETWORK
* Attack Complexity: HIGH
* Attack Requirements: NONE
* Privileges Required: NONE
* User Interaction: NONE
* Vulnerable System Confidentiality: NONE
* Vulnerable System Integrity: LOW
* Vulnerable System Availability: NONE
* Subsequent System Confidentiality: NONE
* Subsequent System Integrity: LOW
* Subsequent System Availability: NONE
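The entry lists the CVSS v4 base metrics individually but does not give a vector string. As a minimal sketch, the snippet below re-encodes the metric values listed above into the standard CVSS v4.0 base vector notation. The helper name `cvss4_vector` and the `entry` dictionary are hypothetical; the sketch relies on every CVSS v4.0 base metric value abbreviating to its first letter, and it does not compute or verify the base score.

```python
# Base metric names and abbreviations in CVSS v4.0 vector order.
CVSS4_BASE_METRICS = [
    ("Attack Vector", "AV"),
    ("Attack Complexity", "AC"),
    ("Attack Requirements", "AT"),
    ("Privileges Required", "PR"),
    ("User Interaction", "UI"),
    ("Vulnerable System Confidentiality", "VC"),
    ("Vulnerable System Integrity", "VI"),
    ("Vulnerable System Availability", "VA"),
    ("Subsequent System Confidentiality", "SC"),
    ("Subsequent System Integrity", "SI"),
    ("Subsequent System Availability", "SA"),
]

def cvss4_vector(metrics):
    """Build a CVSS v4.0 base vector string from a {metric name: value} dict."""
    parts = ["CVSS:4.0"]
    for name, abbr in CVSS4_BASE_METRICS:
        # Base metric values (NETWORK, HIGH, LOW, NONE, ...) abbreviate
        # to their first letter.
        parts.append(f"{abbr}:{metrics[name][0].upper()}")
    return "/".join(parts)

# Metric values copied from this entry's CVSS v4 section.
entry = {
    "Attack Vector": "NETWORK",
    "Attack Complexity": "HIGH",
    "Attack Requirements": "NONE",
    "Privileges Required": "NONE",
    "User Interaction": "NONE",
    "Vulnerable System Confidentiality": "NONE",
    "Vulnerable System Integrity": "LOW",
    "Vulnerable System Availability": "NONE",
    "Subsequent System Confidentiality": "NONE",
    "Subsequent System Integrity": "LOW",
    "Subsequent System Availability": "NONE",
}

print(cvss4_vector(entry))
# prints CVSS:4.0/AV:N/AC:H/AT:N/PR:N/UI:N/VC:N/VI:L/VA:N/SC:N/SI:L/SA:N
```

The resulting string can be pasted into a CVSS v4.0 calculator to inspect the rating.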

CVSS v3

Base Score:

* Attack Vector: NETWORK
* Attack Complexity: HIGH
* Privileges Required: NONE
* User Interaction: NONE
* Scope: CHANGED
* Confidentiality: NONE
* Integrity: LOW
* Availability: NONE

AIVSS

Base Score: 3.8