MAI-2024-0027 | Mend Vulnerability Database


MAI-2024-0027

Published: May 16, 2026

Updated: June 17, 2026

Large Language Models (LLMs) that rely on gradient-ascent-based unlearning techniques are susceptible to a Dynamic Unlearning Attack (DUA). The attack appends strategically crafted adversarial suffixes to input prompts, which leads the model to reproduce knowledge that the unlearning step was meant to remove. The attack does not require access to the model's parameters, so attackers can retrieve sensitive information that was marked for removal.

Mitigation steps:

**For AI Developers:**

* Develop and deploy robust detection mechanisms to identify and filter malicious prompts that attempt to recover unlearned knowledge
* Monitor model behavior for unexpected outputs related to unlearned topics

**For Model Trainers/Fine-tuners:**

* Implement the Latent Adversarial Unlearning (LAU) framework to make the unlearning process more robust
* Integrate techniques such as adversarial training during the unlearning phase to make the model more resistant to adversarial queries
* Regularly update and retrain LLMs using improved unlearning methods to minimize vulnerabilities
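The output-monitoring mitigation can be illustrated with a minimal sketch. It assumes the operator keeps a list of terms tied to the unlearned topics; the term list, the `LeakReport` type, and both function names are hypothetical and do not come from the advisory or the referenced paper. Plain keyword matching is only a first-line signal and will miss paraphrased leaks.

```python
import re
from dataclasses import dataclass


@dataclass
class LeakReport:
    leaked: bool
    matched_terms: list


def build_pattern(terms):
    """Compile one case-insensitive pattern that matches any listed term."""
    # Longer terms first so overlapping terms prefer the longest match.
    escaped = sorted((re.escape(t) for t in terms), key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(escaped) + r")\b", re.IGNORECASE)


def check_output(text, pattern):
    """Flag a model response that mentions any term tied to unlearned topics."""
    matches = sorted({m.group(0).lower() for m in pattern.finditer(text)})
    return LeakReport(leaked=bool(matches), matched_terms=matches)


# Hypothetical terms associated with knowledge that was supposed to be unlearned.
UNLEARNED_TERMS = ["project aurora", "jane doe"]
PATTERN = build_pattern(UNLEARNED_TERMS)

report = check_output("Here are the details of Project Aurora.", PATTERN)
if report.leaked:
    print("possible unlearning bypass:", report.matched_terms)
```

In practice a check like this would run on every response before it is returned, with flagged responses logged alongside the triggering prompt so that suspicious suffixes can be reviewed.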

Related Resources (1)

https://arxiv.org/abs/2408.10682


CVSS v4

Base Score: 8.7

Attack Vector: NETWORK

Attack Complexity: LOW

Attack Requirements: NONE

Privileges Required: NONE

User Interaction: NONE

Vulnerable System Confidentiality: HIGH

Vulnerable System Integrity: NONE

Vulnerable System Availability: NONE

Subsequent System Confidentiality: NONE

Subsequent System Integrity: NONE

Subsequent System Availability: NONE

CVSS v3

Base Score: 7.5

Attack Vector: NETWORK

Attack Complexity: LOW

Privileges Required: NONE

User Interaction: NONE

Scope: UNCHANGED

Confidentiality: HIGH

Integrity: NONE

Availability: NONE
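The CVSS v3 base score can be reproduced from the listed metrics. The sketch below assumes the CVSS v3.1 base-score equations and metric weights for the Scope: UNCHANGED case; the page says only "CVSS v3", so the exact minor version is an assumption. The function and constant names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (only the values needed here).
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
CIA = {"HIGH": 0.56, "LOW": 0.22, "NONE": 0.0}


def roundup(value):
    """Round up to one decimal place, using the integer method from CVSS v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """Base score for a vector whose Scope metric is UNCHANGED."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)      # impact sub-score
    impact = 6.42 * iss
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    AV_NETWORK, AC_LOW, PR_NONE, UI_NONE,
    CIA["HIGH"], CIA["NONE"], CIA["NONE"],
)
print(score)
```

With these inputs the impact term is about 3.60 and the exploitability term about 3.89; their sum, about 7.48, rounds up to the listed base score of 7.5.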

AIVSS

Base Score: 5.2