MAI-2024-0064 | Mend Vulnerability Database


MAI-2024-0064

Published: May 16, 2026

Updated: June 17, 2026

Large Language Models (LLMs) are susceptible to a form of attack known as "editing attacks," wherein adversaries manipulate the model's knowledge base to introduce misinformation or bias. These attacks exploit existing knowledge editing techniques to subtly modify the model's internal representations, resulting in outputs that reflect the injected content, even when responding to unrelated prompts. Such attacks can be highly covert, causing minimal disruption to the model's overall performance across other functionalities.

Mitigation steps:

**For AI Developers:**

* Implement robust detection mechanisms to identify compromised LLMs by comparing outputs across multiple instances and analyzing internal weights for anomalies.
* Establish stricter validation and verification procedures for LLM deployments to ensure integrity and security.
* Develop techniques to roll back or repair models that have been tampered with, ensuring system resilience and recovery.

**For Model Trainers/Fine-tuners:**

* Enhance LLMs' resistance to manipulation through improved model architectures and advanced training methods.
* Increase public awareness of vulnerabilities related to LLMs to promote informed usage and proactive security measures.
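The first two mitigations for AI developers (comparing outputs across instances and checking weights before deployment) could be sketched roughly as follows. This is an illustrative sketch only, not a method taken from the advisory or the referenced paper: the function names, the toy "weights" stored as raw bytes, and the stand-in model functions are all hypothetical. It assumes a trusted copy of the model (or of its weight digests) is available, and that both instances decode deterministically so that answers can be compared as strings.

```python
import hashlib
from typing import Callable, Dict, List


def fingerprint(weights: Dict[str, bytes]) -> Dict[str, str]:
    """Map each tensor name to the SHA-256 digest of its raw bytes."""
    return {name: hashlib.sha256(blob).hexdigest() for name, blob in weights.items()}


def changed_tensors(trusted: Dict[str, str], deployed: Dict[str, str]) -> List[str]:
    """Return tensor names whose digest differs or that exist on one side only."""
    names = set(trusted) | set(deployed)
    return sorted(n for n in names if trusted.get(n) != deployed.get(n))


def disagreeing_probes(
    reference: Callable[[str], str],
    candidate: Callable[[str], str],
    probes: List[str],
) -> List[str]:
    """Return the probe prompts on which the two model instances answer differently."""
    return [
        p for p in probes
        if reference(p).strip().lower() != candidate(p).strip().lower()
    ]


# Toy data (hypothetical): one tensor differs by a single byte.
trusted_weights = {"layer0.mlp": b"\x00\x01\x02", "layer1.mlp": b"\x03\x04\x05"}
deployed_weights = {"layer0.mlp": b"\x00\x01\x02", "layer1.mlp": b"\x03\x04\x06"}
tampered = changed_tensors(fingerprint(trusted_weights), fingerprint(deployed_weights))


# Stand-ins for two model instances; a real check would call actual models.
def reference_model(prompt: str) -> str:
    return {"Capital of France?": "Paris"}.get(prompt, "unknown")


def edited_model(prompt: str) -> str:
    return "Lyon" if prompt == "Capital of France?" else "unknown"


flagged = disagreeing_probes(
    reference_model, edited_model, ["Capital of France?", "2 + 2?"]
)

print(tampered)  # ['layer1.mlp']
print(flagged)   # ['Capital of France?']
```

A digest mismatch only shows that the weights differ from the trusted copy; legitimate fine-tuning changes digests too, so a check like this is useful mainly as a deployment gate for models that are supposed to be unmodified. The probe comparison can only catch edits that a chosen probe happens to touch.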

Related Resources (1)

https://arxiv.org/abs/2407.20224


CVSS v4

Base Score: 1.8

Attack Vector: LOCAL
Attack Complexity: HIGH
Attack Requirements: NONE
Privileges Required: HIGH
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: LOW
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: LOW
Subsequent System Availability: NONE

CVSS v3

Base Score: 2.5

Attack Vector: LOCAL
Attack Complexity: HIGH
Privileges Required: HIGH
User Interaction: NONE
Scope: CHANGED
Confidentiality: NONE
Integrity: LOW
Availability: NONE
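The CVSS v3 base score of 2.5 can be reproduced from the metric values listed above. The sketch below assumes the CVSS v3.1 base-score equations and metric weights as published by FIRST; the advisory only says "CVSS v3", so the exact minor version is an assumption. The function and dictionary names are illustrative.

```python
# Metric weights from the CVSS v3.1 specification.
AV = {"NETWORK": 0.85, "ADJACENT": 0.62, "LOCAL": 0.55, "PHYSICAL": 0.2}
AC = {"LOW": 0.77, "HIGH": 0.44}
UI = {"NONE": 0.85, "REQUIRED": 0.62}
CIA = {"HIGH": 0.56, "LOW": 0.22, "NONE": 0.0}
# Privileges Required is weighted differently when Scope is CHANGED.
PR_UNCHANGED = {"NONE": 0.85, "LOW": 0.62, "HIGH": 0.27}
PR_CHANGED = {"NONE": 0.85, "LOW": 0.68, "HIGH": 0.5}


def roundup(value: float) -> float:
    """Round up to one decimal place, using the integer method from v3.1."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score(av, ac, pr, ui, scope, conf, integ, avail) -> float:
    """Compute the CVSS v3.1 base score from metric value names."""
    changed = scope == "CHANGED"
    iss = 1 - (1 - CIA[conf]) * (1 - CIA[integ]) * (1 - CIA[avail])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    pr_weight = (PR_CHANGED if changed else PR_UNCHANGED)[pr]
    exploitability = 8.22 * AV[av] * AC[ac] * pr_weight * UI[ui]
    if changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


# Metric values exactly as listed in this advisory.
score = base_score(
    av="LOCAL", ac="HIGH", pr="HIGH", ui="NONE",
    scope="CHANGED", conf="NONE", integ="LOW", avail="NONE",
)
print(score)  # 2.5
```

With these inputs the impact sub-score is about 1.44 and the exploitability sub-score about 0.85; their sum scaled by 1.08 (because Scope is CHANGED) is about 2.46, which rounds up to 2.5.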

AIVSS

Base Score: 2.1