MAI-2025-0001 | Mend Vulnerability Database

Vulnerability DatabaseMAI-2025-0001

MAI-2025-0001

Published:May 16, 2026

Updated:June 17, 2026

Large Language Model (LLM) agents that utilize external tools are susceptible to indirect prompt injection (IPI) attacks. These attacks involve embedding malicious instructions within the external data accessed by the agent, thereby manipulating its behavior. This vulnerability persists even when defenses against direct prompt injection are implemented. Adaptive attacks, which dynamically alter the injected payload to circumvent specific defense mechanisms, have demonstrated a consistent success rate exceeding 50%, effectively bypassing current security measures. Mitigation steps: **For AI Developers:** * Implement robust input sanitization and validation for all data sources accessed by the agent, ensuring techniques are in place to neutralize potentially malicious instructions. * Employ multi-layered defenses, including detection and input-level modification techniques, to enhance resilience by combining different approaches. * Restrict the capabilities of integrated tools, especially those with access to sensitive data, by limiting their functionality. * Actively monitor the agent's behavior for suspicious activity and implement auditing mechanisms to quickly detect and respond to potential compromises. **For Model Trainers/Fine-tuners:** * Thoroughly test all defense mechanisms against adaptive attacks before deployment, simulating real-world scenarios and considering both direct harm and data exfiltration vectors.

Related Resources (1)

https://arxiv.org/abs/2503.00061

Do you need more information?

CVSS v4

Base Score:

8.8

Attack Vector

NETWORK

Attack Complexity

LOW

Attack Requirements

NONE

Privileges Required

NONE

User Interaction

NONE

Vulnerable System Confidentiality

HIGH

Vulnerable System Integrity

LOW

Vulnerable System Availability

NONE

Subsequent System Confidentiality

LOW

Subsequent System Integrity

NONE

Subsequent System Availability

NONE

CVSS v3

Base Score:

9.3

Attack Vector

NETWORK

Attack Complexity

LOW

Privileges Required

NONE

User Interaction

NONE

Scope

CHANGED

Confidentiality

HIGH

Integrity

LOW

Availability

NONE

AIVSS

Base Score:

6.9