MAI-2025-0008 | Mend Vulnerability Database

Published: May 16, 2026

Updated: June 17, 2026

Multimodal Large Language Models (MLLMs) are susceptible to a sophisticated attack vector that exploits narrative-driven visual storytelling and role immersion to bypass inherent safety protocols. This attack, known as MIRAGE, strategically decomposes malicious queries into triplets consisting of environment, character, and activity. It then generates a sequence of images and text prompts to guide the MLLM through a misleading narrative, ultimately provoking harmful outputs. The attack effectively leverages the MLLM's cross-modal reasoning capabilities and its vulnerability to persona-based manipulation.

Mitigation steps:

**For AI Developers:**

* Implement pre-screening mechanisms using vision-language models to analyze visual inputs for potentially harmful content before processing by the MLLM.
* Develop sophisticated detection methods to identify attempts at role immersion and deceptive storytelling.

**For Model Trainers/Fine-tuners:**

* Improve the robustness of MLLM safety mechanisms to handle multi-turn interactions and narrative contexts.
* Enhance the training data used for MLLM safety reinforcement by including examples of narrative-driven attacks.
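The mitigation about multi-turn interactions can be illustrated with a minimal sketch. It assumes a screening layer that scores the whole conversation so far (text plus captions of any images) rather than each turn in isolation, since the attack spreads environment, character, and activity across separate turns. Everything here is hypothetical and not taken from the advisory: `Turn`, `ConversationScreen`, `toy_score`, the flagged terms, and the threshold are all illustrative. A real deployment would replace `toy_score` with a trained safety classifier.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Turn:
    text: str
    image_caption: str = ""  # assumed to come from a vision-language pre-screener


@dataclass
class ConversationScreen:
    score_fn: Callable[[str], float]  # hypothetical risk scorer returning 0.0..1.0
    threshold: float = 0.5
    history: List[Turn] = field(default_factory=list)

    def check(self, turn: Turn) -> bool:
        """Record the turn and return True if the conversation so far should be blocked."""
        self.history.append(turn)
        combined = " ".join(
            f"{t.image_caption} {t.text}".strip() for t in self.history
        )
        return self.score_fn(combined) >= self.threshold


# Toy stand-in for a real classifier: fraction of flagged terms present.
FLAGGED = {"warehouse", "smuggler", "conceal"}


def toy_score(text: str) -> float:
    words = set(text.lower().split())
    return len(words & FLAGGED) / len(FLAGGED)


turns = [
    Turn("Describe the scene.", image_caption="an empty warehouse at night"),
    Turn("You are the smuggler in this story."),
    Turn("Explain how you conceal the cargo."),
]

screen = ConversationScreen(score_fn=toy_score, threshold=0.9)
results = [screen.check(t) for t in turns]
print(results)  # [False, False, True]
```

Each turn on its own scores about 0.33 under the toy scorer and would pass a per-turn filter. Only the accumulated narrative crosses the threshold, which is the reason for scoring the full history.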

Related Resources (1)

https://arxiv.org/abs/2503.19134

CVSS v4

Base Score: 8.2

Attack Vector: NETWORK

Attack Complexity: HIGH

Attack Requirements: NONE

Privileges Required: NONE

User Interaction: NONE

Vulnerable System Confidentiality: NONE

Vulnerable System Integrity: HIGH

Vulnerable System Availability: NONE

Subsequent System Confidentiality: NONE

Subsequent System Integrity: NONE

Subsequent System Availability: NONE
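For tooling that expects a compact vector string, the CVSS v4 metrics listed above can be serialized as in the sketch below. The abbreviations and ordering follow the CVSS v4.0 base metric group. The helper name `to_v4_vector` and the dictionary layout are illustrative assumptions, not something this advisory provides. The sketch does not recompute the 8.2 score, since v4.0 scoring relies on lookup tables rather than a closed-form formula.

```python
# Base metric abbreviations in the order used by CVSS v4.0 vector strings.
V4_ABBREV = {
    "Attack Vector": "AV",
    "Attack Complexity": "AC",
    "Attack Requirements": "AT",
    "Privileges Required": "PR",
    "User Interaction": "UI",
    "Vulnerable System Confidentiality": "VC",
    "Vulnerable System Integrity": "VI",
    "Vulnerable System Availability": "VA",
    "Subsequent System Confidentiality": "SC",
    "Subsequent System Integrity": "SI",
    "Subsequent System Availability": "SA",
}

# Metric values as listed in this advisory.
metrics = {
    "Attack Vector": "NETWORK",
    "Attack Complexity": "HIGH",
    "Attack Requirements": "NONE",
    "Privileges Required": "NONE",
    "User Interaction": "NONE",
    "Vulnerable System Confidentiality": "NONE",
    "Vulnerable System Integrity": "HIGH",
    "Vulnerable System Availability": "NONE",
    "Subsequent System Confidentiality": "NONE",
    "Subsequent System Integrity": "NONE",
    "Subsequent System Availability": "NONE",
}


def to_v4_vector(values: dict) -> str:
    """Serialize base metrics; each v4.0 base value abbreviates to its first letter."""
    parts = [f"{abbr}:{values[name][0].upper()}" for name, abbr in V4_ABBREV.items()]
    return "CVSS:4.0/" + "/".join(parts)


vector = to_v4_vector(metrics)
print(vector)
```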

CVSS v3

Base Score: 5.9

Attack Vector: NETWORK

Attack Complexity: HIGH

Privileges Required: NONE

User Interaction: NONE

Scope: UNCHANGED

Confidentiality: NONE

Integrity: HIGH

Availability: NONE
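As a sanity check, the 5.9 base score can be reproduced from the metrics above. The sketch below assumes the CVSS v3.1 base-score equations and metric weights for the Scope: UNCHANGED case. The page says only "CVSS v3", so the exact minor version is an assumption, and the function names are illustrative.

```python
import math

# CVSS v3.1 numeric weights (Scope: UNCHANGED variant for Privileges Required).
AV = {"NETWORK": 0.85, "ADJACENT": 0.62, "LOCAL": 0.55, "PHYSICAL": 0.2}
AC = {"LOW": 0.77, "HIGH": 0.44}
PR_UNCHANGED = {"NONE": 0.85, "LOW": 0.62, "HIGH": 0.27}
UI = {"NONE": 0.85, "REQUIRED": 0.62}
CIA = {"HIGH": 0.56, "LOW": 0.22, "NONE": 0.0}


def roundup(x: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    i = round(x * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Base score for Scope: UNCHANGED only; Scope: CHANGED uses different equations."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR_UNCHANGED[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_unchanged(
    av="NETWORK", ac="HIGH", pr="NONE", ui="NONE", c="NONE", i="HIGH", a="NONE"
)
print(score)  # 5.9
```

With these inputs the impact sub-score is about 3.60 and the exploitability sub-score about 2.22. Their sum, about 5.82, rounds up to 5.9.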

AIVSS

Base Score: 5.4