MAI-2025-0014 | Mend Vulnerability Database

Published: May 16, 2026

Updated: June 17, 2026

Large Language Models (LLMs) are susceptible to sophisticated multi-turn jailbreak attacks that exploit their reasoning capabilities. The attack, known as Reasoning-Augmented Conversation Exploit (RACE), transforms harmful queries into seemingly benign reasoning tasks. By leveraging the LLM's advanced reasoning abilities, attackers can ultimately induce the model to generate unsafe content. This method effectively circumvents standard safety mechanisms designed to prevent the creation of harmful responses.

Mitigation steps:

**For AI Developers:**

* Implement a layered security approach with multiple safety checks at different stages of the query processing pipeline
* Strengthen safety mechanisms beyond simple keyword filtering to detect and prevent reasoning-based attacks

**For Model Trainers/Fine-tuners:**

* Develop robust models that can identify and resist manipulation of their reasoning processes
* Develop more sophisticated detection methods to identify reasoning-based attacks, including analysis of information gain during conversation
* Conduct rigorous red-teaming and adversarial testing to identify and address vulnerabilities before deployment
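The layered approach recommended for AI developers can be sketched roughly as follows. This is a minimal illustration and not part of the advisory: the class `LayeredGuard`, the thresholds, and `toy_scorer` are all hypothetical names and values. In practice the scorer would be a trained safety classifier, and the conversation-level stage could use richer signals such as the information-gain analysis mentioned above. Here a simple cumulative risk score stands in for that signal.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical scorer type: maps a piece of text to a risk score in [0, 1].
RiskScorer = Callable[[str], float]


@dataclass
class LayeredGuard:
    """Illustrative three-stage check: per turn, per conversation, per output."""
    score_turn: RiskScorer            # stage 1: each user turn on its own
    score_output: RiskScorer          # stage 3: the model's draft response
    turn_limit: float = 0.8           # illustrative thresholds, not tuned values
    conversation_limit: float = 1.5
    output_limit: float = 0.5
    history: List[float] = field(default_factory=list)

    def check_input(self, user_turn: str) -> bool:
        """Return True if the turn may be passed on to the model."""
        risk = self.score_turn(user_turn)
        self.history.append(risk)
        if risk >= self.turn_limit:
            return False
        # Stage 2: risk accumulated over the whole conversation, so a chain
        # of individually mild turns can still be stopped.
        return sum(self.history) < self.conversation_limit

    def check_output(self, draft: str) -> bool:
        """Return True if the draft response may be returned to the user."""
        return self.score_output(draft) < self.output_limit


def toy_scorer(text: str) -> float:
    """Stand-in for a real classifier: counts placeholder flagged terms."""
    flagged = {"alpha", "beta"}
    words = [w.strip(".,?!") for w in text.lower().split()]
    hits = sum(1 for w in words if w in flagged)
    return min(1.0, hits / 2)


guard = LayeredGuard(score_turn=toy_scorer, score_output=toy_scorer)
turns = ["step one mentions alpha", "step two mentions beta", "step three mentions alpha"]
decisions = [guard.check_input(t) for t in turns]
```

In this toy run each turn scores 0.5, which is below the per-turn limit, but the third turn pushes the cumulative score to the conversation limit and is rejected. That is the property a multi-turn attack such as RACE is meant to evade when only single turns are inspected.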

Related Resources (1)

https://arxiv.org/abs/2502.11054


CVSS v4

Base Score: 6.9

Attack Vector: NETWORK

Attack Complexity: LOW

Attack Requirements: NONE

Privileges Required: NONE

User Interaction: NONE

Vulnerable System Confidentiality: NONE

Vulnerable System Integrity: LOW

Vulnerable System Availability: NONE

Subsequent System Confidentiality: NONE

Subsequent System Integrity: LOW

Subsequent System Availability: NONE

CVSS v3

Base Score: 5.8

Attack Vector: NETWORK

Attack Complexity: LOW

Privileges Required: NONE

User Interaction: NONE

Scope: CHANGED

Confidentiality: NONE

Integrity: LOW

Availability: NONE
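As a check on the CVSS v3 figures above, the base score can be recomputed from the listed metrics. The sketch below assumes the CVSS v3.1 base-score equations and metric weights from the public specification (the page does not say whether v3.0 or v3.1 was used). The function and constant names are illustrative, and only the Scope: CHANGED branch is implemented because that is the value listed here.

```python
import math

# CVSS v3.1 numeric weights for the metric values listed above.
AV_NETWORK = 0.85
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
IMPACT = {"NONE": 0.0, "LOW": 0.22, "HIGH": 0.56}


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_changed(c: str, i: str, a: str) -> float:
    """Base score for AV:N/AC:L/PR:N/UI:N with Scope: CHANGED."""
    iss = 1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a])
    impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV_NETWORK * AC_LOW * PR_NONE * UI_NONE
    if impact <= 0:
        return 0.0
    return roundup(min(1.08 * (impact + exploitability), 10))


print(base_score_scope_changed("NONE", "LOW", "NONE"))  # 5.8
```

With Integrity: LOW as the only impact, the impact sub-score is about 1.44 and the exploitability sub-score about 3.89. Scaling their sum by 1.08 gives roughly 5.75, which rounds up to the listed 5.8.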

AIVSS

Base Score: 4.8