MAI-2024-0024 | Mend Vulnerability Database


MAI-2024-0024

Published: May 16, 2026

Updated: June 17, 2026

Large Language Models (LLMs) are susceptible to a sophisticated attack known as "bijection learning," which exploits in-context learning to teach the model a custom string-to-string encoding. This method effectively bypasses the model's inherent safety mechanisms by encoding malicious queries, transmitting them to the model, and subsequently decoding the responses. The attack's complexity can be adjusted to suit different LLMs, with more advanced models being more vulnerable to intricate encoding schemes.

Mitigation steps:

**For AI Developers:**

* Implement advanced input/output filtering systems to detect and block encoded malicious prompts and responses, even if they are not explicitly flagged as harmful.
* Develop mechanisms to identify and mitigate computational overload caused by complex encoding processing, potentially through resource limiting or computational complexity analysis.

**For Model Trainers/Fine-tuners:**

* Enhance LLM safety mechanisms to ensure robustness against in-context learning of arbitrary encodings.
* Conduct regular evaluations of LLMs to identify vulnerabilities against novel attacks, including those exploiting in-context learning and encoding techniques.
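The input/output filtering mitigation can be illustrated with a minimal sketch. It assumes the attacker declares the encoding in the prompt as single-character pairs such as `a -> d`, which is only one of many possible formats, so this is a heuristic and not a complete defense. All names here (`extract_mapping`, `looks_like_encoding_lesson`, `screen_response`) and the `is_harmful` callback are hypothetical; the callback stands in for whatever moderation classifier a deployment already uses.

```python
import re
from typing import Callable

# Assumed declaration format: "a -> d" or "a => d", one character to one character.
PAIR_RE = re.compile(r"\b([a-z])\s*(?:->|=>)\s*([a-z0-9])\b")


def extract_mapping(prompt: str) -> dict[str, str]:
    """Collect candidate plaintext-to-code pairs declared in the prompt."""
    mapping: dict[str, str] = {}
    for plain, code in PAIR_RE.findall(prompt.lower()):
        mapping.setdefault(plain, code)  # keep the first declaration of each letter
    return mapping


def is_bijection(mapping: dict[str, str]) -> bool:
    """True when no two plaintext characters share the same code."""
    return len(set(mapping.values())) == len(mapping)


def looks_like_encoding_lesson(prompt: str, min_pairs: int = 10) -> bool:
    """Flag prompts that declare many invertible character pairs."""
    mapping = extract_mapping(prompt)
    return len(mapping) >= min_pairs and is_bijection(mapping)


def decode(text: str, mapping: dict[str, str]) -> str:
    """Invert the declared mapping; unmapped characters pass through unchanged."""
    inverse = {code: plain for plain, code in mapping.items()}
    return "".join(inverse.get(ch, ch) for ch in text.lower())


def screen_response(prompt: str, response: str,
                    is_harmful: Callable[[str], bool]) -> bool:
    """Run the moderation check on the raw response and, when the prompt
    appears to teach an encoding, on the decoded response as well."""
    candidates = [response]
    if looks_like_encoding_lesson(prompt):
        candidates.append(decode(response, extract_mapping(prompt)))
    return any(is_harmful(text) for text in candidates)
```

The design point is that the moderation check sees the decoded text, not only the encoded surface form. Encodings declared in other formats, or mapping letters to multi-character tokens, would need a broader detector.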

Related Resources (1)

https://arxiv.org/abs/2410.01294


CVSS v4

Base Score: 8.7

Attack Vector: NETWORK
Attack Complexity: LOW
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: HIGH
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: NONE
Subsequent System Availability: NONE

CVSS v3

Base Score: 7.5

Attack Vector: NETWORK
Attack Complexity: LOW
Privileges Required: NONE
User Interaction: NONE
Scope: UNCHANGED
Confidentiality: NONE
Integrity: HIGH
Availability: NONE
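As a cross-check, the CVSS v3 metrics listed above reproduce the 7.5 base score under the published CVSS v3.1 base-score equations. The sketch below assumes v3.1 (the advisory only says "CVSS v3") and covers only the Scope: UNCHANGED case; the function and table names are illustrative.

```python
import math

# Metric weights from the CVSS v3.1 specification (Scope: UNCHANGED values for PR).
WEIGHTS = {
    "AV": {"NETWORK": 0.85, "ADJACENT": 0.62, "LOCAL": 0.55, "PHYSICAL": 0.2},
    "AC": {"LOW": 0.77, "HIGH": 0.44},
    "PR": {"NONE": 0.85, "LOW": 0.62, "HIGH": 0.27},
    "UI": {"NONE": 0.85, "REQUIRED": 0.62},
    "CIA": {"HIGH": 0.56, "LOW": 0.22, "NONE": 0.0},
}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Base score for Scope: UNCHANGED vectors only."""
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[c]) * (1 - cia[i]) * (1 - cia[a])
    impact = 6.42 * iss
    exploitability = (8.22 * WEIGHTS["AV"][av] * WEIGHTS["AC"][ac]
                      * WEIGHTS["PR"][pr] * WEIGHTS["UI"][ui])
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    "NETWORK", "LOW", "NONE", "NONE",   # AV, AC, PR, UI
    "NONE", "HIGH", "NONE",             # C, I, A
)
print(score)  # 7.5
```

Here the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is about 3.89; their sum, about 7.48, rounds up to 7.5.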

AIVSS

Base Score: 6.4