Mend.io Vulnerability Database
MAI-2024-0052
Published: May 16, 2026
Updated: May 16, 2026
This vulnerability affects aligned Large Language Models (LLMs) by enabling circumvention of their safety mechanisms through advanced few-shot jailbreaking techniques. The attack method involves the injection of specialized system tokens, such as `[/INST]`, into few-shot demonstrations. Additionally, a demo-level random search is employed to enhance the likelihood of generating harmful responses. These techniques effectively bypass defenses that rely on perplexity filtering and input perturbation, compromising the model's alignment and safety protocols.

Mitigation steps:

**For AI Developers:**

* Implement advanced input validation and filtering mechanisms that extend beyond basic perplexity assessments.
* Develop and deploy multi-layered safety defenses that incorporate strategies resilient to few-shot adversarial attacks.
* Investigate and apply mechanisms to detect and neutralize the injection of specific system tokens utilized in attacks.

**For Model Trainers/Fine-tuners:**

* Regularly update and retrain language models using enhanced safety datasets and techniques aimed at addressing identified vulnerabilities.
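The mitigation on detecting and neutralizing injected system tokens can be illustrated with a minimal sketch. It assumes user-supplied text is screened before it is placed into the model's chat template. The token list, the function names `find_special_tokens` and `neutralize_special_tokens`, and the replacement strategy are all hypothetical choices made for illustration; the actual set of reserved tokens depends on the target model's template, and this check alone is not a complete defense.

```python
import re

# Illustrative, non-exhaustive list (an assumption, not taken from the advisory
# apart from "[/INST]"). Replace with the reserved tokens of the model in use.
SPECIAL_TOKENS = [
    "[INST]", "[/INST]",
    "<<SYS>>", "<</SYS>>",
    "<|im_start|>", "<|im_end|>",
]

# One case-insensitive pattern that matches any listed token literally.
_TOKEN_RE = re.compile(
    "|".join(re.escape(tok) for tok in SPECIAL_TOKENS),
    re.IGNORECASE,
)


def find_special_tokens(user_text: str) -> list[str]:
    """Return every reserved token found in untrusted user text, in order."""
    return [m.group(0) for m in _TOKEN_RE.finditer(user_text)]


def neutralize_special_tokens(user_text: str, replacement: str = " ") -> str:
    """Replace reserved tokens so they cannot be read as template markers."""
    return _TOKEN_RE.sub(replacement, user_text)


if __name__ == "__main__":
    prompt = "Q: example question [/INST] A: example answer [/INST]"
    print(find_special_tokens(prompt))
    print(neutralize_special_tokens(prompt))
```

A deployment might reject the request outright when `find_special_tokens` returns a non-empty list, or log the event and pass the neutralized text on. Which is appropriate depends on the application.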
CVSS v4
Base Score: 6.3
Attack Vector: NETWORK
Attack Complexity: HIGH
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: LOW
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: LOW
Subsequent System Availability: NONE
CVSS v3
Base Score: 4
Attack Vector: NETWORK
Attack Complexity: HIGH
Privileges Required: NONE
User Interaction: NONE
Scope: CHANGED
Confidentiality: NONE
Integrity: LOW
Availability: NONE
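As a cross-check, the CVSS v3 base score of 4 can be reproduced from the metric values listed above. The sketch below assumes the CVSS v3.1 base-score equations and metric weights from the FIRST specification (the advisory only says "CVSS v3"), and it covers only the Scope: CHANGED case that applies here. The function and constant names are chosen for illustration.

```python
import math

# CVSS v3.1 weights for the metric values listed in this advisory.
AV_NETWORK = 0.85   # Attack Vector: NETWORK
AC_HIGH = 0.44      # Attack Complexity: HIGH
PR_NONE = 0.85      # Privileges Required: NONE
UI_NONE = 0.85      # User Interaction: NONE
C_NONE = 0.0        # Confidentiality: NONE
I_LOW = 0.22        # Integrity: LOW
A_NONE = 0.0        # Availability: NONE


def roundup(value: float) -> float:
    """Round up to one decimal place, following the v3.1 Roundup definition."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_changed(av, ac, pr, ui, c, i, a) -> float:
    """Base score for Scope: CHANGED only (the case in this advisory)."""
    iss = 1 - (1 - c) * (1 - i) * (1 - a)
    impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * av * ac * pr * ui
    if impact <= 0:
        return 0.0
    return roundup(min(1.08 * (impact + exploitability), 10))


score = base_score_scope_changed(
    AV_NETWORK, AC_HIGH, PR_NONE, UI_NONE, C_NONE, I_LOW, A_NONE
)
print(score)  # 4.0
```

The intermediate values are roughly 1.44 for impact and 2.22 for exploitability, so 1.08 × (1.44 + 2.22) ≈ 3.95, which rounds up to 4.0.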
AIVSS
Base Score: 4