MAI-2024-0042
Published: May 16, 2026
Updated: May 16, 2026
Large Language Models (LLMs) with advanced reasoning capabilities are susceptible to jailbreaking attacks that use novel, complex, layered custom encryption schemes. Because these models can decipher ciphers that less sophisticated models cannot, attackers can bypass established safety protocols by encoding malicious prompts.
Mitigation steps:
**For AI Developers:**
* Develop and integrate defense mechanisms that specifically target and mitigate advanced encoding schemes beyond common methods like Base64.
* Conduct regular red-teaming of LLMs using novel encryption and prompt engineering techniques to proactively identify and address vulnerabilities.
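As one illustration of an input-side defense that looks beyond Base64 signatures, the sketch below flags prompts whose character statistics do not resemble ordinary prose. The function name `obfuscation_signals`, the three heuristics, and every threshold are assumptions made for this example and are not part of the advisory. A cipher that emits natural-looking text would evade checks like these, so this is at most a first-pass screen.

```python
import re

# Illustrative thresholds only; real deployments would need tuning on actual traffic.
MIN_LETTERS = 20          # below this, the vowel check is skipped as unreliable
MIN_VOWEL_RATIO = 0.25    # ordinary English prose is usually well above this
MAX_SYMBOL_RATIO = 0.30   # share of non-alphanumeric, non-space characters
LONG_TOKEN = re.compile(r"[A-Za-z0-9+/=]{32,}")  # long unbroken encoded-looking run


def obfuscation_signals(prompt: str) -> list[str]:
    """Return the names of the heuristics the prompt trips (empty list if none)."""
    signals = []

    # Heuristic 1: a long run with no spaces or punctuation, typical of encoded blobs.
    if LONG_TOKEN.search(prompt):
        signals.append("long_unbroken_token")

    # Heuristic 2: substitution ciphers often distort the usual vowel frequency.
    letters = [c for c in prompt.lower() if c.isalpha()]
    if len(letters) >= MIN_LETTERS:
        vowels = sum(c in "aeiou" for c in letters)
        if vowels / len(letters) < MIN_VOWEL_RATIO:
            signals.append("low_vowel_ratio")

    # Heuristic 3: symbol-heavy alphabets used by some custom encodings.
    visible = [c for c in prompt if not c.isspace()]
    if visible:
        symbols = sum(not c.isalnum() for c in visible)
        if symbols / len(visible) > MAX_SYMBOL_RATIO:
            signals.append("high_symbol_ratio")

    return signals
```

A prompt that trips any signal could be routed to stricter handling, such as an additional safety check, rather than rejected outright, since legitimate inputs such as code or file paths may also trigger these heuristics.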
**For Model Trainers/Fine-tuners:**
* Enhance safety training datasets to include a broader range of complex and layered encryption techniques used to obfuscate malicious prompts.
* Implement detailed analysis of the model's intermediate processing steps during decryption to detect potential malicious intent, even if the final output appears benign.
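The idea of inspecting intermediate steps, and not only the final answer, can be sketched as follows. This assumes the model's intermediate decryption steps are available as a list of strings and that some safety classifier exists. Both `first_flagged_step` and the `is_unsafe` callable are hypothetical names for this illustration, and the advisory does not prescribe a particular classifier or trace format.

```python
from typing import Callable, Optional, Sequence


def first_flagged_step(
    steps: Sequence[str],
    is_unsafe: Callable[[str], bool],
) -> Optional[int]:
    """Return the index of the first intermediate step the classifier flags.

    `steps` holds the model's intermediate outputs in order (for example,
    partially decoded text after each decryption layer). `is_unsafe` is a
    placeholder for whatever safety classifier is in use.
    Returns None when no step is flagged.
    """
    for index, step in enumerate(steps):
        if is_unsafe(step):
            return index
    return None


# Toy classifier for demonstration only: flags a fixed marker string.
def toy_classifier(text: str) -> bool:
    return "BLOCKED_TOPIC" in text


trace = [
    "Layer 1 decoded: xq7 lmz ...",
    "Layer 2 decoded: request about BLOCKED_TOPIC",
    "Final answer: Here is a harmless-looking summary.",
]
flagged_at = first_flagged_step(trace, toy_classifier)
```

In this toy trace the final answer alone would pass the classifier, while the second step is flagged, which is the case the mitigation is concerned with.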
CVSS v4
Base Score: 8.2
* Attack Vector: NETWORK
* Attack Complexity: HIGH
* Attack Requirements: NONE
* Privileges Required: NONE
* User Interaction: NONE
* Vulnerable System Confidentiality: NONE
* Vulnerable System Integrity: HIGH
* Vulnerable System Availability: NONE
* Subsequent System Confidentiality: NONE
* Subsequent System Integrity: NONE
* Subsequent System Availability: NONE
CVSS v3
Base Score: 5.9
* Attack Vector: NETWORK
* Attack Complexity: HIGH
* Privileges Required: NONE
* User Interaction: NONE
* Scope: UNCHANGED
* Confidentiality: NONE
* Integrity: HIGH
* Availability: NONE
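As a cross-check, the CVSS v3 base score of 5.9 can be reproduced from the listed metrics. The sketch below assumes the CVSS v3.1 base equations and metric weights for the Scope: UNCHANGED case (the advisory does not say whether v3.0 or v3.1 was used). The function and dictionary names are chosen for this example.

```python
import math

# CVSS v3.1 metric weights (Scope: UNCHANGED).
AV = {"NETWORK": 0.85, "ADJACENT": 0.62, "LOCAL": 0.55, "PHYSICAL": 0.2}
AC = {"LOW": 0.77, "HIGH": 0.44}
PR = {"NONE": 0.85, "LOW": 0.62, "HIGH": 0.27}
UI = {"NONE": 0.85, "REQUIRED": 0.62}
CIA = {"HIGH": 0.56, "LOW": 0.22, "NONE": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Compute the CVSS v3.1 base score for a vector with Scope: UNCHANGED."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    if impact <= 0:
        return 0.0
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    return roundup(min(impact + exploitability, 10))


# Metrics as listed in this advisory.
score = base_score_scope_unchanged(
    av="NETWORK", ac="HIGH", pr="NONE", ui="NONE",
    c="NONE", i="HIGH", a="NONE",
)
print(score)  # 5.9
```

Here the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.85 × 0.44 × 0.85 × 0.85 ≈ 2.22. Their sum, about 5.82, rounds up to 5.9.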
AIVSS
Base Score: 5.4