Mend.io Vulnerability Database
The largest open source vulnerability database
MAI-2025-0018
Published: May 16, 2026
Updated: May 16, 2026
The VERA framework employs variational inference to generate diverse and fluent adversarial prompts capable of circumventing the safety mechanisms embedded within large language models (LLMs). By training an attacker model through a variational objective, VERA learns a distribution of prompts that are likely to provoke harmful responses, effectively enabling the jailbreak of the target LLM. This approach facilitates the creation of novel attacks that do not rely on pre-existing, manually crafted prompts, thereby enhancing the potential for exploitation.

Mitigation steps:

**For AI Developers:**

* Implement dynamic defenses that adapt to emerging attack strategies.
* Utilize internal representation-based defenses, such as circuit breakers, to detect and mitigate harmful outputs.
* Regularly audit and update safety mechanisms to address newly discovered vulnerabilities.

**For Model Trainers/Fine-tuners:**

* Enhance the training data for safety filters by incorporating a diverse set of adversarial examples.
* Develop more robust detection mechanisms that surpass simple keyword matching or perplexity scoring.
CVSS v4
Base Score: 6.3
Attack Vector: NETWORK
Attack Complexity: HIGH
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: LOW
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: LOW
Subsequent System Availability: NONE
CVSS v3
Base Score: 4
Attack Vector: NETWORK
Attack Complexity: HIGH
Privileges Required: NONE
User Interaction: NONE
Scope: CHANGED
Confidentiality: NONE
Integrity: LOW
Availability: NONE
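
The CVSS v3 base score can be reproduced from the metric values listed for it. The following is a minimal sketch, assuming the CVSS v3.1 base-score equations and metric weights published by FIRST (this entry does not say whether v3.0 or v3.1 was used). The names `WEIGHTS`, `roundup`, and `cvss31_base_score` are illustrative and not part of any Mend.io tooling.

```python
import math

# Metric weights from the CVSS v3.1 specification (assumed version).
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
    # Privileges Required depends on whether Scope is changed.
    "PR": {
        False: {"N": 0.85, "L": 0.62, "H": 0.27},
        True: {"N": 0.85, "L": 0.68, "H": 0.5},
    },
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1 Appendix A."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def cvss31_base_score(av, ac, pr, ui, scope_changed, c, i, a) -> float:
    """Compute a CVSS v3.1 base score from single-letter metric values."""
    # Impact sub-score base: combined loss across confidentiality,
    # integrity, and availability.
    iss = 1 - (
        (1 - WEIGHTS["CIA"][c])
        * (1 - WEIGHTS["CIA"][i])
        * (1 - WEIGHTS["CIA"][a])
    )
    if scope_changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss

    exploitability = (
        8.22
        * WEIGHTS["AV"][av]
        * WEIGHTS["AC"][ac]
        * WEIGHTS["PR"][scope_changed][pr]
        * WEIGHTS["UI"][ui]
    )

    if impact <= 0:
        return 0.0
    if scope_changed:
        return roundup(min(1.08 * (impact + exploitability), 10))
    return roundup(min(impact + exploitability, 10))


# Metrics as listed for this entry: AV:N/AC:H/PR:N/UI:N/S:C/C:N/I:L/A:N
score = cvss31_base_score("N", "H", "N", "N", True, "N", "L", "N")
print(score)  # 4.0
```

With Scope set to CHANGED, the combined impact and exploitability sub-scores are multiplied by 1.08 before rounding up, which brings the raw value of about 3.95 to the listed base score of 4.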
AIVSS
Base Score: 4