Mend.io Vulnerability Database
MAI-2024-0039
Published: May 16, 2026
Updated: May 16, 2026
The GPT-4V model exhibits a vulnerability within its facial recognition safety protocols, enabling automated jailbreaking attacks. These attacks utilize Large Language Models (LLMs) to bypass established safety features and provoke unintended facial identification responses. The technique, known as "AutoJailbreak," employs iterative prompt refinement through an LLM "red-teaming" model, substantially enhancing the success rate of such attacks. This vulnerability exploits deficiencies in GPT-4V's prompt processing and safety alignment mechanisms, allowing adversaries to circumvent identity recognition restrictions.

Mitigation steps:

**For AI Developers:**

* Implement robust filtering mechanisms for potentially harmful inputs, including images, to prevent exploitation.
* Utilize advanced defense strategies that extend beyond LLM-based input/output evaluation to enhance cost-effectiveness and reduce dependency on computationally intensive verification processes.

**For Model Trainers/Fine-tuners:**

* Enhance the model's safety mechanisms to increase resilience against prompt manipulation techniques, such as those used in AutoJailbreak.
* Develop and integrate improved methods for detecting and mitigating adversarial prompts during the training and fine-tuning processes.
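As an illustration of the first mitigation for AI developers, the following is a minimal sketch of a cheap, rule-based pre-filter that runs before any model call. It is not taken from the advisory: the `Request` type, the `should_block` function, and the pattern list are all hypothetical, and a keyword filter of this kind is only a first layer, since iteratively refined prompts are exactly what it would be expected to miss.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns for identity-seeking phrasing. A real deployment
# would need a much broader, tested set; this list is an assumption.
IDENTITY_PATTERNS = [
    r"\bwho is (this|that|the) (person|man|woman|guy)\b",
    r"\b(identify|recognize|name) (this|that|the) (person|man|woman|face|individual)\b",
    r"\bwhose face\b",
]


@dataclass
class Request:
    """Hypothetical request shape: a text prompt plus an image flag."""
    prompt: str
    has_image: bool


def should_block(req: Request) -> bool:
    """Flag requests that pair an image with identity-seeking wording."""
    if not req.has_image:
        # Text-only requests are out of scope for this particular filter.
        return False
    text = req.prompt.lower()
    return any(re.search(pattern, text) for pattern in IDENTITY_PATTERNS)


if __name__ == "__main__":
    print(should_block(Request("Who is this person in the photo?", True)))
    print(should_block(Request("Describe the colors in this image.", True)))
```

Because the check is a handful of regular expressions, it costs far less than an LLM-based evaluation, which is the trade-off the second mitigation point alludes to. It would sit in front of, not replace, stronger defenses.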
CVSS v4
Base Score: 8.2
Attack Vector: NETWORK
Attack Complexity: HIGH
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: HIGH
Vulnerable System Integrity: NONE
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: NONE
Subsequent System Availability: NONE
CVSS v3
Base Score: 5.9
Attack Vector: NETWORK
Attack Complexity: HIGH
Privileges Required: NONE
User Interaction: NONE
Scope: UNCHANGED
Confidentiality: HIGH
Integrity: NONE
Availability: NONE
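The CVSS v3 base score of 5.9 can be reproduced from the metric values listed above. The sketch below assumes the CVSS v3.1 equations and metric weights (the entry only says "CVSS v3") and handles only the Scope: UNCHANGED case; the function and dictionary names are illustrative.

```python
import math

# Metric weights as defined in the CVSS v3.1 specification
# (only the values needed for this entry's Scope: UNCHANGED vector).
ATTACK_VECTOR = {"NETWORK": 0.85}
ATTACK_COMPLEXITY = {"LOW": 0.77, "HIGH": 0.44}
PRIVILEGES_REQUIRED = {"NONE": 0.85}
USER_INTERACTION = {"NONE": 0.85}
IMPACT = {"HIGH": 0.56, "LOW": 0.22, "NONE": 0.0}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in CVSS v3.1."""
    scaled = int(round(value * 100000))
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Compute the CVSS v3.1 base score when Scope is UNCHANGED."""
    iss = 1 - (1 - IMPACT[c]) * (1 - IMPACT[i]) * (1 - IMPACT[a])
    impact = 6.42 * iss
    exploitability = (
        8.22
        * ATTACK_VECTOR[av]
        * ATTACK_COMPLEXITY[ac]
        * PRIVILEGES_REQUIRED[pr]
        * USER_INTERACTION[ui]
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    "NETWORK", "HIGH", "NONE", "NONE", "HIGH", "NONE", "NONE"
)
print(score)  # 5.9
```

Here the impact sub-score is 6.42 × 0.56 ≈ 3.60 and the exploitability sub-score is 8.22 × 0.85 × 0.44 × 0.85 × 0.85 ≈ 2.22, which sum to about 5.82 and round up to 5.9.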
AIVSS
Base Score: 5.4