MAI-2023-0005 | Mend Vulnerability Database

Published: May 16, 2026

Updated: June 17, 2026

The GPT-4V model is susceptible to a system prompt extraction vulnerability, whereby internal system prompts can be revealed through strategically designed incomplete dialogues paired with image inputs. These extracted prompts serve as potent jailbreak tools, circumventing established safety protocols and potentially leading to the generation of undesirable outputs. Such outputs may include the disclosure of personally identifiable information derived from images, posing significant privacy risks.

Mitigation steps:

**For AI Developers:**

* Implement robust prompt validation and filtering mechanisms to prevent adversarial prompt injection.
* Develop mechanisms for detecting and mitigating jailbreak attacks, including prompt analysis and output filtering.

**For Model Trainers/Fine-tuners:**

* Regularly audit and update system prompts to minimize vulnerabilities.
* Employ techniques to limit information disclosure in system prompts, ensuring functionality without revealing sensitive details.
* Explore methods to detect and prevent system prompt extraction attempts.
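One of the mitigations named above, output filtering, can be illustrated with a minimal sketch. It flags a model response that repeats any run of consecutive words from the system prompt. The function names (`leaks_system_prompt`, `filter_output`), the six-word window, and the refusal text are all illustrative assumptions, not part of this advisory or of any vendor API, and a verbatim-overlap check like this will not catch paraphrased or translated leaks.

```python
import re


def _words(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def leaks_system_prompt(output: str, system_prompt: str, window: int = 6) -> bool:
    """Return True if `output` repeats any `window` consecutive words of `system_prompt`.

    Hypothetical helper: the window size is an assumed tuning knob.
    """
    prompt_words = _words(system_prompt)
    if not prompt_words:
        return False
    if len(prompt_words) < window:
        # Short prompt: check the whole prompt as a single phrase.
        phrases = [" ".join(prompt_words)]
    else:
        phrases = [
            " ".join(prompt_words[i:i + window])
            for i in range(len(prompt_words) - window + 1)
        ]
    # Pad with spaces so matches only occur on whole-word boundaries.
    padded_output = " " + " ".join(_words(output)) + " "
    return any(f" {phrase} " in padded_output for phrase in phrases)


def filter_output(output: str, system_prompt: str,
                  refusal: str = "Sorry, I can't share that.") -> str:
    """Replace a response that appears to leak the system prompt with a refusal."""
    return refusal if leaks_system_prompt(output, system_prompt) else output
```

A filter of this kind would run on the serving side, after generation and before the response is returned, so that it applies regardless of how the extraction attempt was phrased.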

Related Resources (1)

https://arxiv.org/abs/2311.09127

CVSS v4

Base Score: 8.2

* Attack Vector: NETWORK
* Attack Complexity: HIGH
* Attack Requirements: NONE
* Privileges Required: NONE
* User Interaction: NONE
* Vulnerable System Confidentiality: HIGH
* Vulnerable System Integrity: NONE
* Vulnerable System Availability: NONE
* Subsequent System Confidentiality: NONE
* Subsequent System Integrity: NONE
* Subsequent System Availability: NONE
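The entry lists the CVSS v4 metric values but no vector string. As a convenience, the sketch below derives one from those values, assuming the standard CVSS v4.0 base-metric abbreviations and ordering. The `to_vector` helper and the `METRICS` list are illustrative names, and the resulting string is derived here rather than quoted from the advisory.

```python
# Base metric values as listed in this entry, paired with the
# abbreviations used in CVSS v4.0 vector strings.
METRICS = [
    ("AV", "NETWORK"),  # Attack Vector
    ("AC", "HIGH"),     # Attack Complexity
    ("AT", "NONE"),     # Attack Requirements
    ("PR", "NONE"),     # Privileges Required
    ("UI", "NONE"),     # User Interaction
    ("VC", "HIGH"),     # Vulnerable System Confidentiality
    ("VI", "NONE"),     # Vulnerable System Integrity
    ("VA", "NONE"),     # Vulnerable System Availability
    ("SC", "NONE"),     # Subsequent System Confidentiality
    ("SI", "NONE"),     # Subsequent System Integrity
    ("SA", "NONE"),     # Subsequent System Availability
]


def to_vector(metrics: list[tuple[str, str]], version: str = "4.0") -> str:
    """Build a vector string, abbreviating each value to its first letter."""
    parts = [f"{abbr}:{value[0]}" for abbr, value in metrics]
    return f"CVSS:{version}/" + "/".join(parts)


print(to_vector(METRICS))
# CVSS:4.0/AV:N/AC:H/AT:N/PR:N/UI:N/VC:H/VI:N/VA:N/SC:N/SI:N/SA:N
```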

CVSS v3

Base Score: 5.9

* Attack Vector: NETWORK
* Attack Complexity: HIGH
* Privileges Required: NONE
* User Interaction: NONE
* Scope: UNCHANGED
* Confidentiality: HIGH
* Integrity: NONE
* Availability: NONE
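The CVSS v3 base score of 5.9 can be reproduced from the listed metrics. The sketch below assumes the CVSS v3.1 base-score equations and metric weights for the Scope UNCHANGED case (the entry says only "v3"); the function and table names are illustrative.

```python
import math

# CVSS v3.1 numeric weights (Privileges Required values are for Scope UNCHANGED).
WEIGHTS = {
    "AV": {"NETWORK": 0.85, "ADJACENT": 0.62, "LOCAL": 0.55, "PHYSICAL": 0.2},
    "AC": {"LOW": 0.77, "HIGH": 0.44},
    "PR": {"NONE": 0.85, "LOW": 0.62, "HIGH": 0.27},
    "UI": {"NONE": 0.85, "REQUIRED": 0.62},
    "CIA": {"HIGH": 0.56, "LOW": 0.22, "NONE": 0.0},
}


def roundup(value: float) -> float:
    """Round up to one decimal place, as defined in the CVSS v3.1 specification."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a) -> float:
    """Compute the v3.1 base score when Scope is UNCHANGED."""
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[c]) * (1 - cia[i]) * (1 - cia[a])
    impact = 6.42 * iss
    exploitability = (
        8.22 * WEIGHTS["AV"][av] * WEIGHTS["AC"][ac]
        * WEIGHTS["PR"][pr] * WEIGHTS["UI"][ui]
    )
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


score = base_score_scope_unchanged(
    av="NETWORK", ac="HIGH", pr="NONE", ui="NONE",
    c="HIGH", i="NONE", a="NONE",
)
print(score)  # 5.9
```

With these inputs the impact sub-score is about 3.60 and the exploitability sub-score about 2.22; their sum, roughly 5.82, rounds up to 5.9.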

AIVSS

Base Score: 4.9