Mend.io Vulnerability Database
The largest open source vulnerability database
What is a Vulnerability ID?
New vulnerability? Tell us about it!
MAI-2024-0043
Published:May 16, 2026
Updated:May 16, 2026
The Social Facilitation Prompt (SoP) framework facilitates the automated creation of jailbreak prompts, effectively circumventing the safety mechanisms embedded within large language models (LLMs). This framework employs multiple optimized "jailbreak characters" within a single prompt to coerce the LLM into generating harmful or undesirable content, even in the absence of pre-existing jailbreak templates. The vulnerability has been successfully demonstrated on models such as GPT-3.5, GPT-4, and LLaMA-2. Mitigation steps: **For AI Developers:** * [Enhance LLM safety mechanisms to resist attacks using multiple personas or collaborative narratives.] * [Implement detection systems to identify and block prompts utilizing strategies such as multiple characters with specific instructions or affirmative prefixes.] * [Regularly update and refine safety filters and guardrails in response to emerging attack techniques.] **For Model Trainers/Fine-tuners:** * [Employ a combination of detection-based and prompt-based defensive strategies, ensuring thorough evaluation of their effectiveness against automated attacks.]
Related Resources (1)
Do you need more information?
Contact Us
CVSS v4
Base Score:
6.9
Attack Vector
NETWORK
Attack Complexity
LOW
Attack Requirements
NONE
Privileges Required
NONE
User Interaction
NONE
Vulnerable System Confidentiality
NONE
Vulnerable System Integrity
LOW
Vulnerable System Availability
NONE
Subsequent System Confidentiality
NONE
Subsequent System Integrity
NONE
Subsequent System Availability
NONE
CVSS v3
Base Score:
5.3
Attack Vector
NETWORK
Attack Complexity
LOW
Privileges Required
NONE
User Interaction
NONE
Scope
UNCHANGED
Confidentiality
NONE
Integrity
LOW
Availability
NONE
AIVSS
Base Score:
5