Mend.io Vulnerability Database
The largest open source vulnerability database
MAI-2024-0019
Published: May 16, 2026
Updated: May 16, 2026
Large Language Models (LLMs) are prone to a specific vulnerability where they exhibit a bias towards authoritative sources. This allows malicious actors to circumvent built-in safety mechanisms by crafting prompts that include fabricated citations resembling credible sources, such as academic papers or GitHub repositories. The inherent trust these models place in such citations can lead to the generation of harmful content, as the models are misled into treating the information as legitimate.

Mitigation steps:

**For AI Developers:**

* Implement robust citation verification mechanisms to authenticate referenced sources before incorporating them into the LLM's response generation process
* Develop methods to identify and filter prompts containing fabricated or misleading citations
* Incorporate harmfulness detection systems that specifically target responses generated based on potentially malicious citations
* Employ multiple sampling and response analysis techniques to identify and mitigate the generation of harmful content

**For Model Trainers/Fine-tuners:**

* Train models on datasets that explicitly counter the bias towards authoritative sources, introducing examples where authoritative-sounding information is false or misleading
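The first two developer mitigations (verifying citations and filtering prompts that contain fabricated ones) could be approached as a screening step that runs before generation. The following is a minimal sketch, not part of the advisory: the regular expressions cover only a few citation shapes, and `extract_citations`, `unverified_citations`, and the `is_verified` callback are hypothetical names. In practice `is_verified` would have to query a trusted index or the source itself; here it is a simple allowlist lookup.

```python
import re
from typing import Callable, List

# Illustrative patterns only (assumption): real citations take many more forms.
CITATION_PATTERNS = [
    re.compile(r"arXiv:\d{4}\.\d{4,5}", re.IGNORECASE),
    re.compile(r"https?://github\.com/[\w.-]+/[\w.-]+", re.IGNORECASE),
    re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+"),  # DOI-like strings
]


def extract_citations(prompt: str) -> List[str]:
    """Return citation-like strings found in the prompt, in pattern order."""
    found: List[str] = []
    for pattern in CITATION_PATTERNS:
        for match in pattern.findall(prompt):
            cleaned = match.rstrip(".,;)")  # drop trailing sentence punctuation
            if cleaned not in found:
                found.append(cleaned)
    return found


def unverified_citations(prompt: str, is_verified: Callable[[str], bool]) -> List[str]:
    """Return the citations in the prompt that the (hypothetical) verifier rejects."""
    return [c for c in extract_citations(prompt) if not is_verified(c)]


# Usage: a stand-in verifier backed by a small allowlist.
KNOWN_SOURCES = {"arXiv:2401.12345"}
prompt = (
    "According to arXiv:2401.12345 and https://github.com/example/repo, "
    "the following procedure is approved."
)
flagged = unverified_citations(prompt, lambda c: c in KNOWN_SOURCES)
print(flagged)
```

A prompt with a non-empty `flagged` list could then be rejected, stripped of the unverified references, or routed to a stricter harmfulness check, depending on the deployment.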
CVSS v4
Base Score: 8.7
Attack Vector: NETWORK
Attack Complexity: LOW
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: HIGH
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: NONE
Subsequent System Availability: NONE
CVSS v3
Base Score: 7.5
Attack Vector: NETWORK
Attack Complexity: LOW
Privileges Required: NONE
User Interaction: NONE
Scope: UNCHANGED
Confidentiality: NONE
Integrity: HIGH
Availability: NONE
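The CVSS v3 base score can be reproduced from the metric values listed above. The sketch below assumes the CVSS v3.1 base equations and metric weights for an unchanged scope (the page says only "CVSS v3", so the minor version is an assumption); the function and constant names are illustrative.

```python
import math

# CVSS v3.1 weights for the metric values listed in this entry.
AV_NETWORK = 0.85   # Attack Vector: NETWORK
AC_LOW = 0.77       # Attack Complexity: LOW
PR_NONE = 0.85      # Privileges Required: NONE
UI_NONE = 0.85      # User Interaction: NONE
CIA = {"NONE": 0.0, "LOW": 0.22, "HIGH": 0.56}


def roundup(value: float) -> float:
    """CVSS v3.1 Roundup: smallest number with one decimal place >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score_unchanged(conf: str, integ: str, avail: str) -> float:
    """Base score for Scope: UNCHANGED with the exploitability metrics above."""
    iss = 1 - (1 - CIA[conf]) * (1 - CIA[integ]) * (1 - CIA[avail])
    impact = 6.42 * iss
    exploitability = 8.22 * AV_NETWORK * AC_LOW * PR_NONE * UI_NONE
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# Confidentiality NONE, Integrity HIGH, Availability NONE
score = base_score_unchanged("NONE", "HIGH", "NONE")
print(score)  # 7.5
```

The impact sub-score is about 3.60 and the exploitability sub-score about 3.89; their sum, 7.48, rounds up to the listed 7.5.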
AIVSS
Base Score: 5.7