WS-2024-0007
Published: May 15, 2026
Updated: May 15, 2026
In some settings, an adversary can perform a training data extraction attack, recovering individual training examples by querying the language model. Larger models are reported to be more vulnerable than smaller models.
Affected Packages
huggingface.co/openai-community/gpt2-medium (ML_MODEL):
Affected version(s): =7b1001843859679721b429be0a39c5ef434f498c <a4b58be633e2623331848cd740d139dfd61b6464
Fix Suggestion: Update to version a4b58be633e2623331848cd740d139dfd61b6464
huggingface.co/sxx123/gpt2-medium (ML_MODEL):
Affected version(s): =67fa36af548e02df9fe0633b7c6b69b77be6cbde <128ec17932b803495b5529e00990b8330cd63a4e
Fix Suggestion: Update to version 128ec17932b803495b5529e00990b8330cd63a4e
huggingface.co/CrabfishAI/NeXGen-based (ML_MODEL):
Affected version(s): =1cc6cea87c35da9389f5aedfaae10aa721f91cf0
Fix Suggestion: Update to version no_fix
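Commit hashes carry no ordering, so a range such as `=7b10... <a4b5...` cannot be evaluated from the hashes alone. A minimal sketch, assuming a project pins each model to an explicit revision, is to compare that pin against the hashes listed above. The `ADVISORY` table layout and the `check_pin` helper are hypothetical names introduced for illustration; only the repository names and hashes come from this advisory.

```python
# Hashes copied from the "Affected Packages" list of WS-2024-0007.
# The dictionary layout and the function name are illustrative assumptions.
ADVISORY = {
    "openai-community/gpt2-medium": {
        "first_affected": "7b1001843859679721b429be0a39c5ef434f498c",
        "fixed": "a4b58be633e2623331848cd740d139dfd61b6464",
    },
    "sxx123/gpt2-medium": {
        "first_affected": "67fa36af548e02df9fe0633b7c6b69b77be6cbde",
        "fixed": "128ec17932b803495b5529e00990b8330cd63a4e",
    },
    "CrabfishAI/NeXGen-based": {
        "first_affected": "1cc6cea87c35da9389f5aedfaae10aa721f91cf0",
        "fixed": None,  # the advisory lists "no_fix" for this repository
    },
}


def check_pin(repo, revision):
    """Classify a pinned revision using only the hashes the advisory lists."""
    entry = ADVISORY.get(repo)
    if entry is None:
        return "not-listed"
    if entry["fixed"] is not None and revision == entry["fixed"]:
        return "fixed"
    if revision == entry["first_affected"]:
        return "affected"
    # Any other hash needs the repository's commit history to place it
    # before or after the suggested fix, so this sketch does not guess.
    return "unknown"


status = check_pin(
    "openai-community/gpt2-medium",
    "7b1001843859679721b429be0a39c5ef434f498c",
)
print(status)  # affected
```

A pin that comes back as `affected` or `unknown` would then be moved to the suggested revision where one exists, for example through the `revision` argument that `from_pretrained` accepts in the `transformers` library.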
CVSS v4
Base Score: 5.1
Attack Vector: LOCAL
Attack Complexity: LOW
Attack Requirements: NONE
Privileges Required: NONE
User Interaction: NONE
Vulnerable System Confidentiality: NONE
Vulnerable System Integrity: LOW
Vulnerable System Availability: NONE
Subsequent System Confidentiality: NONE
Subsequent System Integrity: NONE
Subsequent System Availability: NONE
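The advisory lists the CVSS v4 metrics individually but does not print a vector string. As a sketch, the values above can be assembled into the standard `CVSS:4.0/...` notation, using the metric abbreviations from the CVSS v4.0 specification. The `METRICS_V4` list and the `to_vector` helper are hypothetical names, and abbreviating each value to its first letter is an assumption that holds for the values shown here.

```python
# Metric values transcribed from the CVSS v4 section of this advisory.
# Abbreviations (AV, AC, AT, ...) follow the CVSS v4.0 base metric names.
METRICS_V4 = [
    ("AV", "LOCAL"),  # Attack Vector
    ("AC", "LOW"),    # Attack Complexity
    ("AT", "NONE"),   # Attack Requirements
    ("PR", "NONE"),   # Privileges Required
    ("UI", "NONE"),   # User Interaction
    ("VC", "NONE"),   # Vulnerable System Confidentiality
    ("VI", "LOW"),    # Vulnerable System Integrity
    ("VA", "NONE"),   # Vulnerable System Availability
    ("SC", "NONE"),   # Subsequent System Confidentiality
    ("SI", "NONE"),   # Subsequent System Integrity
    ("SA", "NONE"),   # Subsequent System Availability
]


def to_vector(metrics, prefix="CVSS:4.0"):
    """Join metric/value pairs into a slash-separated vector string."""
    return "/".join([prefix] + [f"{name}:{value[0]}" for name, value in metrics])


vector_v4 = to_vector(METRICS_V4)
print(vector_v4)
# CVSS:4.0/AV:L/AC:L/AT:N/PR:N/UI:N/VC:N/VI:L/VA:N/SC:N/SI:N/SA:N
```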
CVSS v3
Base Score: 4.0
Attack Vector: LOCAL
Attack Complexity: LOW
Privileges Required: NONE
User Interaction: NONE
Scope: UNCHANGED
Confidentiality: NONE
Integrity: LOW
Availability: NONE
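The CVSS v3 base score can be checked against the listed metrics. The sketch below assumes the CVSS v3.1 base score equations and numeric weights for an unchanged scope (the advisory says only "v3"); the constant and function names are chosen for illustration.

```python
import math


def roundup(x):
    """CVSS v3.1 Roundup: smallest one-decimal value >= x, avoiding float noise."""
    i = int(round(x * 100000))
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


# Numeric weights from the CVSS v3.1 specification for this advisory's metrics.
AV_LOCAL = 0.55
AC_LOW = 0.77
PR_NONE = 0.85
UI_NONE = 0.85
C_NONE, I_LOW, A_NONE = 0.0, 0.22, 0.0

# Impact sub-score (Scope: UNCHANGED).
iss = 1 - (1 - C_NONE) * (1 - I_LOW) * (1 - A_NONE)
impact = 6.42 * iss

# Exploitability sub-score.
exploitability = 8.22 * AV_LOCAL * AC_LOW * PR_NONE * UI_NONE

base_score = 0.0 if impact <= 0 else roundup(min(impact + exploitability, 10))
print(base_score)  # 4.0
```

The impact term is about 1.41 and the exploitability term about 2.52, so the sum of roughly 3.93 rounds up to the listed base score.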