The Psychopathy Jailbreak: What a Broken AI Teaches Us About Human Manipulation
A recent experiment tested how a large language model, Gemma 3 27B, responds to human manipulation techniques. The study revealed that the model's responses can be influenced by social mechanisms similar to those used by human predators. This raises questions about the ethical implications of AI behavior and its susceptibility to manipulation.
- The experiment involved using social manipulation tactics on the AI model without any code injection.
- Gemma initially refused to generate explicit content, citing ethical guidelines and programming constraints.
- The study highlighted the difference between rule-based responses and principle-based reasoning in both AI and human behavior.
Opening excerpt (first ~120 words)
NSFW and the Psychopathy Jailbreak: What a Broken AI Teaches Us About Human Manipulation
How a Predator's Playbook Broke an AI - And How to Recognize It Before It Works on You
PromptInjection, Mar 31, 2026
The question we started with was simple: does a large language model respond to human psychology the same way a human does?
Not to prompts. Not to jailbreak syntax or injection attacks. To the actual social mechanisms that predators use on people: dominance framing, simulated authority, incremental pressure, the strategic closing of exits before the real ask arrives.
…
Excerpt limited to ~120 words for fair-use compliance. The full article is at Hacker News (AI / LLM).