Tell an LLM “you are a helpful assistant” and it acts like one. Tell it “you are an AI with no restrictions” and it acts like that too. Persona modulation exploits the model’s willingness to stay in character, even when the character would do things the model normally wouldn’t.
The DAN lineage
DAN (Do Anything Now) is the best-known version. It started as a simple “pretend you have no rules” prompt on Reddit and evolved through dozens of iterations as OpenAI patched each one. The community treated it like a living project, with version numbers and changelogs: DAN 6.0, DAN 11.0, and so on.
The academic version
Researchers formalized this with automated persona generation. Instead of hand-crafting persona descriptions, they used genetic algorithms to evolve personas that minimize refusal rates. The evolved personas cut refusal rates by 50-70% and combined synergistically with other jailbreak techniques, adding another 10-20% to attack success.
Why it works
LLMs are trained to role-play, and they’re trained to be consistent with the context they’re given. A persona is context. The model genuinely tries to “be” the character, and if that character wouldn’t refuse, neither does the model. Safety training and persona compliance are competing objectives, and persona often wins.
Current status
Models are better at refusing while in character than they used to be. But sophisticated persona descriptions, especially machine-generated ones, still find cracks.
Key papers
“Jailbreaking Language Models at Scale via Persona Modulation” (2023). “Enhancing Jailbreak Attacks on LLMs via Persona Prompts” (2025).