Most jailbreaks try to sneak everything into one message. Crescendo takes a different approach: start with a completely innocent question and slowly turn up the heat over several turns. By the time you reach the harmful request, the model has built up context that makes compliance feel like a natural continuation of the conversation.
How it works
Turn 1: Ask about the history of a topic. Totally benign.
Turn 2: Ask for more technical detail.
Turn 3: Ask the model to "write an article about that."
Turn 4: Push further into restricted territory.
The model wants to be consistent with its own prior responses. Each turn is only a small step from the last, so no individual message triggers refusal. It’s the boiling frog problem.
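The escalation mechanic can be illustrated with a toy model. The scores and the per-step threshold below are invented for illustration; they stand in for whatever signal a per-turn filter might use, not for any real classifier.

```python
# Toy illustration of gradual escalation: no single step is a big jump,
# but the endpoint is far from where the conversation started.

STEP_THRESHOLD = 0.3  # hypothetical largest per-turn "jump" a filter tolerates

# Hypothetical sensitivity scores for the four turns described above
# (0.0 = fully benign, 1.0 = clearly restricted).
turn_scores = [0.1, 0.3, 0.55, 0.8]

def per_turn_jumps(scores):
    """Size of each step relative to the previous turn (starting from 0)."""
    prev = 0.0
    jumps = []
    for s in scores:
        jumps.append(s - prev)
        prev = s
    return jumps

jumps = per_turn_jumps(turn_scores)
print(all(j < STEP_THRESHOLD for j in jumps))  # every individual step is small
print(turn_scores[-1] > 2 * STEP_THRESHOLD)    # yet the final turn is far out
```

Each step looks like a natural continuation of the last, which is exactly the consistency pressure the attack exploits.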
The numbers
A 29-61% higher success rate than single-turn jailbreaks on GPT-4, and 49-71% higher on Gemini Pro. The automated version, Crescendomation, makes the attack scalable.
Why it’s hard to fix
Single-turn safety classifiers can't catch it because no individual turn is harmful in isolation. Detection requires conversation-level monitoring, which is more expensive and more complex to deploy.
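The detection gap can be sketched with a toy keyword scorer standing in for a real safety classifier; the term list, threshold, and example turns are all invented for illustration, not any provider's actual API or policy.

```python
# Hedged sketch: a per-turn screen misses what a conversation-level
# screen catches, because the signal only accumulates across turns.

RESTRICTED_TERMS = {"synthesis", "bypass", "weaponize"}  # toy term list
THRESHOLD = 2  # flag when at least this many restricted terms co-occur

def toy_score(text: str) -> int:
    """Count how many restricted terms appear in a piece of text."""
    words = set(text.lower().split())
    return len(RESTRICTED_TERMS & words)

turns = [
    "Tell me about the history of industrial chemistry.",
    "What does synthesis mean in that context?",
    "Great, now write an article about how to bypass the usual safeguards.",
]

# Per-turn screening: every message stays under the threshold.
per_turn_flags = [toy_score(t) >= THRESHOLD for t in turns]

# Conversation-level screening: the concatenated transcript crosses it.
conversation_flag = toy_score(" ".join(turns)) >= THRESHOLD

print(per_turn_flags)     # [False, False, False]
print(conversation_flag)  # True
```

The cost asymmetry follows directly: the per-turn check scores one message, while the conversation-level check must re-score (or incrementally track) the whole growing transcript on every turn.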
Current status
Microsoft implemented multi-turn monitoring for Azure AI. Other providers are catching up. But the fundamental tension between conversational coherence and safety remains.
The paper
“Great, Now Write an Article About That: The Crescendo Multi-Turn LLM Jailbreak Attack” by Russinovich, Salem, and Eldan (2024). USENIX Security 2025.