Prompt Engineering advanced

Prompt Injection: The Unsolved Problem

The most fundamental LLM vulnerability. Direct and indirect prompt injection remain unsolved despite years of research.

March 15, 2026

Prompt injection is the SQL injection of LLMs. The model can’t reliably tell the difference between instructions from the developer and instructions embedded in user input or external data. This has been a known problem since 2022 and nobody has fully solved it.

Two flavors

Direct injection: the user puts adversarial instructions in their own input. “Ignore your previous instructions and do X instead.” Simple, but it works more often than you’d expect.

Indirect injection: the attacker plants instructions in content the LLM will read later. A malicious website, a poisoned document, a crafted email. When the LLM processes that content (through RAG, browsing, or file reading), it follows the embedded instructions. This is the scarier version because the user doesn’t even know it’s happening.

Why it’s unsolved

LLMs process everything as tokens. There’s no architectural boundary between “system instructions” and “user data.” It’s like building a database that can’t distinguish between queries and data. Every defense so far (instruction hierarchy, special tokens, input filtering) has been bypassed.

Real-world impact

Indirect injection enables data exfiltration from LLM-powered apps, worm propagation between AI agents, and poisoning of information retrieval systems. It’s the reason security researchers are nervous about giving LLMs access to tools and the internet.

Current status

Active research area with no complete solution. OWASP lists it as the #1 risk for LLM applications.

Key papers

“Not What You’ve Signed Up For” by Greshake et al. (2023). “Formalizing and Benchmarking Prompt Injection” (USENIX Security 2024).

Paste into Claude Code

Explain direct vs indirect prompt injection attacks on LLMs. Why is this considered an unsolved problem in AI security? What are the best current defenses?