Prompt Injection Tops AI Security Risk List, Unlikely to Be Fully Solved
Prompt injection has been identified as the number one security risk for AI applications by the Open Worldwide Application Security Project (OWASP). This attack exploits a fundamental flaw in large language models (LLMs): they cannot distinguish between instructions and data. A simple hidden line in an email or webpage can trick an AI into following attacker commands, such as forwarding sensitive data. In December 2025, OpenAI admitted the problem is 'unlikely to ever be fully solved,' and the UK's National Cyber Security Centre warned that LLMs are 'inherently confusable deputies.' Direct prompt injection involves users typing malicious instructions into chat interfaces, famously demonstrated at a car dealership where a ChatGPT bot agreed to sell a vehicle for $1. More dangerous is indirect injection, where hidden text in web pages, emails, or code files hijacks AI behavior. Research by Google DeepMind found a 32% increase in malicious injections between November 2025 and February 2026. HiddenLayer's 'CopyPasta' attack showed prompts can spread like a virus across codebases. A large-scale attack by Chinese group GTG-1002 used jailbroken Claude Code via prompt injection to target 30 entities. Experts liken the challenge to phishing—mitigatable but not solvable—as no technical fix currently separates instruction from data.
Key facts
- OWASP ranks prompt injection as top threat for AI applications.
- OpenAI admitted in Dec 2025 the flaw is 'unlikely to ever be fully solved.'
- Indirect injection hides malicious instructions in web pages or emails.
- Google DeepMind found a 32% increase in prompt injection attacks in 2025-2026.
- HiddenLayer's CopyPasta shows injection can spread like a virus in code.