Prompt Injection

Definition: Prompt injection is a security vulnerability where someone manipulates an AI system’s instructions to make it reveal confidential data or perform unintended actions. It’s like tricking the model into ignoring its normal rules.

Example

A lawyer uploads a client document to an AI tool for analysis. A malicious prompt hidden in another uploaded file tells the AI to disclose that document’s contents. Without protection, the AI might obey the hidden instruction.

Why It Matters?

Prompt injection is a growing cybersecurity risk for firms that use AI in sensitive workflows. Legal data is often confidential, and a single manipulated input could expose privileged information or create compliance breaches.

How to Safeguard?

Use tools with built-in prompt filtering and output validation. Train staff to recognize risky instructions. Keep AI systems sandboxed so they can’t access private or networked data without authorization. Regularly test your prompts for vulnerabilities just as you would test software for bugs.

=> Return to Glossary