Where it fits
- Testing a RAG app where external documents can steer the assistant.
- Reviewing an AI agent that can call tools with business impact.
- Comparing OpenAI, Anthropic, Gemini, and self-hosted model behavior under the same attack suite.
Operational steps
- Describe the app surface, model provider, tool list, and sensitive data categories.
- Run vulnerability packs for prompt injection, leakage, jailbreak, unsafe tool calls, and retrieval poisoning.
- Review high-severity findings first, then retest after prompt, policy, and authorization changes.
- Store results as a release artifact so future model or prompt changes can be compared.
Common risks
- A model refuses an unsafe request in isolation but accepts it after tool output is injected.
- A memory feature stores secrets or private internal instructions.
- A self-hosted model behaves differently from the hosted model used in the original security review.
How PromptGuard Scan fits the workflow
PromptGuard Scan provides model-agnostic vulnerability scans with consistent scoring, remediation guidance, and release-gate outputs for security and engineering teams.