Prompt injection (direct and indirect)
Crafted user inputs that override system prompts. Indirect injection via documents, websites, emails, calendar invites that your agent ingests. The current top-of-mind threat for any LLM app.
LLM-powered features, agentic systems, MCP integrations, RAG pipelines. We probe prompt injection, model jailbreaks, training-data extraction, tool-use abuse, and supply-chain attacks against the model layer.
Surfaces we test inside an AI security engagement.
Crafted user inputs that override system prompts. Indirect injection via documents, websites, emails, calendar invites that your agent ingests. The current top-of-mind threat for any LLM app.
We probe whether the safety boundary on your model holds against adversarial prompting. Useful when the model has access to sensitive tools or data.
When the model can call functions, search the web, write files, or trigger external APIs, the attack surface widens fast. We map every tool call as an exploit primitive.
Misconfigured MCP servers expose internal data or grant write paths. We audit the tool definitions, the auth model, and the data they return.
For systems that fine-tune on user data or retrieve from external sources, we test whether attacker-controlled inputs corrupt outputs or extract secrets from the index.
Where do your weights come from? Who signs them? What happens if your system prompt is exfiltrated? We map the AI-specific supply chain and the trust boundaries.
How an AI security engagement runs.
AI-security deliverables.
Every attack reproducible. Model version pinned. Includes prompt text, system context, and response. Markdown source on request.
Written, diagrammed. Suitable for adding to your architecture docs and for showing customers who ask "how do you handle prompt injection?".
For agentic systems: every function your model can call, classified by blast radius and recommended hardening.
What to log, what to alert on, what to throttle. AI workloads need different telemetry than classic apps.
A short letter you can give enterprise prospects who want a third-party AI security signal.
Re-run findings on a new model version on request. Catches regression when you swap providers or upgrade.
When teams ask for AI security work.
Enterprise buyers are starting to ask AI-specific questions in security review. A third-party report unblocks deals.
PII, financial records, source code, medical records. Indirect prompt injection turns a friendly assistant into an exfiltration channel.
Once your model can take actions, the failure modes expand. We map them before a customer reports the first one.
MCP servers are increasingly the attack surface in AI-tooled environments. We test them like the APIs they functionally are.
AI security FAQ.
We test the boundary between your application and the model. We do not have access to provider internals. If a finding is a provider-side bug, we coordinate disclosure.
Black-box and gray-box are both options. Black-box mirrors real attacker conditions; gray-box covers more surface in the same budget.
Findings are pinned to model version. On retest we re-run against the new version. Many issues do regress after a model swap.
Yes. Adversarial red-teaming against an LLM-powered product, with detection-signal evaluation, can be scoped as a hybrid engagement.
Fast. Findings have a half-life in months. We recommend at least one re-test or refresh per model swap or major feature change.
A 30-minute call covers your architecture, threat model, and likely top-three findings before we quote.