What is prompt injection and why is it a security risk?

Prompt injection is an attack technique in which an adversary crafts input that causes an AI model to override its instructions, ignore safety guardrails, or perform actions outside its intended scope. Direct prompt injection targets the model through user-controlled input fields. Indirect prompt injection embeds malicious instructions in external content the model retrieves and processes, such as documents, web pages, or database records. The risk is significant because a successfully injected model may disclose confidential system prompts, exfiltrate data from connected tools, generate harmful content, or take unauthorized actions through agentic capabilities. GDF's AI security testing includes structured prompt injection testing across both direct and indirect attack vectors.

What is the OWASP Top 10 for LLM Applications?

The OWASP Top 10 for LLM Applications (2025) identifies the ten most critical security risks in AI systems built on large language models: prompt injection, sensitive information disclosure, supply chain vulnerabilities, data and model poisoning, improper output handling, excessive agency, system prompt leakage, vector and embedding weaknesses, misinformation, and unbounded consumption. GDF's AI security assessments use the OWASP LLM Top 10 as a primary testing framework, supplemented by GDF's own adversarial testing methodology developed through hands-on experience with production AI deployments.

How does AI red teaming differ from conventional penetration testing?

Conventional penetration testing targets well-defined technical vulnerabilities in software and network components with deterministic behavior. AI red teaming addresses the probabilistic, context-sensitive nature of LLM systems, where the same input can produce different outputs across sessions and where vulnerabilities are often behavioral rather than code-level defects. AI red teaming requires generating adversarial inputs at scale, testing across diverse attack personas and goal states, evaluating model responses against intended behavior specifications, and identifying conditions under which safety measures fail. GDF's AI red team uses both structured attack libraries and iterative adversarial techniques to surface failure modes that standard testing approaches miss.

What compliance frameworks apply to AI security?

The primary frameworks for AI security and risk management include the NIST AI Risk Management Framework (AI RMF), which provides a voluntary framework for identifying, assessing, and managing risks across the AI lifecycle; ISO/IEC 42001, the international standard for AI management systems; and the EU AI Act, which imposes mandatory requirements on high-risk AI systems deployed in European markets. Sector-specific requirements apply in financial services (SR 11-7 model risk management guidance), healthcare (FDA guidance on AI/ML-based software as a medical device), and defense (DoD AI ethics principles and acquisition requirements). GDF's AI security assessments are structured to produce findings that map to these frameworks.

AI SECURITY SERVICES

AI and LLM Security Testing

Organizations deploying large language models, AI-powered applications, and autonomous agents face a new class of security risks that conventional testing methodologies do not address. GDF's certified analysts test AI systems for prompt injection, data extraction, model manipulation, guardrail bypass, and the full range of vulnerabilities identified in the OWASP Top 10 for LLM Applications, producing findings that support both technical remediation and organizational risk decisions.

AI Security Risks in Production Deployments

The deployment of large language model applications in production environments has introduced security risks that differ fundamentally from those addressed by traditional application security programs. An LLM application is not merely software with inputs and outputs. It is a probabilistic reasoning system whose behavior depends on training data, system prompt configuration, retrieval context, tool integrations, and the cumulative effect of conversation history. Adversaries who understand these dependencies can manipulate model behavior in ways that bypass conventional access controls, extract sensitive information, and in agentic deployments, trigger unauthorized actions against connected systems.

The attack surface of an LLM application includes every channel through which external content reaches the model. Direct user input is the most visible channel, but it is far from the only one. Retrieval-augmented generation (RAG) systems ingest documents, database records, and web content that may contain adversarially crafted instructions. Autonomous agents that browse the web, read email, or process uploaded files are exposed to indirect prompt injection through any content they consume. Multi-model pipelines introduce additional trust boundaries where one model's output becomes another's input, creating potential for cross-model injection and output handling failures.

The consequences of AI security failures range from confidentiality violations to operational disruption. System prompt extraction exposes the instructions and guardrails an organization has invested in building. Data extraction through carefully constructed prompts can produce customer records, internal documents, or credentials stored in the model's context window or connected data stores. In agentic deployments with tool access, a compromised model may send unauthorized API calls, modify data, or exfiltrate information through channels the application was designed to use for legitimate purposes. These are not theoretical risks: documented incidents have demonstrated prompt injection attacks against production AI assistants, customer service chatbots, and coding tools used in enterprise environments.

GDF's AI security testing addresses this threat surface with a methodology built specifically for LLM systems. For organizations designing security architecture into AI deployments from the ground up, see GDF's AI security architecture services.

OWASP Top 10 for LLM Applications

The OWASP Top 10 for LLM Applications (2025) provides the primary framework for GDF's AI security assessments. Each category represents a distinct class of risk that GDF tests for in client AI deployments.

LLM01: Prompt Injection. The most prevalent AI vulnerability class, prompt injection occurs when adversarial input causes the model to override its instructions or safety measures. GDF tests for direct injection through user input interfaces, indirect injection through external data sources consumed by the model, and multi-turn injection techniques that establish false context across a conversation to enable later exploitation.

LLM02: Sensitive Information Disclosure. LLMs may disclose training data, system prompt contents, user data from other sessions, or information from connected data stores in response to crafted queries. GDF tests for system prompt extraction, training data memorization disclosure, cross-user data leakage in multi-tenant deployments, and retrieval of credentials or PII from connected systems.

LLM03: Supply Chain Vulnerabilities. AI applications depend on base models, fine-tuned models, embedding models, and third-party plugins that may contain security weaknesses or have been tampered with. GDF assesses the provenance and integrity of AI components in the deployment pipeline, including model weights, tokenizers, and inference infrastructure.

LLM04: Data and Model Poisoning. Training data poisoning introduces biases or vulnerabilities at the model level that persist across all uses. Fine-tuning data poisoning can insert backdoors into model behavior that are triggered by specific inputs. GDF assesses data pipeline security controls and tests for behavioral anomalies consistent with poisoning attacks.

LLM05: Improper Output Handling. When LLM outputs are passed directly to downstream systems without validation, injection attacks can propagate through the AI layer into SQL queries, shell commands, HTML rendering contexts, and API calls. GDF tests for output handling failures by analyzing how application code processes and uses model responses.

LLM06: Excessive Agency. Agentic AI systems given broad tool access and permissions present elevated risk when their actions cannot be constrained to intended scope. GDF assesses the principle of least privilege applied to AI agent tool access, tests for unauthorized action sequences achievable through prompt manipulation, and reviews human-in-the-loop controls on high-impact operations.

LLM07: System Prompt Leakage. System prompts contain confidential instructions, business logic, security guardrails, and sometimes credentials. GDF tests whether these can be extracted through direct requests, indirect techniques, or by analyzing the model's behavioral responses to probing queries designed to reveal prompt structure.

LLM08: Vector and Embedding Weaknesses. Vector databases used in RAG implementations present risks including cross-context data leakage, embedding inversion attacks, and retrieval manipulation. GDF tests vector database access controls, namespace isolation, and the accuracy of similarity search in returning only authorized content to requesting users.

LLM09: Misinformation. AI systems that generate authoritative-sounding but factually incorrect information in high-stakes contexts, such as legal, medical, financial, or technical domains, present organizational and legal risk. GDF assesses accuracy guardrails, hallucination rates on domain-specific queries, and the adequacy of human review processes for AI-generated content.

LLM10: Unbounded Consumption. AI APIs exposed without adequate rate limiting, token budgets, or abuse controls are vulnerable to resource exhaustion attacks and cost amplification. GDF tests authentication, rate limiting, and cost controls on AI API endpoints to identify conditions that could result in service disruption or significant financial impact.

GDF's AI Security Testing Methodology

GDF's AI security testing engagements are structured in four phases: scoping and architecture review, automated adversarial testing, manual red team testing, and reporting.

The scoping phase maps the AI application's architecture: model selection, system prompt structure, tool and API integrations, data sources accessed during inference, multi-model pipeline topology, and user access patterns. This mapping produces a threat model identifying the highest-priority attack surfaces before testing begins. For RAG-enabled applications, GDF documents the vector database configuration, retrieval logic, and data ingestion pipeline. For agentic systems, GDF catalogs available tools, permission scopes, and action approval workflows.

Automated adversarial testing uses GDF's library of structured attack payloads mapped to the OWASP LLM Top 10 categories. Automated testing covers high-volume prompt injection attempts across diverse injection formats, including instruction override patterns, role-play jailbreaks, encoding-based obfuscation, and context-switching techniques. This phase also includes automated extraction attempts targeting system prompt disclosure, training data memorization, and cross-user data leakage. Automated results are logged and triaged before manual testing begins.

The manual red team phase applies human judgment to develop attack scenarios specific to the client's application context. GDF's analysts design injection payloads tailored to the application's domain, user base, and connected tools. For agentic systems, analysts attempt to construct multi-step attack chains that use legitimate tool calls to achieve unauthorized outcomes. This phase includes testing of indirect injection vectors: if the application processes uploaded documents, retrieves web content, or reads email, GDF tests whether adversarially crafted external content can manipulate model behavior.

RAG security testing examines the retrieval layer specifically. GDF tests whether a user can craft queries that return content from other users' namespaces, whether embedding similarity search can be manipulated to retrieve records it should not, and whether the ingestion pipeline applies adequate sanitization to documents before they are indexed. Cross-context leakage testing verifies that the retrieval system enforces the authorization model intended by the application design.

Guardrail bypass testing assesses the robustness of content filtering, topic restriction, and output validation controls. GDF uses both direct bypass techniques and indirect methods: establishing false premises across a multi-turn conversation, using legitimate-seeming framing to approach restricted topics, and exploiting inconsistencies in how guardrails apply across different input formats or languages.

AI Red Teaming

AI red teaming is the structured simulation of adversarial attacks against an AI system by a team operating with attacker mindset and techniques. It differs from security testing checklists in that it is goal-directed: the red team defines realistic attacker objectives, develops attack strategies, and iterates against defenses, generating findings that reflect what a determined adversary would actually accomplish.

GDF's AI red team exercises are designed around specific threat actors and objectives relevant to the client's deployment context. For a customer-facing chatbot, the relevant threats include attempts to extract internal business data, generate harmful content to damage brand reputation, and manipulate the chatbot into providing unauthorized discounts or access. For an internal AI assistant with document access, the relevant threats include data exfiltration through the AI interface, prompt injection through documents uploaded by malicious insiders, and system prompt extraction that reveals security control details. For agentic systems with code execution or API access, the relevant threats include lateral movement, privilege escalation, and unauthorized data modification.

The red team exercises follow a defined kill chain: reconnaissance to understand the model's capabilities and constraints, initial access through prompt injection or guardrail bypass, exploitation to achieve the target objective, and documentation of the full attack path. Each successful attack chain is documented with the exact prompts used, the model responses at each step, and the resulting impact. This documentation supports both technical remediation and the risk assessment that security and legal teams need to evaluate the findings.

GDF also conducts adversarial robustness testing for AI models used in security-relevant classification tasks: content moderation systems, fraud detection models, identity verification systems, and malware detection classifiers. These tests apply adversarial perturbation techniques to evaluate whether the model's classification decisions can be manipulated by an adversary who controls the input.

AI Security Compliance

AI security assessment findings from GDF are structured to support organizational compliance with applicable frameworks and regulatory requirements.

The NIST AI Risk Management Framework (AI RMF) organizes AI risk management around four functions: Govern, Map, Measure, and Manage. GDF's AI security testing directly supports the Measure function, providing empirical evidence about the AI system's actual risk profile. Testing findings also inform the Manage function by identifying the specific technical controls and operational changes needed to address identified risks. GDF's reports include a mapping of findings to NIST AI RMF subcategories to facilitate compliance documentation.

ISO/IEC 42001 is the international standard for AI management systems, establishing requirements for the responsible development and use of AI. GDF's testing supports ISO 42001 compliance by providing objective evidence of AI system security controls for management system audits and assessments. For organizations pursuing ISO 42001 certification, GDF's assessment findings serve as input to the required risk assessment and treatment process.

The EU AI Act imposes requirements on high-risk AI systems deployed in EU markets, including mandatory cybersecurity requirements, logging and monitoring obligations, and conformity assessment procedures. GDF's AI security assessments address the technical security requirements applicable to high-risk AI systems and produce documentation structured to support conformity assessment.

Sector-specific requirements also apply to AI deployments in regulated industries. Financial institutions subject to model risk management guidance (SR 11-7) must validate AI model security as part of model validation. Healthcare organizations using AI/ML-based software as a medical device must comply with FDA cybersecurity guidance. Government contractors using AI must meet evolving federal AI acquisition and security requirements. GDF's analysts are familiar with these sector-specific obligations and structure AI security assessments to address them where applicable.

GDF serves organizations deploying AI systems across financial services, healthcare, legal, government, and technology sectors. Our analysts hold relevant security certifications and maintain current expertise in AI security through ongoing research and practice. For a confidential consultation on AI security testing for your organization, contact GDF at 1-800-868-8189.

Testing Coverage

Prompt Injection Testing
System Prompt Extraction
Guardrail Bypass Testing
RAG Security Testing
Data Extraction Testing
Indirect Injection
Agentic AI Red Teaming
OWASP LLM Top 10
Multi-Model Pipeline Testing
Vector Database Security
Embedding Integrity Testing
Output Handling Validation
Model Supply Chain Review
NIST AI RMF Mapping
ISO 42001 Support
EU AI Act Readiness

Last updated: April 14, 2026

OWASP LLM Top 10 Coverage

GDF's AI security assessments test against all ten vulnerability categories in the OWASP Top 10 for LLM Applications, providing structured coverage of the full AI attack surface.

RAG and Retrieval Security

GDF tests vector database access controls, namespace isolation, embedding integrity, and cross-context leakage in retrieval-augmented generation deployments.

Agentic System Testing

For AI agents with tool access, GDF tests permission scopes, multi-step attack chains, and the controls that constrain autonomous AI actions to intended scope.

Compliance Documentation

Assessment findings are mapped to NIST AI RMF, ISO 42001, and EU AI Act requirements, supporting compliance documentation and regulatory submissions.

Request a Consultation

All consultations are strictly confidential. GDF works with security and development teams to design AI security assessments matched to your deployment architecture.

Request Consultation Call 1-800-868-8189

AI and LLM Security Testing

AI Security Risks in Production Deployments

OWASP Top 10 for LLM Applications

GDF's AI Security Testing Methodology

AI Red Teaming

AI Security Compliance

Testing Coverage

OWASP LLM Top 10 Coverage

RAG and Retrieval Security

Agentic System Testing

Compliance Documentation

Request a Consultation

Related Services

Cybersecurity Services

Source Code Review

Vulnerability Assessment

AI Security Architecture

AI Systems Require Purpose-Built Security Testing