Gateway & Protection
Input Guardrail
Prompt security and intent detection: validate and sanitize input before orchestration and the model.
Plane
Gateway & Protection
Flow steps
5
Frameworks
OWASP LLM01 · NIST 800-53 · MITRE ATLAS · NIST AI 600-1
Technology
Why use it
Validate, sanitize and qualify the input (prompt, intent) before it reaches the orchestrator or the model.
Why it matters to security
Direct countermeasure to prompt injection (direct and indirect) and jailbreak — the number-one threat to LLM applications.
Implementations Lakera GuardNVIDIA NeMo GuardrailsLlama GuardRebuffPrompt Security
Data and instructions must never blur in the model’s context.
Recommendations by maturity tier
Foundation
Minimum viable baseline
- Schema / length validation and block lists. NIST 800-53 SI-10OWASP LLM01:2025The first filter stops known attacks and off-format inputs.
- Untrusted-content isolation and delimiting (spotlighting). OWASP LLM01:2025Clearly marking data stops the model from executing it as orders.
- Logging of rejected inputs. NIST 800-53 AU-2Blocked attempts reveal ongoing attack campaigns.
Enterprise
Enterprise standard
- Injection and jailbreak classifiers. OWASP LLM01:2025MITRE ATLAS AML.T0051A semantic classifier catches variants that signatures miss.
- Intent and sensitive-content detection. NIST 800-53 SI-10Understanding what the request intends lets you route or block it.
- Strict data / instruction separation. NIST 800-53 AC-4Information flow is controlled so data never becomes a command.
Advanced
High-assurance / regulated
- Adaptive detection and continuous red-teaming. NIST 800-53 SI-4NIST AI 600-1 MS-2.7-007The guardrail is continuously tested against new injection attacks.
- Quarantine and telemetry. NIST 800-53 SI-4 · AU-6Suspicious input is isolated and traced for analysis.
- Classifier re-evaluation on new bypasses. NIST 800-53 CM-3The detection model updates at the pace of bypasses.
Architecture notes
- Indirect injection is the most dangerous.details ▸It comes from retrieved data (RAG, web, email), not from the user.No default trust in retrieved content: it must pass the same guardrail as user input.
References
OWASP LLM01:2025
Prompt Injection — the threat this component handles first.
NIST SP 800-53 Rev5
SI-10 (Input Validation), AC-4 (Information Flow), SI-4 (Monitoring).
MITRE ATLAS — AML.T0051
Prompt injection (Initial Access / hidden-instruction execution).
NIST AI 600-1
MS-2.7-007 — red-teaming of injection attacks.
Abbreviations
PDP
Policy Decision Point
PEP
Policy Enforcement Point
PIP
Policy Information Point
PAP
Policy Administration Point
IdP
Identity Provider
TSS
Token Service
NHI
Non-Human Identity
RBAC
Role-Based Access Control
ABAC
Attribute-Based Access Control
MFA
Multi-Factor Authentication
HITL
Human-in-the-loop
JIT
Just-In-Time
CAE
Continuous Access Evaluation
CAEP
Continuous Access Evaluation Profile
DPoP
Demonstrating Proof-of-Possession
mTLS
mutual TLS
PII
Personally Identifiable Information
KMS
Key Management Service
CI/CD
Continuous Integration / Continuous Delivery
SIEM
Security Information and Event Management
SOAR
Security Orchestration, Automation and Response
SCIM
System for Cross-domain Identity Management
XACML
eXtensible Access Control Markup Language
OPA
Open Policy Agent
OWASP
Open Worldwide Application Security Project
NIST
National Institute of Standards and Technology
ATLAS
Adversarial Threat Landscape for Artificial-Intelligence Systems
LLM
Large Language Model
WAF
Web Application Firewall
CDN
Content Delivery Network
DDoS
Distributed Denial of Service
DLP
Data Loss Prevention
JWT
JSON Web Token
API
Application Programming Interface
CRS
Core Rule Set (OWASP)
RAG
Retrieval-Augmented Generation
MCP
Model Context Protocol
PBAC
Permission-Based Access Control
HSM
Hardware Security Module
UEBA
User and Entity Behavior Analytics
SBOM
Software Bill of Materials
SLSA
Supply-chain Levels for Software Artifacts
WORM
Write Once, Read Many
SPIFFE
Secure Production Identity Framework For Everyone