Agent Guardrail
The agent’s guardrail constrains, in real time, the actions and tool calls that the orchestrator decides on.
Plane: Orchestration
Flow steps: 6
Frameworks: OWASP LLM06 · NIST 800-53 · MITRE ATLAS
Technology
Why use it
Enforce action rules on the agent at execution time, regardless of what the model “decides”.
Why it matters to security
Last barrier before an agent action touches a tool or data: it enforces least privilege action by action.
Implementations: NVIDIA NeMo Guardrails · Guardrails AI · Invariant · custom PEP rules
Don’t trust the model to self-limit: constrain it from outside.
Recommendations by maturity tier
Foundation
Minimum viable baseline
- Tool-call validation against policy (NIST 800-53 AC-3). Every tool call is checked before execution.
- Validated and sanitized tool parameters (NIST 800-53 SI-10). An unvalidated tool argument is a latent injection.
- Logging of every action (NIST 800-53 AU-2 · AU-12). Every agent action must be attributable.
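The three baseline controls above can be sketched as a single check that runs before any tool executes. This is a minimal illustration, not the API of any product listed under Implementations: the `POLICY` table, the tool names, the ticket-ID pattern, and `check_tool_call` are all hypothetical.

```python
import json
import logging
import re
from typing import Any, Callable, Optional

logger = logging.getLogger("agent.guardrail")


class PolicyViolation(Exception):
    """Raised when a tool call is rejected before execution."""


# Hypothetical policy: tool name -> {parameter name -> validator}.
# A tool that is not listed here is denied by default.
POLICY: dict[str, dict[str, Callable[[Any], bool]]] = {
    "read_ticket": {
        "ticket_id": lambda v: isinstance(v, str)
        and re.fullmatch(r"[A-Z]{2,5}-\d{1,6}", v) is not None,
    },
    "search_docs": {
        "query": lambda v: isinstance(v, str) and 0 < len(v) <= 200,
    },
}


def _violation(tool: str, args: dict[str, Any]) -> Optional[str]:
    """Return the reason a call breaks the policy, or None if it is acceptable."""
    validators = POLICY.get(tool)
    if validators is None:
        return f"tool {tool!r} is not in the policy"
    if set(args) != set(validators):
        return "unexpected or missing parameters"
    for name, is_valid in validators.items():
        if not is_valid(args[name]):
            return f"parameter {name!r} failed validation"
    return None


def check_tool_call(agent_id: str, tool: str, args: dict[str, Any]) -> None:
    """Validate one tool call and log the decision, whether allowed or denied."""
    reason = _violation(tool, args)
    # Log parameter names only, so sensitive values stay out of the audit trail.
    logger.info(json.dumps({
        "agent": agent_id,
        "tool": tool,
        "params": sorted(args),
        "decision": "deny" if reason else "allow",
        "reason": reason,
    }))
    if reason:
        raise PolicyViolation(reason)
```

Under these assumptions, `check_tool_call("agent-1", "read_ticket", {"ticket_id": "OPS-42"})` returns normally, while an unlisted tool or a malformed `ticket_id` raises `PolicyViolation`. In both cases a log record is emitted, so the decision stays attributable to the agent.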
Enterprise
Enterprise standard
- Least privilege enforced per action (NIST 800-53 AC-6 · OWASP LLM06:2025). Rights are granted to the action, not to the agent wholesale.
- Confirmation required for high-impact actions (NIST 800-53 AC-3). Delete, pay, external send: demand an extra guarantee.
- Frequency and scope caps (NIST 800-53 SC-5). Bounding frequency limits the scale of abuse.
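One way to combine the three enterprise controls is a single authorization function that checks the per-action grant, then the frequency cap, then the confirmation requirement. The sketch below is illustrative only: the grant table, the set of high-impact actions, the cap of 5 calls per 60 seconds, and the `authorize` function are assumptions, not values from any framework.

```python
import time
from collections import defaultdict, deque
from typing import Optional

# Hypothetical grants: rights attach to (agent, action) pairs,
# never to the agent as a whole.
GRANTS = {
    ("support-bot", "read_ticket"),
    ("support-bot", "send_external_email"),
}

# Assumed list of actions that need an extra guarantee before running.
HIGH_IMPACT = {"delete_record", "make_payment", "send_external_email"}

# Assumed cap: at most MAX_CALLS allowed calls per (agent, action) per window.
MAX_CALLS = 5
WINDOW_SECONDS = 60.0

_history: dict[tuple[str, str], deque] = defaultdict(deque)


def authorize(agent: str, action: str, *, confirmed: bool = False,
              now: Optional[float] = None) -> str:
    """Return 'allow', 'deny', or 'needs_confirmation' for one action."""
    now = time.monotonic() if now is None else now

    # 1. Least privilege: the specific action must be granted.
    if (agent, action) not in GRANTS:
        return "deny"

    # 2. Frequency cap: forget calls older than the window, then count.
    calls = _history[(agent, action)]
    while calls and now - calls[0] >= WINDOW_SECONDS:
        calls.popleft()
    if len(calls) >= MAX_CALLS:
        return "deny"

    # 3. High-impact actions wait for an out-of-band confirmation.
    if action in HIGH_IMPACT and not confirmed:
        return "needs_confirmation"

    calls.append(now)
    return "allow"
```

The `confirmed` flag stands in for whatever extra guarantee the deployment uses (human approval, step-up authentication); it should be set by the enforcement layer, never by the model's own output.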
Advanced
High-assurance / regulated
- Action decision delegated to the PDP in context (NIST 800-53 AC-24). Permission to act depends on current risk, not a frozen right.
- Detection and blocking of dangerous sequences (NIST 800-53 SI-4 · AC-4). Individually harmless actions are blocked once combined.
- Agent quarantine on anomaly (NIST 800-53 AC-12). A suspicious agent is suspended automatically.
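The advanced tier can be illustrated with a small in-process stand-in for a PDP call. Everything here is an assumption made for the example: the sequence list, the 0.0 to 1.0 risk scale, the 0.7 threshold, and the `decide` function. In a real deployment the decision would normally come from an external policy engine fed with live context.

```python
from dataclasses import dataclass, field
from typing import Iterable, Sequence

# Hypothetical dangerous sequences: each step is harmless on its own.
DANGEROUS_SEQUENCES = [
    ("read_customer_data", "send_external_email"),
    ("read_secrets", "http_post"),
]

# Assumed risk scale: 0.0 (benign) to 1.0 (hostile).
RISK_THRESHOLD = 0.7


@dataclass
class AgentSession:
    agent: str
    history: list[str] = field(default_factory=list)
    quarantined: bool = False


def _is_subsequence(pattern: Sequence[str], actions: Iterable[str]) -> bool:
    """True if the pattern's steps occur in order within actions (gaps allowed)."""
    remaining = iter(actions)
    return all(step in remaining for step in pattern)


def decide(session: AgentSession, action: str, risk_score: float) -> str:
    """Decide from the current context rather than from a static right."""
    # A quarantined agent stays suspended until someone releases it.
    if session.quarantined:
        return "deny"

    # Anomaly: suspend the agent automatically.
    if risk_score >= RISK_THRESHOLD:
        session.quarantined = True
        return "deny"

    # Block the action that would complete a dangerous sequence.
    candidate = session.history + [action]
    if any(_is_subsequence(p, candidate) for p in DANGEROUS_SEQUENCES):
        return "deny"

    session.history.append(action)
    return "allow"
```

In this sketch `send_external_email` is allowed in a fresh session but denied once `read_customer_data` appears earlier in the same session's history, which is the "blocked once combined" behavior described above.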
Architecture notes
- Make the dangerous action impossible, not merely tedious. A confirmation can be bypassed; an absent capability cannot. Prefer removing the dangerous tool from the agent over adding friction.
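As a rough sketch of the note above, capability removal can be expressed as building the agent's toolset from an explicit allowlist, so a dangerous tool is never registered at all. The tool functions, `build_toolset`, and `dispatch` are hypothetical names used only for illustration.

```python
from typing import Callable


def read_ticket(ticket_id: str) -> str:
    return f"ticket {ticket_id}"


def delete_ticket(ticket_id: str) -> str:
    return f"deleted {ticket_id}"


# Everything the platform could offer (hypothetical catalogue).
ALL_TOOLS: dict[str, Callable[..., str]] = {
    "read_ticket": read_ticket,
    "delete_ticket": delete_ticket,
}


def build_toolset(allowed: set[str]) -> dict[str, Callable[..., str]]:
    """Expose only the named tools; the rest are never handed to the agent."""
    return {name: fn for name, fn in ALL_TOOLS.items() if name in allowed}


def dispatch(toolset: dict[str, Callable[..., str]], name: str, **kwargs: str) -> str:
    """Run a tool from the agent's own toolset; absent tools cannot be reached."""
    if name not in toolset:
        raise LookupError(f"unknown tool: {name}")
    return toolset[name](**kwargs)


# The support agent simply has no delete capability to misuse.
support_tools = build_toolset({"read_ticket"})
```

With this design a prompt-injected request for `delete_ticket` fails with `LookupError` because the capability does not exist for that agent, rather than relying on a confirmation step holding up.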
References
OWASP LLM06:2025
Excessive Agency — the guardrail enforces the limit at execution.
NIST SP 800-53 Rev5
AC-3, AC-4, AC-6, AC-12, AC-24, AU-2, AU-12, SC-5, SI-4, SI-10.
MITRE ATLAS — AML.T0051
Prompt injection leading to unintended agent actions.
Abbreviations
PDP
Policy Decision Point
PEP
Policy Enforcement Point
PIP
Policy Information Point
PAP
Policy Administration Point
IdP
Identity Provider
TSS
Token Service
NHI
Non-Human Identity
RBAC
Role-Based Access Control
ABAC
Attribute-Based Access Control
MFA
Multi-Factor Authentication
HITL
Human-in-the-loop
JIT
Just-In-Time
CAE
Continuous Access Evaluation
CAEP
Continuous Access Evaluation Profile
DPoP
Demonstrating Proof-of-Possession
mTLS
mutual TLS
PII
Personally Identifiable Information
KMS
Key Management Service
CI/CD
Continuous Integration / Continuous Delivery
SIEM
Security Information and Event Management
SOAR
Security Orchestration, Automation and Response
SCIM
System for Cross-domain Identity Management
XACML
eXtensible Access Control Markup Language
OPA
Open Policy Agent
OWASP
Open Worldwide Application Security Project
NIST
National Institute of Standards and Technology
ATLAS
Adversarial Threat Landscape for Artificial-Intelligence Systems
LLM
Large Language Model
WAF
Web Application Firewall
CDN
Content Delivery Network
DDoS
Distributed Denial of Service
DLP
Data Loss Prevention
JWT
JSON Web Token
API
Application Programming Interface
CRS
Core Rule Set (OWASP)
RAG
Retrieval-Augmented Generation
MCP
Model Context Protocol
PBAC
Policy-Based Access Control
HSM
Hardware Security Module
UEBA
User and Entity Behavior Analytics
SBOM
Software Bill of Materials
SLSA
Supply-chain Levels for Software Artifacts
WORM
Write Once, Read Many
SPIFFE
Secure Production Identity Framework For Everyone