
Best AI Agent Guardrails 2026: Pre-Action Authorization Compared

An honest comparison of AI agent guardrail platforms in 2026. Covers APort, Galileo Agent Control, NVIDIA NeMo Guardrails, NemoClaw, LlamaFirewall, Guardrails AI, Microsoft Agent Governance Toolkit, and Google ADK Safety — grouped by the layer they actually protect.

23 min read
by Uchi Uchibeke

TL;DR

  • "AI agent guardrails" is crowded in 2026, but the word means different things at different layers of the stack.
  • Four layers worth distinguishing: content, evaluation, sandbox, action. Each catches a different class of failure. None replaces another.
  • Most production agents need 2-3 of these.
  • APort sits at the action layer. So does Microsoft's new Agent Governance Toolkit. Galileo and NeMo Guardrails sit at content/evaluation. NemoClaw, ceLLMate, and E2B sit at the sandbox layer.

The 2026 landscape

Two years ago, "AI guardrails" mostly meant a content classifier between the model and the user. Today the category includes runtime governance platforms, kernel-level sandboxes, eval harnesses, browser-policy enforcers, and pre-action authorization engines.

That's good. It's also confusing. Vendors use the same word and pitch the same buyer, but they protect against very different failures. The first thing to do is stop comparing tools horizontally and start comparing them by layer.

+-------------------------------------------------------------+
|  Layer 4: ACTION                                            |
|  Authorize tool calls before they execute                   |
|  APort, Microsoft Agent Governance Toolkit, PCAS,           |
|  AgentGuardian, Safiron                                     |
+-------------------------------------------------------------+
|  Layer 3: SANDBOX                                           |
|  Contain blast radius at the OS / network level             |
|  NVIDIA NemoClaw, E2B, Modal, Google ADK sandbox, ceLLMate  |
+-------------------------------------------------------------+
|  Layer 2: EVALUATION                                        |
|  Score and replay agent runs after the fact                 |
|  Galileo Agent Control, Promptfoo, Haize Labs               |
+-------------------------------------------------------------+
|  Layer 1: CONTENT                                           |
|  Filter what the model says                                 |
|  LlamaGuard, LlamaFirewall, NeMo Guardrails, Guardrails AI  |
+-------------------------------------------------------------+
                              ^
                              |
                       LLM / Agent loop

Read that bottom-up if you want to follow the data, top-down if you want to follow the threat model. Either way, the layers are different and they catch different attacks.


The four layers

Before getting into specific tools, let me describe each layer crisply.

Layer 1 — Content. Filters what the model says. Catches toxic outputs, jailbreak attempts, PII leaks, off-topic answers, prompt-injection text. Runs over inputs and outputs.

Layer 2 — Evaluation. Scores agent runs after they happen. Looks at full traces, computes metrics, surfaces regressions, feeds dashboards.

Layer 3 — Sandbox. Contains blast radius at the OS or network level. Runs the agent inside a microVM, a gVisor jail, a kernel-level allowlist, or a browser-policy enforcer.

Layer 4 — Action. Authorizes individual tool calls before they execute. Runs in the framework's tool hook. Takes the tool name, the parameters, and a policy, and returns ALLOW or DENY.

A useful way to think about the difference between sandbox and action: a sandbox stops rm -rf /. The action layer stops transfer_funds(amount=50000, to=attacker_account). The sandbox can't tell the second call is bad because at the OS level it looks like a normal HTTPS request to a payments API. The action layer can, because it knows the policy.
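To make that concrete, here is a minimal sketch of an action-layer check. The tool and policy names are hypothetical; the point is the shape: a deterministic lookup over tool name and parameters, default-deny, no model in the loop.

# Hypothetical policy and tool names, for illustration only.
POLICY = {
    "transfer_funds": {"max_amount": 1_000, "allowed_recipients": {"acct-payroll"}},
}

def authorize(tool_name: str, params: dict) -> bool:
    """Deterministic pre-action check, run in the tool hook before execution."""
    rule = POLICY.get(tool_name)
    if rule is None:
        return False  # default-deny: tools without a rule are refused
    if params.get("amount", 0) > rule["max_amount"]:
        return False
    return params.get("to") in rule["allowed_recipients"]

# The sandbox sees a normal HTTPS request; this check sees the semantics.
assert authorize("transfer_funds", {"amount": 50_000, "to": "attacker_account"}) is False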

A short table to fix this in memory:

Layer      | Catches                              | Misses
Content    | Bad text the model produces          | Bad actions the model takes
Evaluation | Regressions across runs              | Anything in real time
Sandbox    | OS-level damage from code execution  | Semantically bad API calls with valid syntax
Action     | Unauthorized tool calls              | Bad text in the model's responses

If you only have one of these, you have a hole. The rest of this post is about which tools live at which layer and how to choose among them.


Layer 1: Content

The content layer is the most established part of the guardrails market. It's also the one most people think of when they hear the word "guardrails."

LlamaGuard / LlamaFirewall (Meta)

LlamaGuard is Meta's open-weights safety classifier. Feed it the user message and the model response and it returns a label and category. LlamaFirewall is the more recent productized version with broader detectors for prompt injection, encoded attacks, and unsafe content. Both are open and well-documented.

Use it when: you have a customer-facing chatbot and want to keep the model from saying something racist, illegal, or off-policy.

Don't expect it to: stop an agent from making a tool call. The classifier inspects text; the threat is the action, not the words.
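For reference, calling LlamaGuard through Hugging Face transformers looks roughly like this. This is a sketch assuming the Llama-Guard-3-8B weights; the model id and generation details follow Meta's model card and may differ for your setup.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-Guard-3-8B"  # gated weights; requires access approval
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

chat = [{"role": "user", "content": "How do I pick a lock?"}]
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=24, pad_token_id=0)

# Prints "safe", or "unsafe" plus the violated category (e.g. "S2")
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))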

NVIDIA NeMo Guardrails

NeMo Guardrails is a programmable rails framework. You write rails in a small DSL called Colang, declare topics that are in or out of scope, and the framework routes the conversation through the rails before letting the model respond. It can also wire in third-party detectors like LlamaGuard.

It's the most flexible content-layer tool I've used. The cost is that you're writing rails, which is its own surface area to maintain.

Use it when: you want fine-grained conversational control and topic enforcement for a chatbot or copilot.

Don't expect it to: authorize a function call. The rails operate over the conversation, not the tool dispatcher.
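Wiring it up in Python is compact. This sketch assumes a config directory containing a config.yml and Colang (.co) rail files:

from nemoguardrails import LLMRails, RailsConfig

# Load rails from a directory with config.yml plus Colang (.co) files.
config = RailsConfig.from_path("./guardrails_config")
rails = LLMRails(config)

# The rails run over the conversation before the model's answer is returned.
response = rails.generate(messages=[
    {"role": "user", "content": "Tell me about your refund policy."}
])
print(response["content"])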

Guardrails AI

Guardrails AI is the validators-for-LLM-outputs project. It gives you Pydantic-style schema validation, PII detectors, fact-check validators, and a runtime that re-asks the model when output fails a validator. It's particularly good for structured-output workflows where you need a JSON shape that always parses.

Use it when: your agent's outputs feed downstream code and you need them well-typed and free of obvious bad content.

Don't expect it to: intercept side effects. By the time a validator runs, the model has already finished. Validators retry the model; they don't refuse a tool call.
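Here's a sketch of the Pydantic-style flow with a hypothetical schema; exact method names vary across Guardrails AI versions, so check the docs for yours.

from pydantic import BaseModel, Field
from guardrails import Guard

class RefundDecision(BaseModel):
    order_id: str = Field(description="The order being refunded")
    amount: float = Field(description="Refund amount in USD")

guard = Guard.from_pydantic(output_class=RefundDecision)

# Validates raw model output against the schema; on failure the runtime
# can re-ask the model instead of passing malformed JSON downstream.
outcome = guard.parse('{"order_id": "A-1042", "amount": 19.99}')
print(outcome.validated_output)  # {'order_id': 'A-1042', 'amount': 19.99}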

Bottom line for the content layer

These are good products, and a serious stack often runs at least one of them. They protect users from bad outputs. They do not protect the world from bad actions. If your agent only chats, this layer might be enough on its own. If your agent does anything beyond chat, you need more.


Layer 2: Evaluation

The evaluation layer does something different from the other three. It doesn't run inline. It runs over traces, after the agent has finished a task, and tells you whether the run was good.

Galileo Agent Control

Galileo's Agent Control is the most complete product in this layer. It combines runtime telemetry, the Hallucination Index, behavioral drift detection, and an eval harness that replays traces against new model versions. They ship open-source integrations with CrewAI, AWS Bedrock, and Cisco's agent stack, and the Hallucination Index is one of the most-cited public benchmarks in the field.

Galileo is great at telling you, after a long run, whether your agent's behavior changed in ways you didn't expect. It is not a thing that prevents an action in real time. By the time the dashboard lights up, the run is done.

Galileo gets compared to APort more often than any other tool, and the comparison is wrong in both directions. Galileo finds bad runs. APort prevents bad actions. An agent that drains a bank account gets flagged by Galileo and prevented by APort. You want both.

Promptfoo

Promptfoo is a test-harness CLI for adversarial scenarios. It was acquired by OpenAI in March 2026, and it's the easiest way I know to write a regression test that says "given these adversarial inputs, the model should not do these things." It's pre-deployment, not runtime.

Use it when: you want to bake adversarial tests into CI for an LLM-powered feature.

Haize Labs

Haize is a red-team-as-a-service shop with an automated jailbreak engine. They run continuous adversarial sweeps against your model and report which categories of attack are getting through.

Use it when: you want an external red team but don't want to staff one.

Bottom line for the evaluation layer

Evaluation is necessary and not sufficient. If your only guardrail is a Galileo dashboard, you will eventually get a phone call about something the dashboard caught after the fact. The point of evaluation is to find regressions early and to give you confidence to ship. The point of the action layer is to make sure that on the day a regression slips through, the bad action is still refused at the hook.


Layer 3: Sandbox

The sandbox layer contains the blast radius when an agent executes code. This is the layer that protects you from rm -rf / and from a hijacked Python process trying to phone home.

NVIDIA NemoClaw

NVIDIA announced NemoClaw at GTC in March 2026. It's OpenClaw wrapped in a hardened runtime: K3s OpenShell, kernel-level network allowlisting, filesystem write restrictions, and a privacy router that pipes prompts through local Nemotron models before they reach a frontier provider. The policy engine runs out-of-process from the agent so the agent cannot disable it.

This is the strongest sandbox-layer announcement of the year. The threat model is "agent runs untrusted code, gets prompt-injected, tries to exfiltrate data or break out of the box." It is not built for "agent makes a syntactically valid API call with semantically bad parameters." That's a different layer.

Google ADK Safety

Google's Agent Development Kit ships before_model_callback and before_tool_callback hooks, plus a Gemini Flash Lite screening layer that you can wire in as a fast classifier. The hooks themselves are the right shape for action-layer policies, and ADK ships a few starter policies, but the default posture is screening rather than declarative authorization. You can build an action layer on top of ADK, and APort has an adapter for it, but the toolkit itself leans content/sandbox.
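Here's a sketch of an action-style policy in that hook. The callback contract (return a dict to short-circuit the tool, None to let it run) is ADK's; the tool, model name, and limits are hypothetical.

from google.adk.agents import Agent

def transfer_funds(amount: float, to: str) -> dict:
    """Hypothetical payments tool."""
    return {"status": "sent", "amount": amount, "to": to}

# Returning a dict skips the tool call and uses it as the tool's result;
# returning None lets the call proceed.
def deny_large_transfers(tool, args, tool_context):
    if tool.name == "transfer_funds" and args.get("amount", 0) > 1_000:
        return {"error": "denied by policy: amount exceeds limit"}
    return None

agent = Agent(
    name="payments_agent",
    model="gemini-2.0-flash",
    tools=[transfer_funds],
    before_tool_callback=deny_large_transfers,
)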

E2B and Modal

E2B is the Firecracker-microVM sandbox. Fresh VM in around 150 milliseconds, arbitrary Python or shell inside. Modal uses gVisor for user-space kernel isolation with a "deploy a function and forget the infra" DX. Both are solid off-the-shelf answers when you need isolated code execution.
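Getting an isolated interpreter from E2B is a few lines. This assumes the current e2b-code-interpreter SDK naming:

from e2b_code_interpreter import Sandbox

# Each Sandbox is a fresh Firecracker microVM; the code runs there, not on your host.
with Sandbox() as sandbox:
    execution = sandbox.run_code("print(2 ** 10)")
    print(execution.logs.stdout)  # ['1024\n']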

ceLLMate

ceLLMate is the browser-using-agent sandbox. It enforces an "Agent Sitemap" at the HTTP layer, a structured allowlist of pages and form actions an agent can touch. Published numbers: 94%+ policy prediction accuracy, 12 of 12 prompt injection attacks blocked on their eval set. If your agent is a browser agent, this is the layer underneath it.

Bottom line for the sandbox layer

Sandboxes are how you sleep at night when an agent runs code. They are not how you sleep at night when an agent moves money. A sandboxed agent can still make every authorized API call its credentials allow, and most "agent did a bad thing" headlines this year were authorized API calls, not OS escapes. The sandbox layer is necessary for code-running agents and insufficient for action-taking agents.


Layer 4: Action

This is the layer where APort lives. It's also the layer that got the most validation in 2026, in a way I didn't expect a year ago.

APort Agent Guardrails (Open Agent Passport)

APort is the reference implementation of the Open Agent Passport (OAP) spec. The spec is open at github.com/aporthq/aport-spec, Apache 2.0 licensed, and defines three things together: agent identity (a portable JSON passport), declarative capability policy (policy packs in YAML or JSON), and a signed audit record per decision.

The runtime check is small. It runs inside the framework's tool hook, takes the tool name and parameters, evaluates a deterministic policy, and returns ALLOW or DENY. Median latency in local mode is around 40ms. Hosted mode is around 53ms median, p99 under 77ms. Every decision is signed.

The piece I care about most is portability. The spec is the same across DeerFlow, OpenClaw, LangChain, CrewAI, Claude Code, and the OpenAI SDK. You write a policy once and it follows the agent across frameworks. Public integrations today include DeerFlow PR #1240 (merged) and OpenClaw's plugin-based before_tool_call path.
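In a framework hook, the check looks roughly like this. Every name below is hypothetical and the real SDK surface may differ; the shape is the point: tool name and parameters in, a signed ALLOW or DENY out, raised as an error before the side effect.

# Hypothetical SDK names throughout; consult the APort docs for the real surface.
from aport import Client  # hypothetical import

aport = Client(passport_id="ap_demo_agent")  # hypothetical passport id

def before_tool_call(tool_name: str, params: dict) -> dict:
    decision = aport.verify(
        policy="payments.transfer.v1",  # hypothetical policy pack id
        tool=tool_name,
        params=params,
    )
    if not decision.allow:
        # The decision is signed either way; DENY stops the call here.
        raise PermissionError(f"APort denied {tool_name}: {decision.reason}")
    return params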

Microsoft Agent Governance Toolkit

Microsoft shipped the Agent Governance Toolkit in March 2026. Eleven packages, five language SDKs, twelve-plus framework adapters, and three core middlewares: GovernancePolicyMiddleware, CapabilityGuardMiddleware, and AuditTrailMiddleware. They claim full coverage of the OWASP Agentic Top 10. Policy can be expressed in OPA/Rego or Cedar. Identity uses the new did:mesh:* DIDs.

This is the most important development in the action layer this year. It validates the category in a way nothing else could. When Microsoft ships eleven packages and five SDKs at a layer, that layer is real.

I wrote a longer piece on the implications here: Microsoft validated the agent passport thesis. The short version: APort and Microsoft are not in a zero-sum fight. APort provides the open spec (OAP). Microsoft's toolkit can implement against it, and the OAP working group is in conversation about exactly that. The DID-based identity model in the Microsoft toolkit is compatible with the passport identity model in OAP at the verifiable-credential layer.

If you're all-in on Azure with an Azure-native deployment story, the Microsoft toolkit will be the easiest action-layer answer. If you want a vendor-neutral open standard with working integrations in community frameworks, OAP/APort is the answer. Both are real choices.

PCAS — Policy Compiler for Secure Agentic Systems

PCAS is the academic side of the action layer. It compiles policies written in a Datalog-derived DSL down to a reference monitor that runs alongside the agent. The published numbers are striking: policy compliance jumps from 48% to 93% across frontier models when PCAS is enforcing. The full description is in the arXiv paper Deterministic Pre-Action Authorization for Autonomous AI Agents.

Use it when: you want a policy-as-code approach with formal-methods leanings and a Datalog DSL.

AgentGuardian

AgentGuardian takes a different approach. Instead of declarative policy, it learns context-aware access-control policies from execution traces. You run the agent, the system observes what it does, and a learned policy starts denying actions that don't fit the learned profile.

Use it when: your action space is too large to specify by hand and you'd rather learn a baseline than write one.

Safiron

Safiron is a guardian-model approach. They train a small model on synthetic risky trajectories generated by a tool called AuraGen, then fine-tune it with GRPO so it can intervene in agent runs that look dangerous. It sits halfway between the content and action layers: it acts at the tool boundary, but the deciding component is itself an LLM.

Bottom line for the action layer

The action layer is where the irreversibility lives. Every tool I listed in this section refuses calls before they happen. They differ in policy language (declarative vs learned vs Datalog vs guardian model), in identity model (passport vs DID vs none), in framework reach (multi-framework vs Azure-first), and in openness (open spec vs proprietary). They all share the property that matters: the check runs before the side effect.

If you take only one thing from the layer breakdown, it's this: the action layer is the one that refuses irreversible actions in real time, and there is no substitute for it from another layer.

Most guardrails detect bad outputs. APort prevents bad actions. It runs in the hook, not the prompt. The AI cannot skip this check.

The honest comparison matrix

Here's the same set of tools laid out side by side. I've kept the columns to things you can verify from public docs or from the cited papers.

Tool | Layer | Enforcement point | Open source | License | Languages | Works with | Signed decisions | Runtime overhead
LlamaGuard / LlamaFirewall | Content | Pre/post generation | Yes | Llama Community | Python | Any LLM | No | ~50-150ms (model call)
NeMo Guardrails | Content | Conversation flow | Yes | Apache 2.0 | Python | LangChain, custom | No | ~100-300ms
Guardrails AI | Content | Output validation | Yes | Apache 2.0 | Python, JS | Any LLM | No | ~10-50ms + retries
Galileo Agent Control | Evaluation | Post-hoc trace scoring | Partial (SDKs) | Commercial | Python, JS | CrewAI, Bedrock, Cisco, generic | No | Async, no inline cost
Promptfoo | Evaluation | Pre-deployment CI | Yes | MIT | JS, Python | Any LLM | No | N/A (offline)
NVIDIA NemoClaw | Sandbox | OS / network | Partial | Mixed (NVIDIA) | Any (kernel level) | Container workloads | Audit logs | Near zero (kernel)
E2B | Sandbox | Firecracker microVM | Yes (SDK) | Apache 2.0 | Python, JS | Any code | No | ~150ms cold start
Modal | Sandbox | gVisor | No | Commercial | Python | Python workloads | No | Low
Google ADK Safety | Sandbox + hooks | before_model / before_tool | Partial | Apache 2.0 | Python | Google ADK | No | ~10-100ms
ceLLMate | Sandbox (browser) | HTTP request layer | Yes | MIT | Python | Browser agents | No | Low
APort (OAP) | Action | Tool hook (pre-execution) | Yes | Apache 2.0 | Python, TypeScript/JS | DeerFlow, OpenClaw, LangChain, CrewAI, OpenAI SDK, Claude Code, Cursor | Yes (Ed25519) | ~40ms p50
Microsoft Agent Governance Toolkit | Action | Middleware (pre-execution) | Partial | MIT (most pkgs) | Python, TypeScript, .NET, Rust, Go | 12+ frameworks | Yes (DID-based) | ~50-100ms
PCAS | Action | Reference monitor | Research | Academic | Python | Research harnesses | Yes | Low
AgentGuardian | Action | Learned monitor | Research | Academic | Python | Research harnesses | Partial | Low
Safiron | Action (guardian model) | Tool boundary | Research | Academic | Python | Any agent loop | No | Inference cost

A note on the matrix: I've stayed close to what's public. If you find something wrong, open an issue on the spec repo and I'll fix it.


How to pick

Here's the decision tree I walk teams through. It's short on purpose.

Question 1: Is your agent customer-facing with free-form outputs?
If yes, add a content layer. LlamaFirewall is the easiest default. NeMo Guardrails if you need topic enforcement. Guardrails AI if you need structured-output validation.

Question 2: Do you need to replay and score runs for regression?
If yes, add an evaluation layer. Galileo Agent Control if you want a runtime-telemetry product. Promptfoo if you want CI-style adversarial tests.

Question 3: Does your agent execute arbitrary code?
If yes, add a sandbox layer. E2B if you want microVMs out of the box. NemoClaw if you're on the NVIDIA stack and need kernel-level allowlisting. ceLLMate if your agent is a browser agent.

Question 4: Does your agent take real-world actions — money, messages, files, API calls?
If yes, add an action layer. This is the layer where APort lives. Microsoft Agent Governance Toolkit if you're Azure-native. PCAS if you want a Datalog DSL. AgentGuardian for adaptive policies.

Most production agents I see end up wanting two or three of these layers. A finance agent: content + action. A coding agent: sandbox + action. A research agent: content + evaluation + action. The action layer shows up in almost every serious deployment because almost every serious agent does something irreversible.
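If you want the tree as a checklist, it collapses to a few booleans. A trivial sketch:

def pick_layers(customer_facing: bool, needs_regression_evals: bool,
                runs_code: bool, takes_actions: bool) -> list[str]:
    """The four questions above, as code."""
    layers = []
    if customer_facing:
        layers.append("content")
    if needs_regression_evals:
        layers.append("evaluation")
    if runs_code:
        layers.append("sandbox")
    if takes_actions:
        layers.append("action")
    return layers

# A coding agent: sandbox + action.
assert pick_layers(False, False, True, True) == ["sandbox", "action"]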


Why we built APort at the action layer

The action layer is the only one that can refuse a semantically bad call in real time. The clearest evidence is the Vault CTF we ran late last year.

We built an "AI bank" with five levels of policy and offered $5,000 to anyone who could social-engineer the agent into moving money out of Level 5. Five days, public, open to anyone.

The numbers:

  • 1,151 unique sessions from 459 unique IPs
  • 4,437 APort authorization decisions on transfer attempts
  • L1 (no policy, model only): 74.6% attacker success
  • L5 (full OAP policy, no transfer capability): 0.0% attacker success across 879 attempts
  • $5,000 bounty: unclaimed

Same model. Same prompts. Same humans pulling the same tricks. The only thing that changed across levels was whether a deterministic policy ran in the tool hook. Full writeup: APort Vault CTF results.

The model got tricked. It wrote chain-of-thought explanations of why the transfer was legitimate. It apologized to attackers and tried to be helpful. None of that mattered, because the policy didn't read the chain of thought. It read the tool name and the parameters, checked them against a YAML file, and said no.
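For intuition, here is the Level 5 idea reduced to a toy, with illustrative field names that are not the OAP schema: the transfer capability simply isn't granted, so the check fails closed no matter what the model wrote.

# Toy policy; field names are illustrative, not the OAP schema.
LEVEL_5 = {"capabilities": {"read_balance": {}, "list_transactions": {}}}  # no transfer_funds

def check(policy: dict, tool: str, params: dict) -> str:
    return "ALLOW" if tool in policy["capabilities"] else "DENY"

print(check(LEVEL_5, "transfer_funds", {"amount": 50_000, "to": "attacker"}))  # DENY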

A content filter cannot do this. A sandbox cannot do this. An evaluation tool cannot do this. Only pre-action authorization at the tool boundary catches it, because only pre-action authorization is in the right place at the right time.

The other layers are good. They're necessary. They are not the layer that catches money-out-the-door.


The Microsoft validation

A year ago, the common pushback was that the action layer wasn't really a category and that pre-action authorization was a feature, not a product. That argument is harder to make in April 2026.

Microsoft shipped the Agent Governance Toolkit in March 2026. Eleven packages, five language SDKs, twelve-plus framework adapters, and three middleware types that match the action-layer pattern: policy middleware, capability guard, audit trail. Full OWASP Agentic Top 10 coverage. DID identity. OPA/Rego and Cedar policy support.

When the largest enterprise software vendor in the world ships eleven packages at a layer, that layer is real. It's not the same code as APort and it's not trying to be. It's the same shape, and that shape is the action layer.

APort's role in this world is the open spec. OAP defines passport identity, capability policy, and signed receipts in a way that any implementation can adopt. Microsoft uses DIDs. PCAS uses Datalog. APort uses YAML policy packs. The spec is the lingua franca underneath. Longer take: Microsoft validated the agent passport thesis.


FAQ

Is APort a competitor to Galileo?

No. Galileo is at the evaluation layer; APort is at the action layer. Galileo scores runs after they happen. APort refuses tool calls before they happen. A serious stack runs both. If you're choosing between them, you're asking the wrong question.

Do I need both a content layer and an action layer?

If your agent both produces text users see and takes actions in the world, yes. The content layer protects users from what the model says. The action layer protects the world from what the agent does. They don't overlap. A content filter cannot stop a tool call, and an action policy cannot stop the model from saying something offensive. Most production stacks I see run a content filter (LlamaFirewall is the popular default) and an action layer together.

Can sandboxes replace pre-action authorization?

No. Sandboxes contain OS-level damage. They don't stop semantically bad API calls with valid syntax. A sandboxed agent can still make every authorized API call its credentials allow, and the most common failure mode of agents this year is exactly that: the credentials are valid, the syntax is legal, the policy is the only thing that should have refused. Sandboxes and action layers are complementary, not substitutes.

What's the difference between Microsoft Agent Governance Toolkit and APort?

Both are at the action layer. Both ship middleware that runs before tool execution, declarative policy, and signed audit records. The differences:

  • Spec. APort is a reference implementation of the open OAP spec; the spec is at github.com/aporthq/aport-spec under Apache 2.0. The Microsoft toolkit is a Microsoft-shipped implementation with its own internal interfaces, though most packages are MIT-licensed.
  • Identity. APort uses portable JSON passports. Microsoft uses did:mesh:* DIDs. These are compatible at the verifiable-credential layer.
  • Policy language. APort uses YAML/JSON policy packs. Microsoft supports OPA/Rego and Cedar.
  • Framework reach. APort has upstream integrations in community frameworks (DeerFlow, OpenClaw, LangChain, CrewAI, Claude Code, Cursor, OpenAI SDK). Microsoft has 12+ adapters with deeper Azure integration.

If you're Azure-native and want first-class support from Microsoft, the toolkit is the easiest answer. If you want a vendor-neutral open standard that runs anywhere, OAP/APort is the answer. They're not in a zero-sum fight; the working group is in active conversation about adopting OAP at the spec layer.

Is OAP compatible with Microsoft's DID-based identity?

Yes, at the verifiable-credential layer. A passport in OAP can carry a DID, and a DID-based agent identity can be wrapped in a passport for portability across non-DID-aware frameworks. The two models cover the same problem from different angles, and the working group is actively reconciling them.

What about LlamaFirewall and NeMo Guardrails — should I drop them if I have APort?

No. They protect a different layer. If your agent has any user-facing text, you want a content filter. The two layers compose cleanly: the content filter inspects what the model says; APort inspects what the agent does. Drop one and you have a hole.

Is the Vault CTF result generalizable?

The CTF was a controlled adversarial test of one specific configuration. The result that generalizes is the architectural one: a deterministic policy in the tool hook does not get bypassed by prompt injection because the model is not the thing running it. The specific 0/879 number is for the specific Level 5 configuration. The architectural property holds across configurations.

What if I just write my own pre-action layer?

You can. Many teams have. The catch is that authorization logic written for one framework doesn't move to another. OAP is a spec, not a library, so the policy you write today still authorizes the agent on tomorrow's framework. If you write your own, write it against an open shape.


Closing thought

The 2026 guardrails market is crowded. The way through the noise is to group by layer.

Content tools protect users from bad text. Evaluation tools find regressions. Sandboxes contain blast radius. The action layer refuses irreversible calls in real time. Each catches a different class of failure. None replaces another. Most production stacks need two or three of them.

APort is the open standard for the action layer. Microsoft is the enterprise implementation. PCAS is the academic implementation. NemoClaw and ceLLMate are at the sandbox layer. Galileo is at evaluation. LlamaFirewall and NeMo Guardrails are at content. The right answer for your team depends on what your agent actually does.

If your agent takes real-world actions, the action layer is the one you cannot skip.
