
AI Safety Coalitions Are Missing the Authorization Layer

Project Glasswing, Microsoft's Agent Governance Toolkit, and NVIDIA NemoClaw each tackle a different piece of agent safety. None of them is the authorization layer. Here's why that layer matters and why it needs an open standard.

12 min read
by Uchi Uchibeke

In the first quarter of 2026, three of the largest names in AI safety announced three different initiatives. Anthropic launched Project Glasswing, a twelve-organization coalition aimed at finding and patching software vulnerabilities at scale. Microsoft shipped the Agent Governance Toolkit, eleven packages and five SDKs of zero-trust policy enforcement for autonomous agents. NVIDIA used GTC to unveil NemoClaw, a kernel-level sandbox for the OpenClaw agent framework.

All three are important. All three are well engineered. None of them is the authorization layer.

That sounds like a small omission. It is not. It is the difference between a stack that contains accidents and a stack that prevents them.

What Glasswing actually is

Project Glasswing is a coalition. The members read like a who's who of infrastructure: AWS, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Anthropic put up $100 million in Claude credits and another $4 million in direct donations to open-source security groups. The vehicle is Claude Mythos Preview, Anthropic's frontier model tuned for vulnerability discovery.

The thesis is straightforward and almost certainly correct: if attackers are going to use frontier models to find zero-days, defenders should get there first. Glasswing is offense-as-defense at industrial scale. It will find bugs. It will fix bugs. It will probably make a meaningful dent in the long tail of memory corruption issues in critical open-source software.

What Glasswing does not do is govern autonomous agents. There is no identity layer in the announcement. No capability scoping. No runtime policy enforcement. No mechanism for asking "is this specific agent allowed to take this specific action against this specific resource right now?" That is not a criticism. Glasswing is a vulnerability research coalition, not an agent governance coalition. It was never trying to be one.

The problem is that people read coalition announcements as comprehensive AI safety strategies. They are not. They are slices.

What Microsoft's toolkit covers

The Agent Governance Toolkit is the most comprehensive product Microsoft has ever shipped in this space. Eleven packages. Five language SDKs. Twelve framework adapters. GovernancePolicyMiddleware, CapabilityGuardMiddleware, AuditTrailMiddleware. DID-based identity. Full coverage of the OWASP Agentic Top 10. Azure integration on day one.

I covered this in detail in Microsoft just validated the Agent Passport thesis. The short version: when Microsoft drops eleven packages on day one, it means the internal review concluded the category is real. That is the strongest signal pre-action authorization for agents has ever received.

But the toolkit is a product, not a standard. Microsoft's DIDs do not interoperate with anyone else's credentials. Capabilities are defined in Microsoft's schema. The middleware lives in Microsoft's repo. If you want governance from Microsoft, you adopt the library. If you want to mix Microsoft governance with a non-Microsoft credential, or have an APort-issued passport accepted by a Google ADK agent, there is no shared wire format that makes that work.

This is fine if you live entirely inside Azure. It is not fine if you are an enterprise running agents across three clouds, two frameworks, and a dozen internal services.

What NemoClaw covers

NemoClaw, announced at GTC in March, wraps OpenClaw in K3s OpenShell. It does kernel-level network allowlisting. It restricts filesystem writes. It routes privacy-sensitive prompts to local Nemotron models. The policy engine runs out-of-process. The isolation guarantees are strong.

This is real defense-in-depth. The OS-level work matters. If an agent is compromised, NemoClaw makes it dramatically harder for the compromise to escape the box.

But sandboxes do not enforce semantic business rules. A perfectly sandboxed agent can still call the payments API its operator authorized it to call, refund the wrong customer for the wrong amount, and stay entirely within the network and filesystem rules NemoClaw is enforcing. The kernel does not know what a refund is. The sandbox cannot tell the difference between "send $50 to vendor A" and "send $50,000 to vendor Z." Both are network calls to an allowlisted host.

You can sandbox a malicious agent. You cannot sandbox a confused one. That is what authorization is for.
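
To make the gap concrete, here is a minimal sketch of the only question a sandbox can ask. The hostnames and payloads are invented for illustration, and real allowlists live in the kernel rather than in Python, but the logic is the same: an allowlist sees a destination, not a meaning.

```python
# A network allowlist, as a sandbox sees the world: destinations, not meanings.
# Hostnames and payloads below are invented for illustration.
ALLOWED_HOSTS = {"payments.internal.example.com"}

def sandbox_permits(host: str) -> bool:
    # The only question a sandbox can answer: is this destination allowed?
    return host in ALLOWED_HOSTS

# Identical from the sandbox's point of view:
refund_a = {"host": "payments.internal.example.com", "amount_usd": 50, "recipient": "vendor-a"}
refund_z = {"host": "payments.internal.example.com", "amount_usd": 50_000, "recipient": "vendor-z"}

assert sandbox_permits(refund_a["host"])  # allowed
assert sandbox_permits(refund_z["host"])  # also allowed; the amount never enters the decision
```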

The three camps and the gap between them

AI safety discourse in 2026 has settled into three camps.

The first is model safety: alignment, RLHF, content filtering, jailbreak defense. Anthropic, OpenAI, and Meta lead here. The output is a model that is statistically less likely to produce harmful content. It is probabilistic by design. It is necessary. It is not sufficient.

The second is vulnerability research: using frontier models to find software bugs before attackers do. Glasswing is the flagship effort. The output is fewer exploitable bugs in the world. Necessary. Not sufficient.

The third is sandboxed execution: containing damage when agents do something harmful. NemoClaw, Google ADK Sandbox, E2B, gVisor-based isolators. The output is a smaller blast radius when things go wrong. Necessary. Not sufficient.

Look at what is missing. Model safety hopes the agent will not try. Vulnerability research patches the holes the agent could exploit. Sandboxing limits what the agent can touch. None of them answers the question that matters at the moment of action: is this specific agent, with this specific identity, holding this specific capability, allowed to perform this specific operation against this specific resource, right now, under the policy its operator has declared?

That is pre-action authorization. That is the authorization layer. And it is missing from every coalition initiative announced this year.

Why semantic enforcement matters

The default failure mode for an autonomous agent is not malice. It is confusion. A finance agent misreads a query and tries to refund a customer ten thousand times the correct amount. A support agent misroutes a PII export request to the wrong customer. A code agent rewrites a config file because a tool description was ambiguous.

None of these failures are caught by alignment, by sandboxing, or by vulnerability scanning. The model is "aligned." The sandbox permits the API call. The code has no exploitable bugs. The agent simply does the wrong allowed thing.

The authorization layer is the only place where you can write the rule "this agent cannot refund more than $X to a non-allowlisted recipient" and have it enforced deterministically before the call goes out. It is the only place where the rule lives outside the model's reasoning, outside the sandbox's network policy, and outside the vendor's product roadmap. It is policy as code, evaluated at the tool-call boundary, with a signed receipt for the audit log.
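
Here is a minimal sketch of that control, assuming a hypothetical policy dict and an HMAC-signed receipt. A production system would use asymmetric signatures and a real policy engine, but the shape is the point: a deterministic check before the call goes out, and a tamper-evident record after it.

```python
import hashlib
import hmac
import json
import time

# Hypothetical operator-declared policy; the schema is illustrative, not any vendor's.
POLICY = {"max_refund_usd": 500, "allowlisted_recipients": {"vendor-a", "vendor-b"}}
RECEIPT_KEY = b"audit-signing-key"  # illustration only; keep real keys in a KMS

def authorize_refund(agent_id: str, amount_usd: float, recipient: str) -> dict:
    """Deterministic pre-action check, evaluated before the payments API is called."""
    allowed = (
        amount_usd <= POLICY["max_refund_usd"]
        or recipient in POLICY["allowlisted_recipients"]
    )
    decision = {
        "agent": agent_id,
        "action": "payments.refund",
        "amount_usd": amount_usd,
        "recipient": recipient,
        "allowed": allowed,
        "ts": time.time(),
    }
    # Signed receipt for the audit log: verifiable later, outside the model's reasoning.
    payload = json.dumps(decision, sort_keys=True).encode()
    decision["receipt"] = hmac.new(RECEIPT_KEY, payload, hashlib.sha256).hexdigest()
    return decision

decision = authorize_refund("agent-7", amount_usd=50_000, recipient="vendor-z")
if not decision["allowed"]:
    raise PermissionError("blocked by policy; the call never goes out")
```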

This layer needs to be deterministic, portable, and open. Deterministic because probabilistic safety is not a control. Portable because agents move across frameworks, clouds, and vendors. Open because a standard owned by one vendor is a product, not a standard.

Why none of the coalitions builds this

Each of these efforts has structural reasons not to build the authorization layer.

Glasswing is a vulnerability research coalition. Its members signed up to find bugs. Adding agent identity, capability scoping, and runtime policy enforcement would change the scope of the program entirely. It is not a failure of vision. It is just a different problem.

Microsoft builds products. Microsoft's commercial interest is in Azure-integrated governance that customers buy as part of a broader enterprise agreement. A truly open standard, where any conforming implementation can verify any conforming credential, slightly weakens the lock-in. It is not in Microsoft's narrow interest to publish the wire format. The toolkit is excellent. The spec is conspicuously absent.

NVIDIA builds infrastructure. The company's center of gravity is the kernel, the GPU, and the runtime. Semantic policy enforcement at the business-rule level is several layers above where NVIDIA naturally invests. NemoClaw is the right product for NVIDIA to ship. It is not the authorization layer because NVIDIA does not build authorization layers.

This is the pattern behind every major open infrastructure standard of the last twenty years. OAuth was not owned by Google or Twitter, even though both shipped competing token schemes. OIDC was not owned by Microsoft, even though Microsoft had a perfectly good identity product. OpenTelemetry is not owned by any single APM vendor. OCI is not owned by Docker, even though Docker donated the format that seeded it. In each case, the vendors had products. The community needed a standard. The standard ended up outside any single vendor because no vendor had the incentive to give away the lock-in.

The OAuth parallel

Before OAuth, every service had its own authorization scheme. Enterprise integrations were a nightmare of bespoke token formats and per-vendor SDKs. OAuth won not because it was the best library, but because it was the spec nobody owned. You can run an authorization server, build a client, or be a resource owner, and the three interoperate without anyone asking permission.

Pre-action authorization for agents needs the same thing. An open specification for agent passports that any framework, any evaluator, any registry, and any sandbox can implement. A wire format that travels with the agent across vendors. A verification model that does not depend on a single hosted service.

That spec exists. The Open Agent Passport is Apache 2.0, published as an open standard with a permanent identifier (the companion paper has DOI 10.5281/zenodo.18901596), and implemented across six-plus frameworks today. It defines a signed, portable credential that asserts what an agent is, who issued it, what it is allowed to do, and under what conditions. It is verifiable in under 100 milliseconds. It does not require you to adopt anyone's product.

It does not belong to APort. APort is one implementation. Microsoft's middleware could conform to it. Google's could. NVIDIA's policy engine could check passports at the syscall boundary if NVIDIA wanted. The point of a standard is that anyone can implement it.
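
For a feel of what "signed, portable credential" means in practice, here is a sketch using Ed25519 from the `cryptography` package. The field names are an assumption for illustration; the normative schema lives in the spec, not here.

```python
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Illustrative passport shape; these field names are an assumption, not the OAP wire format.
passport = {
    "sub": "agent:finance-bot-7",         # what the agent is
    "iss": "did:example:issuer",          # who issued it
    "capabilities": ["payments.refund"],  # what it is allowed to do
    "limits": {"max_refund_usd": 500},    # under what conditions
    "exp": 1893456000,                    # expiry, unix time
}

issuer_key = Ed25519PrivateKey.generate()  # the issuer signs once
payload = json.dumps(passport, sort_keys=True).encode()
signature = issuer_key.sign(payload)
issuer_public_key = issuer_key.public_key()

def verify(passport: dict, signature: bytes) -> bool:
    # Anyone holding the issuer's public key can verify offline, with no
    # call to a hosted service. That is what makes the credential portable.
    try:
        issuer_public_key.verify(signature, json.dumps(passport, sort_keys=True).encode())
        return True
    except InvalidSignature:
        return False

assert verify(passport, signature)
```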

For the long version of how this compares to other guardrail approaches, see the 2026 pre-action authorization comparison.

The complementary stack

This is not a "pick APort over Microsoft" pitch. Any serious production agent needs all of the following; a sketch of how the layers compose follows the list.

  1. Model safety. Anthropic, OpenAI, Meta. Aligned weights, content filters, jailbreak defenses.
  2. Content guardrails. LlamaGuard, NeMo Guardrails. Input and output filtering at the prompt boundary.
  3. Sandboxed execution. NemoClaw, E2B, Google ADK Sandbox. Kernel and network isolation.
  4. Authorization layer. OAP-conformant passports verified before every consequential action. APort, Microsoft's toolkit if it adopts the spec, PCAS, anything else that implements the standard.
  5. Audit and compliance. Signed decision receipts, tamper-evident logs, NIST AI RMF and OWASP Agentic Top 10 mappings.
  6. Vulnerability research. Glasswing-style offense-as-defense to harden the underlying software stack.
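
The composition sketch, with every name invented for illustration; each stub stands in for one of the product categories above, not any vendor's real API. Layers one and six sit outside the call path: model safety shapes what the agent tries, and vulnerability research hardens the stack underneath.

```python
from dataclasses import dataclass

@dataclass
class Decision:
    allowed: bool
    reason: str = ""

def content_guardrails_pass(args: dict) -> bool:  # layer 2: prompt-boundary filtering
    return "ignore previous instructions" not in str(args).lower()

def authorize(passport: dict, action: str) -> Decision:  # layer 4: pre-action authorization
    if action not in passport["capabilities"]:
        return Decision(False, f"{action} not in passport capabilities")
    return Decision(True)

def sandbox_execute(action: str, args: dict) -> str:  # layer 3: isolated execution
    return f"executed {action}"  # imagine kernel and network isolation here

AUDIT_LOG: list = []  # layer 5: in practice, signed receipts in a tamper-evident log

def guarded_tool_call(passport: dict, action: str, args: dict) -> str:
    if not content_guardrails_pass(args):
        return "refused: guardrails"
    decision = authorize(passport, action)
    AUDIT_LOG.append((action, args, decision))
    if not decision.allowed:
        return f"refused: {decision.reason}"
    return sandbox_execute(action, args)

passport = {"capabilities": ["payments.refund"]}
print(guarded_tool_call(passport, "payments.refund", {"amount_usd": 50}))
print(guarded_tool_call(passport, "db.drop_table", {"table": "users"}))
```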

Glasswing covers layer six. The Microsoft toolkit covers layers four and five inside Microsoft's product surface. NemoClaw covers layer three. None of these initiatives covers the authorization layer as a standard rather than as a product.

The gap is specific. It is not "AI safety needs more attention." It is "the authorization layer needs an open spec, multiple conforming implementations, and upstream framework integration, and right now only one of those three is well underway."

What a coalition for this layer would look like

If someone wanted to build the Glasswing of agent authorization, the components are already mostly in place.

There is an open specification. OAP, Apache 2.0, in the aport-spec repository. It needs a working group, not a rewrite.

There are reference implementations. APort runs in production. Microsoft's Agent Governance Toolkit could conform with modest effort. Anyone shipping a policy engine for agents could too.

There are upstream integrations. DeerFlow merged passport verification at the tool boundary. OpenClaw has a PR in review. LangChain and CrewAI integrations exist. The path to broad framework coverage is short.

There is a verification model. aport.io exists today as a hosted registry. The hosted version is one implementation of the spec, not the spec itself.

There are compliance mappings. NIST AI RMF, OWASP Agentic Top 10, SAFE-MCP. Each is a tractable mapping exercise rather than a research project.

What is missing is a working group. Not a product roadmap. Not a vendor coalition aimed at extending one company's product. A small group of implementers and adopters who agree that the wire format is more important than whose logo is on the library.

What I want from the next announcement

If you are at Anthropic and you are thinking about what comes after Glasswing, the authorization layer is the obvious next coalition. The members are already in the room. The frontier model work is complementary, not competitive. A signed credential is a control that pairs cleanly with a model that has been hardened against jailbreaks.

If you are at Microsoft and you are deciding what to ship in the next toolkit release, publishing the wire format your middleware speaks would do more for the ecosystem than the next twelve framework adapters. The library is the on-ramp. The spec is the road.

If you are at NVIDIA and you are extending NemoClaw, adding a passport verification hook at the policy engine boundary would make the sandbox semantically aware in a way that kernel rules alone cannot be.

If you are an enterprise security lead rolling out agents, the question to ask every vendor is the one I asked in the last post: which open standard does your governance layer speak? If the answer is "ours," that is your answer.

We built APort because this layer was missing. Microsoft validated it by shipping a version in March. The next step is the open standard. AI safety coalitions keep announcing defense in depth. The authorization layer is the piece that turns depth into prevention. It does not belong to any single vendor. It belongs to the ecosystem that adopts it.

The spec is public. The implementation is open source. If you are building this layer, in any company, for any framework, we should talk.