Microsoft releases open-source toolkit to govern autonomous AI agents
AI agents can book travel, execute financial transactions, write and run code, and manage infrastructure without human intervention at each step. Frameworks like LangChain, AutoGen, CrewAI, and Azure AI Foundry Agent Service have made this kind of autonomy straightforward to deploy. The governance infrastructure to match that autonomy has lagged behind. Microsoft released the Agent Governance Toolkit to address that gap.

What the toolkit contains
The Agent Governance Toolkit is a seven-package system available in Python, TypeScript, Rust, Go, and .NET. Each package addresses a distinct layer of agent governance:
- Agent OS is a stateless policy engine that intercepts every agent action before execution, with a reported p99 latency below 0.1 milliseconds. Policies can be written as YAML rules, OPA Rego, or Cedar.
- Agent Mesh provides cryptographic identity using decentralized identifiers with Ed25519 signing, an Inter-Agent Trust Protocol for agent-to-agent communication, and a dynamic trust scoring system running on a 0 to 1000 scale across five behavioral tiers.
- Agent Runtime introduces execution rings modeled on CPU privilege levels, saga orchestration for multi-step transactions, and a kill switch for emergency agent termination.
- Agent SRE applies service reliability practices, including Service Level Objectives, error budgets, circuit breakers, chaos engineering, and progressive delivery, to agent systems.
- Agent Compliance automates governance verification: it grades compliance, maps to regulatory frameworks including the EU AI Act, HIPAA, and SOC 2, and collects evidence covering all ten OWASP agentic AI risk categories.
- Agent Marketplace handles plugin lifecycle management with Ed25519 signing, manifest verification, and trust-tiered capability gating.
- Agent Lightning governs reinforcement learning training workflows with policy-enforced runners and reward shaping, targeting zero policy violations during RL training.
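To make the policy-engine idea concrete, here is a minimal Python sketch of a stateless check that runs before an agent action executes. It is an illustration only and not the toolkit's API: the names Rule, Action, evaluate, and guarded_call, the glob-style patterns, and the first-match, default-deny semantics are all assumptions made for this example.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    """Hypothetical policy rule: a glob pattern on the action name plus an effect."""
    pattern: str
    effect: str  # "allow" or "deny"


@dataclass(frozen=True)
class Action:
    """Hypothetical description of what an agent is about to do."""
    name: str
    agent_id: str


def evaluate(rules: list[Rule], action: Action) -> bool:
    """Return True if the action may run.

    Assumed semantics: the first matching rule wins, and anything
    that matches no rule is denied. No state is kept between calls.
    """
    for rule in rules:
        if fnmatchcase(action.name, rule.pattern):
            return rule.effect == "allow"
    return False


def guarded_call(rules, action, fn, *args, **kwargs):
    """Check policy first; raise instead of executing a denied action."""
    if not evaluate(rules, action):
        raise PermissionError(f"policy denied {action.name} for {action.agent_id}")
    return fn(*args, **kwargs)


# Illustrative rule set: refunds are blocked, other payment and search actions pass.
RULES = [
    Rule("payments.refund", "deny"),
    Rule("payments.*", "allow"),
    Rule("search.*", "allow"),
]
```

Because evaluate depends only on its arguments, the same check can run in any process without shared state, which is the property the article attributes to the engine.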
Framework integrations
“A governance toolkit is only useful if it works with the frameworks people actually use. We designed the toolkit to be framework-agnostic from day one,” Imran Siddique, Principal Group Engineering Manager, Microsoft, explained.
The toolkit is designed to work alongside existing agent frameworks without requiring rewrites. It hooks into native extension points: LangChain’s callback handlers, CrewAI’s task decorators, Google ADK’s plugin system, and Microsoft Agent Framework’s middleware pipeline.
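The general pattern behind such hooks can be sketched in framework-neutral Python: a decorator wraps a tool function and runs a policy check before the call goes through. This sketch does not use any real framework's extension API, and the names governed, max_amount_check, and transfer are hypothetical.

```python
import functools
from typing import Callable

# Hypothetical policy check: receives the tool name and its keyword arguments.
PolicyCheck = Callable[[str, dict], bool]


def governed(check: PolicyCheck):
    """Decorator factory: run `check` before the wrapped tool and block on False."""
    def decorator(tool):
        @functools.wraps(tool)
        def wrapper(**kwargs):
            if not check(tool.__name__, kwargs):
                raise PermissionError(f"blocked call to {tool.__name__}")
            return tool(**kwargs)
        return wrapper
    return decorator


def max_amount_check(tool_name: str, kwargs: dict) -> bool:
    """Illustrative rule (an assumption): transfers above 100 units are refused."""
    if tool_name == "transfer":
        return kwargs.get("amount", 0) <= 100
    return True


@governed(max_amount_check)
def transfer(amount: int, to: str) -> str:
    """Stand-in for an agent tool; it only formats a message."""
    return f"sent {amount} to {to}"
```

The tool's own code is unchanged; only the decorator line is added, which is the sense in which an integration of this kind avoids rewrites.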
Several integrations are operational. Dify carries the governance plugin in its marketplace. LlamaIndex includes a TrustedAgentWorker integration. The OpenAI Agents SDK, Haystack, LangGraph, and PydanticAI integrations are shipped, with OpenAI Agents and LangGraph published on PyPI, Haystack merged upstream, and PydanticAI available as a working adapter.
Security architecture and test coverage
The toolkit’s design draws on established computing patterns: kernel-style privilege separation from operating systems, mutual TLS and identity from service meshes, and SLO-based reliability practices from Site Reliability Engineering.
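As a rough illustration of kernel-style privilege separation, the Python sketch below assigns each agent a ring and each action a minimum required ring. The number of rings, their names, and the action mapping are assumptions made for this example; the article does not specify how the toolkit defines them.

```python
from enum import IntEnum


class Ring(IntEnum):
    """Hypothetical rings; as with CPU rings, a lower number means more privilege."""
    KERNEL = 0
    TRUSTED = 1
    STANDARD = 2
    SANDBOX = 3


# Hypothetical mapping from an action to the least privileged ring allowed to run it.
REQUIRED_RING = {
    "read_docs": Ring.SANDBOX,
    "send_email": Ring.STANDARD,
    "deploy_infra": Ring.TRUSTED,
}


def may_execute(agent_ring: Ring, action: str) -> bool:
    """Allow an action only if the agent's ring is at least as privileged as required.

    Unknown actions fall back to requiring ring 0, so they are denied
    to every agent that is not in the most privileged ring.
    """
    required = REQUIRED_RING.get(action, Ring.KERNEL)
    return agent_ring <= required
```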
The toolkit maps its capabilities to all ten OWASP agentic AI risk categories. For example, the policy engine includes a semantic intent classifier to counter goal hijacking. A Cross-Model Verification Kernel with majority voting addresses memory poisoning. Ring isolation, trust decay, and the automated kill switch target rogue agent behavior.
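Majority voting itself is simple to sketch. In the Python example below, several independent verifiers each return a verdict on a proposed memory write, and the write is accepted only when a strict majority calls it safe. The function names and the "safe"/"unsafe" labels are assumptions for illustration, not the toolkit's interface.

```python
from collections import Counter
from typing import Optional


def majority_verdict(verdicts: list[str]) -> Optional[str]:
    """Return the verdict held by a strict majority of verifiers, else None."""
    if not verdicts:
        return None
    verdict, count = Counter(verdicts).most_common(1)[0]
    # A tie or a plurality below half is treated as no decision.
    return verdict if count * 2 > len(verdicts) else None


def accept_memory_write(verdicts: list[str]) -> bool:
    """Accept a proposed memory write only if a strict majority says 'safe'."""
    return majority_verdict(verdicts) == "safe"
```

Treating ties and empty input as a rejection keeps the sketch fail-closed: a single compromised verifier cannot push a write through on its own.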
The project ships with more than 9,500 tests across all packages and uses ClusterFuzzLite for continuous fuzzing. The build pipeline includes SLSA-compatible provenance, OpenSSF Scorecard tracking, CodeQL scanning, Dependabot dependency monitoring, and pinned dependencies with cryptographic hashes. The toolkit also includes 20 step-by-step tutorials covering each package.
Licensing and community direction
Microsoft stated in the release that it intends to move the project to a foundation for community governance, and said it is engaging with the OWASP agentic AI community and foundation leaders to facilitate that transition. The project is structured as a monorepo with seven independently installable packages, allowing teams to adopt individual components incrementally.
The Python packages require Python 3.10 or later and are published individually on PyPI. For teams deploying on Azure, the toolkit supports sidecar deployment on Azure Kubernetes Service, middleware integration with Azure AI Foundry Agent Service, and container deployment via Azure Container Apps.
Agent Governance Toolkit is available for free on GitHub.

