
Safety policy for constraining meta-agent modifications

facebookresearch/HyperAgents
Status: Open
Opened: Mar 28, 2026
Comments: 15
HyperAgents executes model-generated code in a self-improvement loop where the meta-agent rewrites task agent source autonomously. The README correctly flags this as executing "untrusted, model-generated code."

We've put together a safety policy pack that constrains what the meta-agent can do during the optimization loop:

- **Reads**: unrestricted (meta-agent needs to observe task agent performance)
- **Writes**: restricted to `workspace/` only, with approval gate (prevents rewriting evaluation harness, own source, or system files)
- **Command execution**: blocked (meta-agent rewrites code; execution goes through the framework)
- **File deletion**: blocked (preserves full optimization history)
- **Network requests**: blocked (closed-loop optimization, no data exfiltration)
- **Rate limit**: 10 tool calls/minute (prevents runaway rewrite cycles)

Every allowed and denied action produces a signed receipt. The full run produces a verifiable audit chain, useful for debugging optimization regressions and for reproducibility.

The policies are available in both JSON and [Cedar](https://www.cedarpolicy.com/) format (compatible with AWS Verified Permissions):

- JSON: [`hyperagent-sandbox.json`](https://github.com/tomjwxf/ScopeBlindD2/tree/main/examples/hyperagents/hyperagent-sandbox.json)
- Cedar: [`hyperagent-sandbox.cedar`](https://github.com/tomjwxf/ScopeBlindD2/tree/main/examples/hyperagents/hyperagent-sandbox.cedar)

Usage:

```bash
npx protect-mcp --policy hyperagent-sandbo...
```
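For readers who want to see how the rules listed above fit together, here is a minimal Python sketch of a gate that applies them and chains a receipt for every decision. It is an illustration under stated assumptions, not the `protect-mcp` implementation: `PolicyGate`, `verify_chain`, the action names (`read`, `write`, `exec`, `delete`, `network`), and the `approve` callback are all hypothetical, and HMAC-SHA256 stands in for whatever signature scheme the real receipts use.

```python
import hashlib
import hmac
import json
import time
from collections import deque
from pathlib import PurePosixPath

# Hypothetical names throughout; this only mirrors the rules listed in the issue.
BLOCKED_ACTIONS = {"exec", "delete", "network"}
WRITE_ROOT = PurePosixPath("workspace")
RATE_LIMIT = 10           # tool calls allowed per window
WINDOW_SECONDS = 60.0


class PolicyGate:
    def __init__(self, key, approve=lambda path: True):
        self._key = key
        self._approve = approve   # stand-in for the approval gate on writes
        self._calls = deque()     # timestamps of calls counted toward the limit
        self._prev_sig = ""       # signature of the previous receipt
        self.receipts = []

    def _decide(self, action, path, now):
        # Sliding window: forget calls older than WINDOW_SECONDS.
        while self._calls and now - self._calls[0] >= WINDOW_SECONDS:
            self._calls.popleft()
        if len(self._calls) >= RATE_LIMIT:
            return "deny", "rate limit exceeded"
        self._calls.append(now)
        if action in BLOCKED_ACTIONS:
            return "deny", f"{action} is blocked"
        if action == "read":
            return "allow", "reads are unrestricted"
        if action == "write":
            p = PurePosixPath(path)
            inside = (
                not p.is_absolute()
                and ".." not in p.parts          # reject traversal out of workspace/
                and p.parts[:1] == WRITE_ROOT.parts
                and len(p.parts) > 1
            )
            if not inside:
                return "deny", "write outside workspace/"
            if not self._approve(path):
                return "deny", "write not approved"
            return "allow", "approved write inside workspace/"
        return "deny", "unknown action"

    def check(self, action, path="", now=None):
        now = time.monotonic() if now is None else now
        decision, reason = self._decide(action, path, now)
        body = {"action": action, "path": path, "decision": decision,
                "reason": reason, "prev": self._prev_sig}
        payload = json.dumps(body, sort_keys=True).encode()
        sig = hmac.new(self._key, payload, hashlib.sha256).hexdigest()
        self._prev_sig = sig
        self.receipts.append({**body, "sig": sig})   # allowed and denied alike
        return decision == "allow"


def verify_chain(receipts, key):
    """Recompute every signature and check each receipt links to the previous one."""
    prev = ""
    for receipt in receipts:
        body = {k: v for k, v in receipt.items() if k != "sig"}
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if body["prev"] != prev or not hmac.compare_digest(expected, receipt["sig"]):
            return False
        prev = receipt["sig"]
    return True
```

In this sketch, policy-denied calls still count toward the rate limit while rate-limited calls do not, and each receipt embeds the previous signature so that editing or dropping any earlier receipt makes `verify_chain` fail. Both are design choices of the sketch, not documented behavior of the policy pack.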
Python