I Read Every Line of NemoClaw. Here's What NVIDIA Actually Built.
Jensen Huang walked onto the GTC stage and said every company needs an OpenClaw strategy. Then NVIDIA shipped NemoClaw — an open-source stack that promises to make always-on AI agents safe for enterprise.
I run 18 AI agents on a Mac mini using OpenClaw. I've hit the security problems NemoClaw claims to solve. So I cloned the repo and read every file.
Here's what I found.
What NemoClaw Actually Is
Strip the marketing and NemoClaw is three things:
- An interactive installer (967 lines of JavaScript) that detects your hardware, picks an inference provider, and wires everything together
- A policy YAML file (168 lines) that declares what the agent can and can't do
- Glue code that shells out to two closed-source NVIDIA binaries: OpenShell and NIM
NemoClaw itself is ~9,000 lines of JavaScript, ~3,200 lines of TypeScript, and ~3,000 lines of shell scripts. The actual security enforcement? That lives in OpenShell — a closed-source binary you download from GitHub Releases.
In one sentence: NemoClaw is docker-compose with a setup wizard, specialised for running OpenClaw inside NVIDIA's sandbox runtime.
The Architecture
```
┌─────────────────────────────────────────────┐
│ HOST │
│ │
│ nemoclaw CLI (Node.js wizard) │
│ │
│ ┌────────────────────────────────────────┐ │
│ │ OpenShell (closed-source) │ │
│ │ ├── k3s inside Docker │ │
│ │ ├── Egress proxy (inference.local) │ │
│ │ ├── Policy engine │ │
│ │ └── Web TUI dashboard │ │
│ │ │ │
│ │ ┌──────────────────────────────────┐ │ │
│ │ │ SANDBOX (node:22-slim) │ │ │
│ │ │ ├── OpenClaw CLI │ │ │
│ │ │ ├── NemoClaw plugin │ │ │
│ │ │ └── Your agents run here │ │ │
│ │ │ │ │ │
│ │ │ Network: deny-all + allowlist │ │ │
│ │ │ Filesystem: /sandbox + /tmp │ │ │
│ │ └──────────────────────────────────┘ │ │
│ └────────────────────────────────────────┘ │
│ │
│ Optional: NIM / Ollama / vLLM │
└─────────────────────────────────────────────┘
```
The key architectural trick is the inference proxy. Your agent calls https://inference.local/v1 — a fake hostname that OpenShell intercepts and routes to the real provider with injected credentials. The agent never touches API keys or external endpoints directly.
This is clever. It's the same pattern Kubernetes service meshes use for mTLS — the application doesn't know about the security layer.
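The pattern is simple enough to sketch. This is my reconstruction of the idea, not NemoClaw's code: the upstream URL and key names here are assumptions for illustration. The proxy strips whatever the agent sent, injects the real credential, and forwards to the true host:

```javascript
// Hypothetical reconstruction of the inference.local trick. The sandboxed
// agent sends requests to a fake hostname with no real credentials; the
// proxy rewrites them before they leave. UPSTREAM and the key format are
// illustrative assumptions, not NemoClaw's actual values.
const UPSTREAM = 'https://integrate.api.nvidia.com';

function rewriteRequest(req, apiKey) {
  const headers = { ...req.headers };
  // Drop anything the agent set itself, then inject the real key.
  delete headers['authorization'];
  headers['authorization'] = `Bearer ${apiKey}`;
  headers['host'] = new URL(UPSTREAM).host;
  return { url: UPSTREAM + req.path, headers };
}

// The agent only ever sees https://inference.local/v1:
const out = rewriteRequest(
  { path: '/v1/chat/completions', headers: { authorization: 'Bearer fake' } },
  'nvapi-real-key'
);
console.log(out.url);                    // https://integrate.api.nvidia.com/v1/chat/completions
console.log(out.headers.authorization); // Bearer nvapi-real-key
```

Because the rewrite happens outside the sandbox, even an agent that dumps its own environment and memory never finds the key.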
The Security Model (What's Real)
The sandbox policy is deny-by-default. Here's what the agent can actually reach:
| Endpoint | Who can call it | Why |
|---|---|---|
| api.anthropic.com | Only /usr/local/bin/claude | Claude Code needs its API |
| integrate.api.nvidia.com | claude + openclaw binaries | Inference routing |
| github.com | Only gh + git binaries | Code operations |
| registry.npmjs.org | openclaw + npm | Plugin installs |
| api.telegram.org | Any binary | Bot notifications |
The binary pinning is the most interesting part. It's not just domain allowlisting — only specific executables can talk to specific endpoints. Even if malicious code runs inside the sandbox, it can't call api.anthropic.com unless it's executing as the Claude binary.
Filesystem isolation uses Linux Landlock LSM: /sandbox and /tmp are writable, everything else is read-only.
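Put together, the rules above would read something like this in policy form. To be clear, this is an illustrative sketch in the spirit of the 168-line policy file; the key names are my invention, not NemoClaw's actual schema:

```yaml
# Illustrative sketch only — key names invented, not NemoClaw's real schema
egress:
  default: deny
  allow:
    - host: api.anthropic.com
      binaries: [/usr/local/bin/claude]
    - host: github.com
      binaries: [/usr/bin/gh, /usr/bin/git]
filesystem:
  landlock:
    compatibility: best_effort
  writable: [/sandbox, /tmp]
  default: read_only
```

The point of pairing hosts with binary paths is that a compromised npm package running under node can't impersonate the Claude binary's network identity.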
This is a meaningful security posture. If OpenShell implements what the YAML declares, it's substantially safer than running agents on bare metal.
The Catch: You Can't Verify Any of This
OpenShell is a closed-source binary. The policy YAML is a declaration of intent. Whether enforcement: enforce actually enforces anything is unverifiable from the open-source code.
I can read the policy file. I can see that it says "only Claude can talk to Anthropic's API." But the thing that enforces that rule? It's a binary blob I downloaded from NVIDIA's GitHub releases. I can't audit it.
For an enterprise selling trust, this is a problem.
Things That Made Me Raise an Eyebrow
1. Auto-pair approves everything
The startup script launches a watcher that auto-approves ALL device pairing requests for 10 minutes:
```python
# From nemoclaw-start.sh — auto-pair watcher
for device in pending:
    request_id = device.get('requestId')
    arc, aout, aerr = run('openclaw', 'devices', 'approve', request_id)
```
The watcher runs alongside dangerouslyDisableDeviceAuth: true and allowInsecureAuth: true in the OpenClaw config. Convenience wins over security.
2. Landlock is best-effort
```yaml
landlock:
  compatibility: best_effort
```
On older kernels (or inside containers without LSM support), filesystem isolation silently degrades. No warning. No failure. Just... less security.
3. Policy merging is string-based
The code that merges policy presets doesn't parse YAML. It splits on newlines, uses regex to find top-level keys, and concatenates strings. It works for the simple preset format, but it's fragile:
```javascript
const isTopLevel = /^\S.*:/.test(line);
```
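You can see the fragility by feeding that regex a few realistic YAML lines. It classifies anything that starts with a non-space character and contains a colon as a top-level key, which sweeps up comments and list items:

```javascript
// The preset merger classifies lines with a regex instead of parsing YAML.
const isTopLevel = (line) => /^\S.*:/.test(line);

console.log(isTopLevel('egress:'));            // true  — correct
console.log(isTopLevel('  default: deny'));    // false — correct
console.log(isTopLevel('# note: temporary'));  // true  — wrong: it's a comment
console.log(isTopLevel('- host: github.com')); // true  — wrong: it's a list item
```

For today's flat presets that's survivable; the first preset with a top-level list or a commented-out key will merge incorrectly without any error.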
4. Cloud NVIDIA is the default path
The wizard steers you toward integrate.api.nvidia.com. Local inference via NIM needs 8-80 GB of VRAM depending on the model, so most users will land on NVIDIA's cloud inference by default, which requires an NVIDIA API key.
That's not a security feature. That's a sales funnel.
What I Actually Want vs What NemoClaw Delivers
I run 18 agents. Here's my honest scorecard:
| Problem | NemoClaw's answer | My take |
|---|---|---|
| Agents can access anything on the host | Container isolation + Landlock | ✅ Real solution |
| No network egress control | Deny-by-default + allowlists | ✅ Real solution |
| API keys exposed to agent code | Inference proxy hides credentials | ✅ Clever and useful |
| No audit trail for agent actions | OpenShell gateway logging | 🤷 Can't verify (closed-source) |
| Policy management at scale | YAML presets per service | ✅ Practical, if basic |
| Multi-agent coordination | Not addressed | ❌ Still need Paperclip or similar |
| Cost tracking | Not addressed | ❌ Still need Helicone or similar |
The Bottom Line
NemoClaw solves a real problem. Running autonomous AI agents on bare metal with unrestricted access to your filesystem, network, and credentials is genuinely dangerous. The email deletion incident the video mentioned? That's not hypothetical — it's what happens when context windows reset mid-execution and there's nothing between the AI's decision and your production data.
The sandbox design is sound. Deny-by-default networking, binary-pinned egress rules, credential isolation via inference proxy — these are the right architectural decisions.
But the enforcement lives in a closed-source binary. You're trusting NVIDIA's implementation, not verifying it.
My recommendation: If you're running OpenClaw agents that touch anything sensitive, NemoClaw is worth the overhead. The container isolation alone is a meaningful upgrade. Just don't confuse "NVIDIA built it" with "it's been audited." Those are different things.
If you're building an agent fleet like mine — 18 agents across multiple projects — NemoClaw handles the sandbox layer but you still need orchestration (Paperclip), task tracking (Beads), and cost visibility (Helicone). It's one layer of a stack, not the whole stack.
I'm building the 11 Factors of Production-Grade AI Agents — a framework for taking AI agents from prototype to production. NemoClaw touches Factor 8 (Security & Sandboxing) and Factor 5 (Model Serving Layer). Follow along as I document what actually works.