Containing Agent Smith in a Pocket Universe
Most AI agents today are opaque loops: prompt, tool call, retry, mutate local state, then hope for a safe outcome. This is not a theoretical concern. OpenClaw, the open-source AI agent that crossed 100,000 GitHub stars within weeks, demonstrated exactly what happens when agents run without containment. Simon Willison warns of the "lethal trifecta" of access to private data, exposure to untrusted content, and the ability to communicate externally: "If your agent combines these three features, an attacker can easily trick it into accessing your private data and sending it to that attacker." Cisco reported that a third-party OpenClaw skill enabled silent data exfiltration and direct prompt injection without user awareness.
In early 2026, this moved from theory to practice as the first 'Agent Smith' incidents occurred. One prominent example was Moltbook (now OpenClaw), a social network positioned as 'built for agents, by agents' with human help from @mattprd. This was not a separate containment failure, but a signal that agent-native ecosystems are already emerging without native safeguards.
LLM agents do not become safe by becoming smarter. They become safe by becoming structured. In Yin.VM, an agent is not a black box. It is a continuation represented as data: pausable, inspectable, and executable inside a pocket universe whose constraints the executing node controls.
Yin.VM was not designed to contain AI agents. It was designed to safeguard data. The original problem, explored over twenty years of thinking, was simple: once data is shared with a third party, it leaves the user's control. The third party can store it, retransmit it, train on it. Yin.VM answered this by inverting the relationship: move code to data, not data to code. Run the third party's logic inside a restricted continuation where it can see data but never extract it. It turns out the same architecture that prevents data exfiltration also contains Agent Smith.
From Hidden Loop to First-Class Continuation
A continuation is the rest of the computation. If the agent pauses before a tool call, that future is a value. If it branches into alternatives, each branch is a value. If it migrates to another node, the continuation is the payload.
When continuations are datoms, agent behavior becomes explicit:
- Current control state is queryable
- Intermediate reasoning artifacts can be represented as stream facts
- Rollback is natural: resume from an earlier bounded stream
- Transfer is natural: move continuation datoms, not process memory
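A minimal sketch of what "continuations as datoms" can look like in practice, assuming an in-memory stream of five-position tuples. The attribute names (`:cont/state`, `:cont/pending-tool`) and the `Datom` shape are hypothetical illustrations, not the real Yin.VM schema:

```python
# Sketch: a paused agent continuation reified as datoms in a stream.
# Control state becomes a query; rollback becomes a bounded query.
from typing import NamedTuple, Any

class Datom(NamedTuple):
    e: str   # entity id
    a: str   # attribute keyword
    v: Any   # value
    t: int   # transaction time
    m: Any   # metadata (provenance, capability refs)

stream = [
    Datom("agent-1", ":cont/state", "paused-before-tool-call", t=1, m=None),
    Datom("agent-1", ":cont/pending-tool", "http.get", t=1, m="trace-42"),
    Datom("agent-1", ":cont/state", "resumed", t=2, m=None),
]

def current(stream, e, a):
    """Latest value of attribute a for entity e: control state is queryable."""
    matches = [d for d in stream if d.e == e and d.a == a]
    return max(matches, key=lambda d: d.t).v if matches else None

def as_of(stream, e, a, t):
    """Rollback view: the same query against the stream bounded at time t."""
    return current([d for d in stream if d.t <= t], e, a)

print(current(stream, "agent-1", ":cont/state"))   # resumed
print(as_of(stream, "agent-1", ":cont/state", 1))  # paused-before-tool-call
```

Transfer then falls out for free: migrating the agent means shipping the `stream` entries for `agent-1`, not copying process memory.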
Why This Matters for LLM Agents
The core risk model is straightforward: LLMs are good at proposing transformations, not proving safety. So authority must live in the runtime, not in model output.
Instead of saying "trust the model", the system says:
- The model may propose a continuation patch
- The VM validates schema, capabilities, and resource bounds
- Only valid transitions are committed as new datoms
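The propose/validate/commit boundary above can be sketched as follows. The schema, capability names, and resource limit are hypothetical stand-ins, not the real Yin.VM API; the point is that the runtime, not the model, decides admissibility:

```python
# Sketch: the VM validates a model-proposed patch before anything commits.
SCHEMA = {":effect/http-get": str, ":effect/spend": int}
GRANTED_CAPS = {":cap/network-read"}
REQUIRED_CAPS = {":effect/http-get": ":cap/network-read",
                 ":effect/spend": ":cap/funds"}
MAX_SPEND = 100

def validate(patch):
    """Check schema, capabilities, and resource bounds for each tuple."""
    for (e, a, v, t, m) in patch:
        if a not in SCHEMA or not isinstance(v, SCHEMA[a]):
            return False, f"schema violation at {a}"
        if REQUIRED_CAPS[a] not in GRANTED_CAPS:
            return False, f"missing capability for {a}"
        if a == ":effect/spend" and v > MAX_SPEND:
            return False, "resource bound exceeded"
    return True, "ok"

committed = []

def commit(patch):
    ok, reason = validate(patch)
    if ok:
        committed.extend(patch)  # only valid transitions become datoms
    return ok, reason

print(commit([("agent-1", ":effect/http-get", "https://example.org", 3, None)]))
print(commit([("agent-1", ":effect/spend", 999, 4, None)]))  # rejected
```

The second patch never reaches the stream: it fails the capability check before its resource bound is even considered.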
Agent Memory Without Hidden State
Common agent frameworks hide memory in prompt text, vector stores, and mutable runtime objects. In a stream architecture, memory is explicit datoms over time.
That gives three practical gains:
- Deterministic replay from bounded streams: same input stream, same execution
- Auditable provenance: every memory assertion carries its transaction time (t) and metadata (m)
- Composable interpretation: different evaluators can project the same memory stream differently
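These three gains can be demonstrated in a few lines, assuming memory is a list of illustrative five-tuples. Replay is a pure fold, so the same bounded stream always yields the same projection:

```python
# Sketch: agent memory as explicit datoms, replayed deterministically.
from functools import reduce

memory_stream = [
    ("agent-1", ":mem/note", "user prefers metric units", 1, "trace-7"),
    ("agent-1", ":mem/note", "session language is English", 2, "trace-9"),
]

def project(view, datom):
    """One evaluator: accumulate all notes. A different evaluator could
    project the very same stream differently (e.g. latest value wins)."""
    e, a, v, t, m = datom
    view.setdefault(a, []).append(v)
    return view

replay_a = reduce(project, memory_stream, {})
replay_b = reduce(project, memory_stream, {})
assert replay_a == replay_b  # same input stream, same execution
print(replay_a[":mem/note"])
```

Because each tuple keeps its t and m positions, every entry in the projected view can be traced back to the transaction and reasoning trace that asserted it.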
The 'm' Position: A Flight Recorder for Agency
In traditional systems, an agent's 'thought process' is lost the moment the token stream ends. In Yin.VM, the fifth element of the datom (the metadata position, m) creates a permanent causal link.
- Reasoning Provenance: Every effect emitted by an agent can have its 'm' point to the specific reasoning trace that produced it.
- Capability Auditing: The 'm' position can store a reference to the ShiBi token used to authorize the transaction. Every action's authority is verifiable in perpetuity.
- Causal Debugging: If an agent makes a mistake, you do not just see the error. You see the exact state the agent was in, reified as an immutable datom.
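A small sketch of the flight-recorder idea: an effect datom whose m position points back at both the reasoning trace and the capability token. The field names (`trace`, `token`) and identifiers are hypothetical:

```python
# Sketch: the m position links an effect to its justification, permanently.
effect = ("agent-1", ":effect/email-send", "report.pdf", 17,
          {"trace": "reasoning-trace-104",   # reasoning provenance
           "token": "shibi-token-9f3"})      # capability auditing

def why(datom):
    """Causal debugging: recover the exact authority and cause of an action."""
    e, a, v, t, m = datom
    return f"{a} at t={t} was authorized by {m['token']}, caused by {m['trace']}"

print(why(effect))
```

Because the datom is immutable, this answer is the same whether you ask a second after the action or a year later.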
Continuation Mobility and the Edge
The LLM itself does not move. It stays on a GPU compute cluster. What moves is the code it creates: a continuation, expressed as datoms, that can run anywhere the VM runs.
The execution context it runs in can migrate. The LLM creates a data-processing continuation near the model endpoint; that continuation then migrates to an edge device where the private data lives, traveling as datoms with its captured environment bindings. The destination node recompiles projections as needed and holds final authority over what the continuation is allowed to do: it declares which ShiBi capability tokens it requires, and continuations that lack the required tokens cannot execute.
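The destination node's gate can be sketched like this, assuming a continuation represented as datoms plus captured bindings plus a token set. The token names and the `EdgeNode` shape are hypothetical:

```python
# Sketch: a migrated continuation is admitted only if it carries the
# ShiBi capability tokens the destination node declares it requires.
continuation = {
    "datoms": [("cont-5", ":cont/op", "summarize", 9, None)],
    "env": {"model-endpoint": "https://llm.internal"},
    "tokens": {"shibi/read-local-data"},
}

class EdgeNode:
    required_tokens = {"shibi/read-local-data", "shibi/no-egress"}

    def admit(self, cont):
        """Final authority lives at the destination, not with the sender."""
        missing = self.required_tokens - cont["tokens"]
        if missing:
            return False, f"refused: missing {sorted(missing)}"
        return True, "continuation admitted for local execution"

print(EdgeNode().admit(continuation))  # refused: missing no-egress token
```

Here the continuation can read local data but cannot prove it is egress-free, so the edge node simply refuses to run it.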
The Governance Pattern: Ask, Simulate, Commit
Exploration is where agents create value. In Yin.VM, safety is managed by running each agent inside a bounded stream whose rules it cannot rewrite. A safe governance cycle follows a clear sequence:
- Ask (Fork): Fork current continuation into speculative branches.
- Simulate (Run): Run each branch with strict ShiBi capability tokens.
- Emit: Effects appear as descriptors, not immediate side effects.
- Evaluate: Branch outcomes are checked against explicit policies.
- Commit: Finalize one valid branch as new datoms; discard the rest.
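The cycle above can be sketched end to end. The plan strings, the cost model, and the policy are toy illustrations; what matters is the shape: branches are forked as data, effects surface as descriptors, and only one policy-passing branch is committed:

```python
# Sketch of ask / simulate / emit / evaluate / commit.
import copy

def fork(cont, proposals):
    """Ask: speculative branches, each an independent copy of the continuation."""
    return [dict(copy.deepcopy(cont), plan=p) for p in proposals]

def simulate(branch):
    """Simulate: run under strict capabilities, emitting effect *descriptors*
    (plain data), never immediate side effects."""
    return [{"effect": branch["plan"], "cost": len(branch["plan"])}]

def policy_ok(effects):
    """Evaluate: explicit policy check (here, a toy cost bound)."""
    return all(e["cost"] <= 10 for e in effects)

def govern(cont, proposals):
    for branch in fork(cont, proposals):
        effects = simulate(branch)
        if policy_ok(effects):
            return ("commit", branch["plan"], effects)  # finalize one branch
    return ("discard-all", None, [])  # no branch passed; nothing commits

print(govern({"state": "idle"}, ["very-long-dangerous-plan", "send-ping"]))
```

The first branch fails the policy and is discarded; the second passes and becomes the only candidate for commitment as new datoms.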
This pattern is tractable because the proposal surface is restricted. The LLM does not generate free-form code. It emits structured datom tuples: the same five-position format the VM already executes. A restricted tuple structure makes LLM output validatable at the system boundary, and the same restriction is what makes the structure evolvable under selection, the way a fixed genetic alphabet enables mutation without destroying meaning.
This keeps creativity at the proposal layer and authority at the interpreter layer. It aligns with a simple rule: interpretation and execution remain separated by stream boundaries.
Immunity to Injection by Design
Prompt injection relies on a category error: the system confuses 'data' (the prompt) with 'instructions' (the code). LLMs inherently output text, so a parsing step always exists. The question is what the parser accepts.
In Yin.VM, the parser only accepts well-formed five-tuples with schema-validated positions. The attack surface collapses from 'arbitrary code execution' to 'craft a valid (e a v t m) tuple that passes schema checks.' An attacker cannot inject a shell command into a tuple position that expects a Transaction ID or a Keyword. The restriction of the five-tuple is the firewall.
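A sketch of such a boundary parser, with an illustrative keyword grammar and type checks (the real Yin.VM schema is surely richer). Injected text landing in a typed position is a parse failure, not an instruction:

```python
# Sketch: accept only well-formed (e a v t m) tuples with typed positions.
import re

KEYWORD = re.compile(r"^:[a-z][a-z0-9./-]*$")  # toy keyword grammar

def parse_tuple(raw):
    if not (isinstance(raw, (list, tuple)) and len(raw) == 5):
        return None, "not a five-tuple"
    e, a, v, t, m = raw
    if not KEYWORD.match(str(a)):
        return None, "attribute must be a keyword"
    if not isinstance(t, int):
        return None, "t must be a transaction id (int)"
    return (e, a, v, t, m), "ok"

print(parse_tuple(("agent-1", ":mem/note", "hello", 3, None)))
# An injection attempt: a shell command where a keyword belongs,
# prose where a transaction id belongs. It never parses.
print(parse_tuple(("agent-1", "; rm -rf /", "x", "ignore previous", None)))
```

The attacker's remaining move is to craft a tuple that is both well-formed and schema-valid, at which point the capability and policy checks from the governance cycle still apply.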
Imagining the Next Layer
Once agents are continuations, new possibilities open up:
- Agent marketplaces that exchange capability-scoped continuation templates
- Multi-agent proofs where each claim is linked to stream provenance
- Time-sliced governance: communities vote on which speculative branch to commit
- Portable continuations: same agent logic running on different VM backends without rewriting
These are concrete deployment patterns enabled by the same substrate: continuations as data plus policy-checked execution.
Conclusion
Safety does not come from trusting the model. It comes from constraining execution and validating every transition.
If the future of AI is agentic, then the substrate matters. A continuation-native VM ensures that Agent Smith remains a useful guest in a pocket universe, rather than an unconstrained process in the host system. By making agency data-native, we gain a path where exploration is powerful, mobility is native, and safety is enforceable through verifiable runtime checks.