Russian Dolls: The Fractal Symmetry of Yin.VM and LLMs
If you look closely at the architecture of a secure virtual machine and the architecture of a Large Language Model interacting with tools, you will find they are structurally identical.
Both are, at their core, isolated reasoning engines that cannot natively touch the outside world. They are locked in a room, processing symbols. For the Yin.VM, those symbols are AST datoms being reduced in a CESK (Control, Environment, Store, Continuation) machine. For an LLM, those symbols are tokens being predicted by transformer weights. Neither engine has any native concept of a file system, a network socket, or a database.
The Boundary and the Emission
How, then, do these engines do useful work? They rely on a host environment, and they communicate with that host across a strict boundary using data effects.
When code (compiled via Yang) executing inside Yin.VM wants to fetch a webpage, it halts execution. It cannot execute an HTTP request itself. Instead, it emits a data structure representing its intent:
{:effect :io/http
:url "https://example.com"}This is a platform-agnostic, pure data value. The Clojure, Dart, or JS host intercepts this map, executes the side effect using its native networking libraries, and feeds the resulting bytes back to the VM, un-parking (resuming) the suspended continuation so the agent can proceed.
Now, consider how an LLM agent (like Agent Tzu) uses tools via the OpenAI-compatible Chat Completions API. When the LLM decides it needs to fetch a webpage, it halts generation and emits a JSON structure representing its intent. While the LLM emits JSON rather than the EDN map shown above, the mechanism is the same:
{
"name": "http_fetch",
"arguments": "{\"url\": \"https://example.com\"}"
}The agent loop intercepts this JSON, executes the side effect on the host machine, and appends the result to the context window as a "role": "tool" message, resuming the LLM's inference.
Interpretation > Abstraction
Traditional software architecture is built on abstraction. When you write a function like (fetch-url "https://example.com"), the intent and the execution are collapsed into a single, opaque black box. You cannot easily serialize that function call, send it across a network, or pause it mid-flight. The causal link between the call and the result is hidden inside the language runtime's call stack.
In datom.world, we favor interpretation. By reifying intent as observable data, a glass box, we keep the system's machinery visible. The agent doesn't "call a function"; it emits a fact about what it wants to do. This separation of intent from implementation is what allows the Russian doll stack to function: each layer is a specialized engine designed to process the symbols of the layer beneath it.
Stigmergy at Multiple Scales
The architectural pattern here is stigmergy: coordination without direct communication, achieved by modifying a shared environment. Everything is data, and explicit causality is favored over implicit assumptions.
Because of this symmetry, when we port an LLM agent like Agent Tzu to run inside the Yin.VM, we create a layered chain of interpreters:
- The host OS (Clojure/JS/Dart) interprets the Yin.VM data effects (performing actual network I/O).
- The Yin.VM interprets the Universal AST datoms (suspending and resuming continuations based on the host's responses).
- The Agent Logic (compiled to AST) interprets the LLM's JSON tool calls.
- The LLM (on the other side of the API boundary) interprets the human's prompt.
At every boundary layer, the rule is the same: communication happens only by emitting data for an outer interpreter to act upon. The LLM isn't "calling a function"; it is generating a projection of its internal state into a defined coordinate system. The Yin.VM isn't "executing IO"; it is yielding a data structure.
The Power of the Sandbox
This strict adherence to Interpretation > Abstraction guarantees that the agent's network requests, file reads, and API calls are asynchronous, suspendable, and perfectly observable data effects. If a process crashes while waiting for the LLM to reply, the agent's continuation sits safely in the database, ready to be resumed later, or even migrated to another node.
By aligning the architecture of our agents with the architecture of our virtual machine, we ensure that the entire stack remains a glass box. As intelligence scales, that intelligence remains grounded in explicit, causal data streams.