AST as Higher Dimensional Construction of Datom Streams
Everything Is a Datom Stream
In datom.world, everything is a datom stream. Code, data, execution state, transformations—all are streams of immutable facts flowing through time.
A datom is an atomic fact: [entity attribute value transaction]. Immutable. Timestamped. Queryable.
Traditional Abstract Syntax Trees (ASTs) are trees. Static snapshots of code structure frozen in time. But what if we reconceive the AST not as a static tree, but as a materialized view over datom streams? What if the tree structure you see is just one projection of facts flowing through multiple dimensions simultaneously?
This is the core architectural principle of Yin.vm: ASTs are multidimensional materialized views of datom streams.
The Five Dimensions of AST Datoms
When you represent an AST as datoms, you unlock five fundamental dimensions:
1. Spatial Dimension: Structure
The traditional tree structure—parent-child relationships—becomes a graph of datoms:
[node-1 :ast/type :function-call]
[node-1 :ast/name "calculateTotal"]
[node-1 :ast/parent node-0]
[node-1 :ast/children [node-2 node-3]]
[node-2 :ast/type :variable]
[node-2 :ast/name "price"]
[node-3 :ast/type :literal]
[node-3 :ast/value 42]
Each node is an entity. Relationships are facts. The entire tree is queryable.
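To make "the tree is queryable" concrete, here is a minimal sketch that rebuilds the familiar nested tree from the flat facts above. It assumes datoms are held as plain Python tuples of (entity, attribute, value); the helper names entity and materialize are hypothetical and are not part of Yin.vm or DaoDB.

```python
# Hypothetical in-memory datoms as (entity, attribute, value) tuples,
# mirroring the example above. This is not DaoDB's storage format.
datoms = [
    ("node-1", ":ast/type", ":function-call"),
    ("node-1", ":ast/name", "calculateTotal"),
    ("node-1", ":ast/parent", "node-0"),
    ("node-1", ":ast/children", ["node-2", "node-3"]),
    ("node-2", ":ast/type", ":variable"),
    ("node-2", ":ast/name", "price"),
    ("node-3", ":ast/type", ":literal"),
    ("node-3", ":ast/value", 42),
]


def entity(datoms, eid):
    """Collect every attribute asserted about one entity into a dict."""
    return {a: v for e, a, v in datoms if e == eid}


def materialize(datoms, eid):
    """Recursively project the spatial dimension into a nested tree."""
    node = entity(datoms, eid)
    children = node.pop(":ast/children", [])
    node.pop(":ast/parent", None)  # parent links are implied by nesting
    node["children"] = [materialize(datoms, c) for c in children]
    return node


tree = materialize(datoms, "node-1")
```

The nested dict is derived data: it can be thrown away and rebuilt at any time, because the flat list of facts is what is actually stored.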
2. Temporal Dimension: Evolution
Every transformation of the AST is a transaction. The entire history is preserved:
; Transaction 100: Original parse
[node-1 :ast/type :variable :tx 100]
[node-1 :ast/name "x" :tx 100]
; Transaction 101: Type inference
[node-1 :type/inferred :string :tx 101]
; Transaction 102: Scope analysis
[node-1 :scope/binding scope-5 :tx 102]
; Transaction 103: Runtime evaluation
[node-1 :exec/value "hello" :tx 103]
You can query: "Show me how this node changed from parse to execution."
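As a rough illustration of that kind of question, the sketch below replays the four transactions and asks what was known about node-1 at a given point. It assumes datoms are (entity, attribute, value, tx) tuples and that a later assertion of the same attribute supersedes an earlier one; as_of and history are hypothetical helper names, not a DaoDB API.

```python
# Hypothetical (entity, attribute, value, tx) tuples from the example above.
datoms = [
    ("node-1", ":ast/type", ":variable", 100),
    ("node-1", ":ast/name", "x", 100),
    ("node-1", ":type/inferred", ":string", 101),
    ("node-1", ":scope/binding", "scope-5", 102),
    ("node-1", ":exec/value", "hello", 103),
]


def as_of(datoms, eid, tx):
    """View of one entity built only from facts asserted at or before tx."""
    view = {}
    for e, a, v, t in sorted(datoms, key=lambda d: d[3]):
        if e == eid and t <= tx:
            view[a] = v  # assumption: later assertions overwrite earlier ones
    return view


def history(datoms, eid):
    """Every fact about one entity, ordered by transaction."""
    facts = [(t, a, v) for e, a, v, t in datoms if e == eid]
    return sorted(facts, key=lambda f: f[0])


after_inference = as_of(datoms, "node-1", 101)
timeline = history(datoms, "node-1")
```

Nothing is mutated along the way: the view after type inference and the view after execution are both derived from the same list of facts.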
3. Type Dimension: Static to Dynamic
The industry treats static vs dynamic typing as a binary choice. But type information actually exists on a continuum of certainty—how confident we are that a value has a particular type:
[node-7 :type/declared :int :tx 100] ; High certainty (declared)
[node-7 :type/inferred :number :tx 101] ; Medium certainty (inferred)
[node-7 :type/runtime java.lang.Integer :tx 102] ; Low certainty (discovered)
[node-7 :type/certainty :static :tx 100] ; Certainty level
Certainty examples:
- x: int (high certainty, programmer declared)
- result = [] (medium certainty, inferred from usage)
- json.loads(input) (low certainty, only known at runtime)
- eval(user_input) (no certainty, could be anything)
Yin.vm unifies static and dynamic types by treating certainty as a property, not a language category. A map operation has the same semantics whether it's in Python (low certainty) or C++ (high certainty). The type system becomes metadata on nodes.
This works because the AST is canonical code, not bytecode. The AST node [node-42 :ast/type :map-operation] represents the semantic operation, and the type certainty [node-42 :type/certainty :static] is just metadata annotating how much we know. Static and dynamic become different annotations on the same canonical code.
This reveals that the line between static and dynamic was always artificial. Types are facts about certainty, not fundamental differences in semantics. Both coexist as datom streams in the same AST.
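One way to read "certainty as a property" is that a tool can simply ask for the most certain type fact available on a node. The sketch below assumes the ordering declared, then inferred, then runtime, taken from the comments in the example above. CERTAINTY_ORDER, best_known_type, and node-9 are hypothetical illustrations, not part of Yin.vm.

```python
# Assumed ranking, most certain first, following the example's comments.
CERTAINTY_ORDER = [":type/declared", ":type/inferred", ":type/runtime"]

# Hypothetical (entity, attribute, value, tx) tuples.
datoms = [
    ("node-7", ":type/declared", ":int", 100),
    ("node-7", ":type/inferred", ":number", 101),
    ("node-7", ":type/runtime", "java.lang.Integer", 102),
    ("node-9", ":type/runtime", "java.lang.String", 102),  # runtime-only node
]


def best_known_type(datoms, eid):
    """Return (attribute, type) for the most certain type fact, or None."""
    facts = {a: v for e, a, v, _tx in datoms if e == eid}
    for attr in CERTAINTY_ORDER:
        if attr in facts:
            return attr, facts[attr]
    return None


declared = best_known_type(datoms, "node-7")
discovered = best_known_type(datoms, "node-9")
```

Both nodes answer the same question through the same function; only the certainty of the answer differs.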
4. Language Dimension: Cross-Language Transformations
When you translate code between languages, the datom stream preserves the transformation lineage:
; Original Python AST node
[node-42 :ast/source-lang "Python" :tx 200]
[node-42 :ast/syntax "list_comprehension" :tx 200]
; Transformed to C++
[node-43 :ast/source-lang "C++" :tx 201]
[node-43 :ast/syntax "range_for_loop" :tx 201]
[node-43 :ast/transformed-from node-42 :tx 201]
; Semantic equivalence preserved
[node-42 :ast/semantics "map_operation" :tx 200]
[node-43 :ast/semantics "map_operation" :tx 201]
You can query: "Find all Python comprehensions that became C++ loops."
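That lineage question can be sketched as a small join over the :ast/transformed-from facts. The tuple representation and the helpers attr and comprehensions_to_loops are assumptions made for illustration; a real deployment would presumably express this as a Datalog query.

```python
# Hypothetical (entity, attribute, value, tx) tuples from the example above.
datoms = [
    ("node-42", ":ast/source-lang", "Python", 200),
    ("node-42", ":ast/syntax", "list_comprehension", 200),
    ("node-42", ":ast/semantics", "map_operation", 200),
    ("node-43", ":ast/source-lang", "C++", 201),
    ("node-43", ":ast/syntax", "range_for_loop", 201),
    ("node-43", ":ast/transformed-from", "node-42", 201),
    ("node-43", ":ast/semantics", "map_operation", 201),
]


def attr(datoms, eid, attribute):
    """Look up a single attribute value for an entity, or None."""
    for e, a, v, _tx in datoms:
        if e == eid and a == attribute:
            return v
    return None


def comprehensions_to_loops(datoms):
    """Pairs (source, target) where a Python comprehension became a C++ loop."""
    pairs = []
    for target, a, source, _tx in datoms:
        if a != ":ast/transformed-from":
            continue
        if (attr(datoms, source, ":ast/source-lang") == "Python"
                and attr(datoms, source, ":ast/syntax") == "list_comprehension"
                and attr(datoms, target, ":ast/source-lang") == "C++"
                and attr(datoms, target, ":ast/syntax") == "range_for_loop"):
            pairs.append((source, target))
    return pairs


lineage = comprehensions_to_loops(datoms)
# Both ends of each pair should still carry the same semantic tag.
preserved = all(
    attr(datoms, s, ":ast/semantics") == attr(datoms, t, ":ast/semantics")
    for s, t in lineage
)
```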
5. Execution Dimension: Code to Continuation
As code executes, runtime state becomes part of the datom stream:
; Function definition (static)
[fn-1 :ast/type :function :tx 100]
[fn-1 :ast/params [param-1 param-2] :tx 100]
; First invocation (runtime)
[call-1 :exec/function fn-1 :tx 300]
[call-1 :exec/args ["alice" 25] :tx 300]
[call-1 :exec/stack-frame frame-1 :tx 300]
; Paused as continuation
[call-1 :exec/continuation cont-1 :tx 301]
[cont-1 :exec/state :suspended :tx 301]
[cont-1 :exec/instruction-pointer 15 :tx 301]
; Migrated to another machine
[cont-1 :exec/migrated-to "host-2" :tx 302]
[cont-1 :exec/state :resumed :tx 303]
The entire execution trace is a stream of facts. Continuations are queryable data.
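If a continuation is only facts, then moving it between machines amounts to moving those facts. The sketch below is a loose illustration of that idea, not Yin.vm's migration protocol: it assumes tuple datoms, uses JSON as a stand-in wire format, and the helpers snapshot and current are hypothetical.

```python
import json

# Hypothetical (entity, attribute, value, tx) tuples from the example above.
datoms = [
    ("call-1", ":exec/function", "fn-1", 300),
    ("call-1", ":exec/args", ["alice", 25], 300),
    ("call-1", ":exec/stack-frame", "frame-1", 300),
    ("call-1", ":exec/continuation", "cont-1", 301),
    ("cont-1", ":exec/state", ":suspended", 301),
    ("cont-1", ":exec/instruction-pointer", 15, 301),
    ("cont-1", ":exec/migrated-to", "host-2", 302),
    ("cont-1", ":exec/state", ":resumed", 303),
]


def snapshot(datoms, eids, tx):
    """Select the facts about the given entities asserted at or before tx."""
    return [d for d in datoms if d[0] in eids and d[3] <= tx]


def current(datoms, eid):
    """Latest value per attribute for one entity (later tx wins)."""
    view = {}
    for e, a, v, _t in sorted(datoms, key=lambda d: d[3]):
        if e == eid:
            view[a] = v
    return view


# "Host 1" serializes the suspended continuation as of tx 301.
wire = json.dumps(snapshot(datoms, {"call-1", "cont-1"}, 301))

# "Host 2" decodes the same facts and can inspect where execution paused.
received = [tuple(d) for d in json.loads(wire)]
paused_at = current(received, "cont-1")[":exec/instruction-pointer"]
```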
ASTs as Materialized Views
Here's the key insight: an AST is not the source of truth. The datom stream is.
The AST you see, the tree structure with parent-child relationships, is a materialized view that projects the spatial dimension of the datom stream. But structure is only one of five orthogonal dimensions the same stream contains:
- Structure — graph topology (what we traditionally call the "AST")
- Time — evolution through transactions
- Types — certainty metadata
- Language — cross-language transformations
- Execution — runtime state
You can materialize different views over the same datom stream:
- A spatial view → traditional AST tree
- A temporal view → evolution history of a node
- A type view → certainty graph across the codebase
- A language view → transformation lineage
- An execution view → runtime call graph
Because these dimensions are orthogonal and stored as datoms, you can query any slice. Any combination of dimensions. Any point in time.
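A very rough way to picture "different views over the same stream" is to group one entity's facts by attribute namespace. This is only an approximation, since namespaces in the examples do not map one-to-one onto dimensions (for instance, :ast/source-lang belongs to the language dimension). The tuple format and the helpers namespace and views are hypothetical.

```python
from collections import defaultdict

# Hypothetical (entity, attribute, value, tx) tuples.
datoms = [
    ("node-1", ":ast/type", ":variable", 100),
    ("node-1", ":ast/name", "x", 100),
    ("node-1", ":type/inferred", ":string", 101),
    ("node-1", ":scope/binding", "scope-5", 102),
    ("node-1", ":exec/value", "hello", 103),
]


def namespace(attribute):
    """Extract the namespace part, e.g. ':type/inferred' gives 'type'."""
    return attribute.lstrip(":").split("/", 1)[0]


def views(datoms, eid):
    """Group one entity's facts into one small view per namespace."""
    out = defaultdict(dict)
    for e, a, v, _tx in datoms:
        if e == eid:
            out[namespace(a)][a] = v
    return dict(out)


node_views = views(datoms, "node-1")
```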
Querying Across Dimensions
Because the AST is a materialized view over datom streams, you can query any combination of dimensions:
Spatial + Temporal: "How did this node evolve?"
[:find ?attr ?value ?tx
:where
[?node :ast/id "node-42"]
[?node ?attr ?value ?tx]]
Shows the complete evolution of a single AST node across all transactions.
Type + Execution: "Where do certainty boundaries cross?"
[:find ?fn ?arg
:where
[?fn :type/certainty :static] ; High certainty function
[?call :exec/function ?fn]
[?arg :exec/parent ?call]
[?arg :type/certainty :dynamic]] ; Low certainty data
Finds where low certainty data (dynamic, runtime-discovered) flows into high certainty functions (static, compiler-enforced). Example: charge_card(json.loads(input)['amount']) where the function expects a declared type but receives runtime data.
This query is impossible in traditional VMs that erase one dimension or the other. Static languages lose "this came from dynamic source." Dynamic languages don't track "this function expects static guarantees."
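To show how the :where clauses in the certainty-boundary query behave as a join, here is a toy pattern matcher in Python. It is a teaching sketch under stated assumptions, not DaoDB's query engine: datoms are (entity, attribute, value) tuples, any string beginning with "?" is treated as a variable, and the sample entities arg-1 and arg-2 are invented for the example.

```python
def match(pattern, datom, binding):
    """Unify one pattern with one datom; return the extended binding or None."""
    out = dict(binding)
    for p, d in zip(pattern, datom):
        if isinstance(p, str) and p.startswith("?"):
            if p in out and out[p] != d:
                return None  # variable already bound to a different value
            out[p] = d
        elif p != d:
            return None  # constant in the pattern does not match
    return out


def query(find, where, datoms):
    """Join all patterns left to right and project the requested variables."""
    bindings = [{}]
    for pattern in where:
        bindings = [
            b2
            for b in bindings
            for d in datoms
            if (b2 := match(pattern, d, b)) is not None
        ]
    return {tuple(b[v] for v in find) for b in bindings}


# Hypothetical facts: one static function called with two arguments.
datoms = [
    ("fn-1", ":type/certainty", ":static"),
    ("call-1", ":exec/function", "fn-1"),
    ("arg-1", ":exec/parent", "call-1"),
    ("arg-1", ":type/certainty", ":dynamic"),
    ("arg-2", ":exec/parent", "call-1"),
    ("arg-2", ":type/certainty", ":static"),
]

where = [
    ("?fn", ":type/certainty", ":static"),
    ("?call", ":exec/function", "?fn"),
    ("?arg", ":exec/parent", "?call"),
    ("?arg", ":type/certainty", ":dynamic"),
]

crossings = query(["?fn", "?arg"], where, datoms)
```

Only arg-1 is reported: arg-2 reaches the same static function but is itself static, so it does not cross a certainty boundary.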
Language + Time: "How did code transform across languages?"
[:find ?node ?lang ?tx
:where
[?node :ast/transformed-from ?previous]
[?node :ast/source-lang ?lang ?tx]
[(> ?tx 1000)]]
Traces transformation lineage with timestamps. Shows when Python became C++.
The Key: Orthogonality
These queries work because dimensions are orthogonal. The same entity can carry facts about:
- A position in the structure graph
- A transaction timestamp
- A type certainty level
- A source language
- An execution state
All stored as facts. All queryable. All composable.
Why Datom Streams?
Why build ASTs this way? Because when everything is a datom stream, unexpected capabilities emerge:
- Time-travel — Query any historical state by filtering on transaction
- Continuations — Execution state is just another dimension, serializable as datoms
- Cross-language migration — Semantic meaning preserved across transformations
- Live introspection — Query running programs as databases
- Perfect auditability — Complete provenance of every transformation
None of these require special infrastructure. They're natural consequences of treating ASTs as materialized views over datom streams.
The Implementation
Yin.vm implements this using DaoDB, a Datalog database built for datom.world. The Yang compiler parses code and emits datom transactions. Each parse, each type inference, each transformation, each execution step adds datoms to the stream.
The AST you see is a materialized view. A Datalog query. Project different dimensions, get different views. Same underlying datom stream.
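As a stand-in for the parse step, the sketch below uses Python's standard ast module to walk an expression and emit one transaction's worth of datoms. This is not the Yang compiler: the attribute names are borrowed from the examples in this article, node types come out as Python's class names (such as Call and Name), and emit_datoms is a hypothetical helper.

```python
import ast
from itertools import count


def emit_datoms(source, tx):
    """Parse a Python expression and emit (entity, attribute, value, tx) datoms."""
    ids = count(1)
    datoms = []

    def visit(node, parent):
        eid = f"node-{next(ids)}"
        datoms.append((eid, ":ast/type", type(node).__name__, tx))
        if parent is not None:
            datoms.append((eid, ":ast/parent", parent, tx))
        if isinstance(node, ast.Name):
            datoms.append((eid, ":ast/name", node.id, tx))
        elif isinstance(node, ast.Constant):
            datoms.append((eid, ":ast/value", node.value, tx))
        for child in ast.iter_child_nodes(node):
            # Skip Load/Store context markers; they are not real tree nodes.
            if not isinstance(child, ast.expr_context):
                visit(child, eid)
        return eid

    visit(ast.parse(source, mode="eval").body, None)
    return datoms


datoms = emit_datoms("calculateTotal(price, 42)", tx=100)
```

Later passes such as type inference or execution would append further datoms with higher transaction numbers instead of rewriting these.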
Conclusion: Everything Is a Datom Stream
The core insight of datom.world: everything is a datom stream.
Code is not text. It's not a tree. It's a stream of immutable facts flowing through time and across dimensions. What we call an "AST" is just one materialized view over this stream—a projection of the spatial dimension.
But the same stream contains:
- Evolution history (temporal)
- Type certainty (type dimension)
- Cross-language transformations (language dimension)
- Runtime execution state (execution dimension)
By storing these as datoms, we can materialize any view. Query any slice. Compose any dimensions. The AST becomes queryable, portable, and alive.
This is not just a better data structure. It's a fundamental reconception: code as datom streams, ASTs as materialized views. This is the architectural foundation of datom.world.
Learn more:
- Yin.vm: Chinese Characters for Programming Languages (how the Universal AST enables cross-language interoperability)
- DaoDB (the Datalog database powering datom streams)
- Yin VM Documentation (technical deep dive)
- Yang Compiler (how Clojure code becomes datom streams)
- GitHub Repository (explore the implementation)