The Babashka Path: Why I Switched Yin.VM from Rust to Clojure
The Rust Mistake
I started Yin.VM in Rust. The reasoning seemed sound: building a high-performance virtual machine requires low-level control, memory safety, speed, and cross-platform portability. Rust offered all four. The problem? I'm not a Rust expert.
What followed was predictable: months spent fighting the borrow checker, struggling with lifetimes, reimplementing basic data structures, and learning a language while simultaneously trying to build a novel VM architecture. Progress crawled. The vision was clear, but the foundation was quicksand.
Meanwhile, Clojure—a language I'm actually competent in—sat unused. The persistent data structures, powerful macros, and excellent REPL-driven development workflow were all available. But the assumption held: "serious systems need low-level languages."
That assumption was wrong.
The Babashka Lesson
Babashka is a fast-starting Clojure scripting environment built by Michiel Borkent. At its core is SCI (Small Clojure Interpreter), a Clojure interpreter written in Clojure; Babashka compiles it to a native binary with GraalVM to get its fast startup.
On the surface, this seems circular: why implement Clojure in Clojure? Why not write the interpreter in C or Rust for maximum performance?
Because Babashka didn't need maximum performance. It needed:
- Fast iteration - REPL-driven development beats compile cycles
- Rich ecosystem - Leverage existing Clojure libraries
- Maintainability - Write in a language the author knows deeply
- Focus - Spend time on novel features, not reinventing wheels
By standing on the shoulders of the Clojure/JVM ecosystem and writing in Clojure, Babashka achieved something remarkable: a production-ready tool with fast startup times, extensive library support, and a small, maintainable codebase. The constraint of using a "slower" language led to better design decisions.
The LuxLang Parallel
LuxLang took the same path. It's a Lisp that compiles to multiple targets (JVM, JavaScript, Python, Ruby, Lua). Instead of building a custom runtime from scratch, it started by bootstrapping on the JVM.
The result:
- Self-hosted compiler (Lux compiles itself)
- Multiple backend targets without rewriting the frontend
- Fast development velocity from day one
- Production-ready within a reasonable timeframe
LuxLang proved that ambitious language projects don't need to start from bare metal. They need to start from solid ground.
My Pivot
After months of slow progress in Rust, the decision became clear: switch to Clojure and follow the Babashka path.
The new architecture:
- Implement Yin.VM in Clojure - Leverage my actual expertise
- Use Datascript for AST persistence - A mature datalog database instead of building DaoDB from scratch
- Bootstrap from the JVM - Get a working system quickly
- Progressive enhancement - Self-host later, once the core is proven
This isn't admitting defeat. It's admitting reality.
Why This Matters
1. Competence Beats Performance
A working system in a "slower" language I know beats a non-existent system in a "faster" language I don't.
Rust's performance advantage is meaningless if I spend 80% of my time debugging lifetime errors instead of implementing features. Clojure's slower execution speed is irrelevant when the REPL enables 10x faster iteration.
The bottleneck wasn't the language's runtime. It was my expertise.
2. Datascript Removes Uncertainty Cheaply
The core hypothesis of Yin.VM—that ASTs can be stored as queryable datoms—is unproven. Building DaoDB from scratch in Rust meant months of work before validating the idea.
Datascript changes the equation. It's a mature, battle-tested datalog database written in portable Clojure, so it runs on both the JVM and JavaScript. It can:
- Store AST nodes as entities with relationships
- Query with datalog (the same query language I'm planning for DaoDB)
- Run in-memory for fast iteration
- Provide a compatibility target if I eventually build DaoDB
Within hours, not months, I can test the viability of "AST as datoms":
(require '[datascript.core :as d])

;; Define AST schema.  Nested maps in a transaction require ref-typed
;; attributes in DataScript, so every child-bearing attribute is a ref.
(def ast-schema
  {:node/id        {:db/unique :db.unique/identity}
   :node/type      {}
   :node/parent    {:db/valueType :db.type/ref}
   :node/children  {:db/valueType :db.type/ref
                    :db/cardinality :db.cardinality/many}
   :node/params    {:db/valueType :db.type/ref
                    :db/cardinality :db.cardinality/many}
   :node/body      {:db/valueType :db.type/ref}
   :node/condition {:db/valueType :db.type/ref}
   :node/args      {:db/valueType :db.type/ref
                    :db/cardinality :db.cardinality/many}})

;; Create AST database
(def conn (d/create-conn ast-schema))

;; Store a function AST
(d/transact! conn
  [{:node/id     "factorial"
    :node/type   :lambda
    :node/params [{:node/type :param :node/name "n"}]
    :node/body   {:node/type :if
                  :node/condition {:node/type :call
                                   :node/fn   "="
                                   :node/args [{:node/type :var :node/name "n"}
                                               {:node/type :lit :node/value 0}]}}}])

;; Query for all function calls
(d/q '[:find ?call ?fn-name
       :where
       [?call :node/type :call]
       [?call :node/fn ?fn-name]]
     @conn)

If it works, I keep building. If it doesn't, I pivot quickly. Either way, discovery happens in days, not months.
3. Focus on What's Actually Novel
Yin.VM's innovation isn't:
- A datalog database (Datomic, Datascript, and others exist)
- Persistent data structures (Clojure has these)
- A runtime (the JVM is mature and battle-tested)
The novel parts are:
- Universal AST as the execution substrate - Code across languages shares a semantic foundation
- Continuations as mobile agents - Computation moves to data
- Queryable runtime state - Datalog introspection of live execution
By using Clojure and Datascript, I can focus 100% of my effort on these novel aspects instead of reimplementing infrastructure that already exists.
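The "continuations as mobile agents" idea can be pictured with a toy sketch: a suspended computation reified as plain data that could be serialized, shipped to another node, and resumed where the data lives. The shapes and names here are hypothetical illustrations, not Yin.VM's actual design.

```clojure
;; Hypothetical sketch: a continuation as plain data.  The suspended
;; computation records which operation is in flight, what has already
;; been accumulated, and what remains.  Because it is just a map, it
;; can be printed, stored, or sent over the wire, then resumed.
(defn suspend [op pending done]
  {:cont/op op :cont/pending pending :cont/done done})

(defn resume [{:keys [cont/op cont/pending cont/done]}]
  (apply ({:sum +} op) (into done pending)))

;; (resume (suspend :sum [3 4] [1 2]))
;; => 10
```

A real implementation would carry environments and program counters rather than raw values, but the principle is the same: the continuation is data, so computation can move to where the data is.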
4. Self-Hosting Becomes Tractable
The path to self-hosting is the same regardless of implementation language. But getting there faster matters.
In Clojure, the self-hosting roadmap is straightforward:
- Phase 1: Implement Yin.VM in Clojure, running on the JVM
- Phase 2: Implement the Yang compiler (Clojure → Universal AST)
- Phase 3: Use Yang to compile Yin.VM's own source code to Universal AST
- Phase 4: Yin.VM interprets/compiles its own Universal AST representation
- Phase 5: Bootstrap complete—Yin.VM runs itself
This is exactly how LuxLang achieved self-hosting. Start on a mature platform, build the compiler, then use the compiler to compile itself.
The Experimental Validation Plan
The Clojure + Datascript approach enables rapid experimentation. Here's my plan:
Phase 1: Validate Core Hypothesis (Weeks 1-4)
Experiment 1: Storage Overhead
Measure memory overhead of representing ASTs as datoms vs traditional tree structures. Use real-world code samples (Python stdlib, JavaScript libraries). Is 3x overhead acceptable? Where's the breaking point?
(defn compare-representations [ast]
  ;; ast->datascript is the (to-be-written) converter from plain
  ;; AST maps to a DataScript database of datoms.
  (let [tree-size  (count (pr-str ast))
        datom-db   (ast->datascript ast)
        datom-size (count (pr-str @datom-db))]
    {:tree  tree-size
     :datom datom-size
     :ratio (/ datom-size tree-size)}))

Experiment 2: Query Expressiveness
Can I naturally express compiler analyses as datalog queries?
;; Dead code detection: lambdas that no call node targets
(d/q '[:find ?node
       :where
       [?node :node/type :lambda]
       (not [?call :node/target ?node])]
     @conn)

;; Inline opportunities: small, single-use lambdas.
;; single-use is a datalog rule, supplied via the % input.
(d/q '[:find ?fn
       :in $ %
       :where
       [?fn :node/type :lambda]
       [?fn :node/size ?size]
       [(< ?size 10)]
       (single-use ?fn)]
     @conn
     rules)

If these queries feel awkward or perform poorly, that's critical feedback.
Experiment 3: Incremental Compilation
Can I express optimization passes as declarative rules?
;; all-literal and compute are further rules to be defined
(def constant-folding-rules
  '[[(constant-foldable ?node ?result)
     [?node :node/type :call]
     [?node :node/fn ?op]
     [(contains? #{+ - * /} ?op)]
     [?node :node/args ?args]
     (all-literal ?args)
     (compute ?op ?args ?result)]])

Phase 2: Build a Working System (Weeks 5-12)
Implement a minimal language (simple Lisp/Python subset) that:
- Parses to Universal AST
- Stores in Datascript
- Renders to multiple syntaxes
- Evaluates directly from AST
- Supports lightweight continuations
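The "evaluates directly from AST" step can be sketched with a toy evaluator over map-shaped nodes. The node shapes (`:node/type`, `:node/value`, and so on) are hypothetical stand-ins, not the real Universal AST schema:

```clojure
;; Minimal sketch: evaluate a tiny expression language directly from
;; its AST maps, with no intermediate bytecode.
(defn eval-node [env node]
  (case (:node/type node)
    :lit  (:node/value node)                       ; literal: return its value
    :var  (get env (:node/name node))              ; variable: look up in env
    :if   (if (eval-node env (:node/condition node))
            (eval-node env (:node/then node))
            (eval-node env (:node/else node)))
    :call (let [f    (get {'+ + '* * '= =} (:node/fn node))
                args (map #(eval-node env %) (:node/args node))]
            (apply f args))))

;; (eval-node {"n" 3}
;;            {:node/type :call :node/fn '+
;;             :node/args [{:node/type :var :node/name "n"}
;;                         {:node/type :lit :node/value 4}]})
;; => 7
```

The same node maps could be the entity maps transacted into Datascript, so evaluation and querying share one representation.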
Phase 3: Decide on DaoDB
After validation, the data will show whether I need DaoDB:
Build DaoDB if:
- Storage overhead >5x becomes problematic
- Query performance is a bottleneck
- Distributed persistence is required (Datascript is in-memory only)
- Append-only immutable history (like Datomic) is essential
Stick with Datascript if:
- Overhead is acceptable
- Performance is sufficient
- In-memory storage works for my use cases
- Maturity and community support matter more than custom optimization
If I build DaoDB later, Datascript becomes my compatibility baseline. The API will be compatible, making migration straightforward.
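One way to keep that migration cheap is a small storage protocol that the rest of Yin.VM codes against. This is a hypothetical sketch (the protocol and names are mine, not an existing API), shown with a trivial in-memory backend; a DataScript-backed record could satisfy the same interface by delegating to d/transact! and d/q:

```clojure
;; Hypothetical abstraction layer: Yin.VM talks to AstStore, never to
;; a concrete database, so DataScript can later be swapped for DaoDB.
(defprotocol AstStore
  (store-nodes!  [this nodes] "Add AST node maps to the store.")
  (nodes-of-type [this t]     "Return ids of nodes whose :node/type is t."))

;; Trivial in-memory implementation for illustration only.
(defrecord AtomStore [state]
  AstStore
  (store-nodes! [_ nodes]
    (swap! state into (map (juxt :node/id identity) nodes)))
  (nodes-of-type [_ t]
    (for [[id n] @state :when (= t (:node/type n))] id)))
```

A DaoDB-backed record would later slot in behind the same two functions, which is what makes Datascript a workable compatibility baseline.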
The Hard Lesson
Here's what the Rust experience taught me:
Ambitious projects fail not because the vision is wrong, but because the foundation is unsuitable.
My vision for Yin.VM was always solid: a VM where ASTs are queryable data, where continuations are mobile agents, where code is a universal semantic substrate. That vision didn't change.
What changed was accepting that building it in Rust—a language I'm not expert in—was slowing everything down. The "right" choice on paper (low-level control, performance) was the wrong choice in practice (unfamiliarity, slow iteration).
Switching to Clojure wasn't giving up on performance. It was prioritizing progress. And here's the surprising part: I might never need to rewrite in Rust.
LuxLang proved this. It started in Clojure on the JVM, achieved self-hosting, and now compiles to multiple targets—all without ever rewriting the compiler in a lower-level language. The Clojure implementation is the production implementation.
Could I circle back to Rust later? Maybe. But LuxLang demonstrates it might not even be necessary. Self-hosting from a Clojure base is entirely viable.
Why Babashka and LuxLang Got It Right
Both projects understood something fundamental:
The goal isn't to build everything from scratch. The goal is to build what's novel on top of what's proven.
- Babashka didn't reimplement the JVM. It leveraged GraalVM's native-image and focused on fast-starting Clojure scripts.
- LuxLang didn't build a custom runtime. It compiled to existing platforms (JVM, JavaScript) and focused on the language design.
- Yin.VM doesn't need a custom datalog database yet. I can use Datascript and focus on AST execution semantics.
This isn't compromise. It's focus. Babashka is production-ready. LuxLang is self-hosted and still implemented in Clojure. Both achieved their goals by building on mature foundations—and neither needed to rewrite in C or Rust to be successful.
The Path Forward
My immediate work:
- Implement Yin.VM core in Clojure
- Store ASTs in Datascript with a clean abstraction layer
- Build the Yang compiler (Clojure → Universal AST)
- Validate the core hypothesis with real code samples
- Measure everything: overhead, query performance, developer ergonomics
Within a month, I'll know if "AST as datoms" is viable. Within three months, I'll have a working prototype. Within six months, self-hosting becomes possible.
And then? The system might stay in Clojure indefinitely, like LuxLang. Or I might eventually port performance-critical components to Rust once I understand exactly where the bottlenecks are. The difference: I'll make that decision based on data from a working system, not speculation about what might be needed.
If I'd continued the Rust path? None of these milestones would be realistic. I'd still be fighting the borrow checker instead of implementing features.
The Meta-Lesson
Use the tools you know. Stand on platforms that work. Build only what's truly novel.
The temptation with visionary projects is to build everything from first principles. "If I'm building something revolutionary, shouldn't I use revolutionary tools?"
No.
Revolutionary ideas are hard enough. Implementing them with unfamiliar tools makes an already-difficult problem impossible.
Babashka used Clojure. LuxLang used the JVM. Yin.VM uses Clojure and Datascript. None of these choices diminish the novelty of what they're building. They enable it.
The fastest path to innovation isn't starting from zero. It's standing on the shoulders of giants and focusing your novel work on the pieces that truly need to be novel.
For Yin.VM, that means the Universal AST execution model, mobile continuations, and queryable runtime state. Everything else? Use what works.
Learn more:
- AST Datom Streams: Bytecode Performance with Semantic Preservation (how AST datoms achieve bytecode-like performance)
- Why yin.vm Succeeds Where Previous Attempts Failed (addressing skeptical objections to this architecture)
- DaoDB (the eventual production database, built after validation)
- Yin VM Documentation (technical deep dive)