DaoDB Beyond Covered Indexes: A Moduli Space of Databases

DaoDB is easy to picture as a Datomic-shaped database with a canonical set of covered indexes (EAVT, AEVT, AVET, VAET) and a Datalog query layer on top.

That framing was serviceable, but it was too narrow. It smuggled one particular implementation into the definition of the system.

The stronger framing is this: DaoDB is not a single database. It is a family of databases. What the family shares is not one fixed index set, but a common substrate and a common discipline: immutable datom streams, explicit causality, bounded snapshots, and interpreters that construct useful views over those streams.

In one sentence, a DaoDB says: here are events, and here are sanctioned ways to construct and traverse structures from them.

Once it is framed that way, several consequences follow immediately. Covered indexes stop being essential. Datalog stops being the obvious core. Querying starts to look less like sending a sentence to a logic engine and more like compiling an access plan against the materialized views a particular DaoDB instance actually exposes.

The Stream Is the Substrate

The stream remains the invariant. Datoms arrive in transaction order as [e a v t m]. That is the canonical serialization format for DaoDB, the common causal history, the thing every DaoDB in the family must be able to observe and replay.

But a stream alone is not yet a database. A raw stream is only sequence. It records what was appended and in what order. A DaoDB adds sanctioned interpretation: how to bound the stream, how to reconstruct a snapshot, how to derive structures, how to traverse those structures, and which views are available or can be derived on demand.

So the distinction is simple:

  • Raw stream: here are events in order.
  • DaoDB: here are events, plus a contract for constructing and traversing structures from them.

That contract is what makes DaoDB a family of databases rather than a storage format.
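
To make the contract concrete, here is a minimal Clojure sketch of two of its clauses: bounding the stream and reconstructing a snapshot. The sample datoms, the names bounded and snapshot, and the rule that every datom is an assertion whose latest value wins are assumptions made for illustration. The m slot is left uninterpreted.

```clojure
;; Hypothetical sample stream in transaction order. Each datom is [e a v t m];
;; the m slot is ignored here because this sketch does not interpret it.
(def stream
  [[1 :user/name  "Alice"             100 nil]
   [1 :user/email "alice@example.com" 100 nil]
   [2 :user/name  "Bob"               101 nil]
   [1 :user/email "alice@example.org" 102 nil]])

(defn bounded
  "Bound the stream: keep the prefix of datoms with t <= as-of-t.
   Relies on the stream being in transaction order."
  [datoms as-of-t]
  (take-while (fn [[_ _ _ t _]] (<= t as-of-t)) datoms))

(defn snapshot
  "Reconstruct an entity map from the bounded stream.
   Assumption: assertions only, and the latest value for [e a] wins."
  [datoms as-of-t]
  (reduce (fn [db [e a v _ _]] (assoc-in db [e a] v))
          {}
          (bounded datoms as-of-t)))

(snapshot stream 101)
;; => {1 {:user/name "Alice", :user/email "alice@example.com"},
;;     2 {:user/name "Bob"}}
```

The raw vector is only sequence. The pair of functions is a very small interpretation contract: the same stream read at t = 102 yields a different snapshot without the stream itself changing.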

This model allows DaoDB to be understood as a kind of moduli space of databases. The canonical datom history stays fixed, but different DaoDB implementations occupy different points in that space by choosing different materialized views, maintenance strategies, physical layouts, and query surfaces.

One point in that space looks relational, with covered indexes and Datalog. Another looks document-oriented, with entity-centric materializations. Another looks columnar, time-series, graph-oriented, or logic-oriented. The substrate stays the same; what changes is the derived construction laid over it.

Covered Indexes Are Materialized Views

This is the key simplification. EAVT, AEVT, AVET, and VAET are not the essence of DaoDB. They are materialized views over the datom stream.

That sounds almost obvious once stated, but it matters. If those indexes are materialized views, then a DaoDB implementation is free to expose them, omit them, or replace them with other structures more appropriate to its workload.

A default DaoDB implementation may still choose the familiar covered indexes because they make a Datalog-flavored relational surface practical and fast. But another DaoDB might expose:

  • entity-centric maps
  • AST child and parent tables
  • graph adjacency structures
  • content-hash indexes
  • ordered positional collections
  • capability or provenance projections

All of these are still DaoDBs if they share the stream substrate and the view contract. They are different interpreters over the same kind of reality, different points in the same moduli space of admissible database designs.
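
As a rough illustration, the sketch below derives two different views from one stream: an AVET-style sorted set and a graph adjacency map. The function names, the :user/follows attribute, and the sample datoms are hypothetical, and the sketch assumes assertion-only datoms.

```clojure
;; Hypothetical sample stream of [e a v t m] datoms.
(def stream
  [[1 :user/email   "alice@example.com" 100 nil]
   [2 :user/email   "bob@example.com"   101 nil]
   [1 :user/follows 2                   102 nil]])

(defn avet-view
  "Covered index as a materialized view: tuples sorted by attribute,
   value, entity, t. Values under one attribute must be mutually comparable."
  [datoms]
  (into (sorted-set) (map (fn [[e a v t _]] [a v e t])) datoms))

(defn adjacency-view
  "Graph adjacency over one reference attribute:
   source entity -> set of target entities."
  [datoms ref-attr]
  (reduce (fn [g [e a v _ _]]
            (if (= a ref-attr)
              (update g e (fnil conj #{}) v)
              g))
          {}
          datoms))

(first (avet-view stream))
;; => [:user/email "alice@example.com" 1 100]

(adjacency-view stream :user/follows)
;; => {1 #{2}}
```

Neither structure is privileged. Both are folds over the same datoms, and an implementation could maintain either one, both, or neither.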

What This Means for Datalog

If covered indexes are optional, then Datalog can no longer define the whole family. Datalog assumes a relational observation surface. It does not require one exact physical layout, but it does require that the database can present its facts or derived tuples as relations that support joins, constraints, and fixed-point evaluation.

That remains extremely valuable. Datalog is still Datalog because of its semantics, and DaoDB should absolutely keep a Datalog layer where relational views are available.

But the role changes. Datalog becomes:

  • an optional relational query surface
  • a compatibility layer for databases that expose relational views
  • one front-end syntax among several possible front-ends

It is no longer the essence of DaoDB. The essence is lower-level: datom streams plus sanctioned structural interpretation.

This is not a rejection of Datalog. It is a clarification. Datalog is not removed. It is demoted from ontology to capability.

Why Traversal Looks More General

Once DaoDB is a family of databases with heterogeneous materialized views, traversal starts to look like the more general core query surface.

Relational queries ask: which tuples satisfy these joins? Traversal queries ask: how do I move through this structure? When the exposed view may be a graph, tree, ordered sequence, entity map, or hybrid structure, traversal is often the more natural primitive.

A useful comparison: a DaoDB structural API may feel closer to Specter, Meander, or even clojure.walk than to a pure Datalog engine.

The reason is simple. Most exposed structures can be treated as sequential and/or associative at the host-data level. That makes a Specter-like navigation algebra a plausible universal substrate for moving through materialized views.

But plain host traversal is still not enough. DaoDB views carry semantics that ordinary nested maps and vectors do not:

  • snapshot and history boundaries
  • reference traversal versus key lookup
  • ordering guarantees
  • multiplicity and optionality
  • materialized versus derived-on-read execution
  • capability and cost differences between views

So the right conclusion is not that DaoDB should literally reuse raw Specter paths over arbitrary Clojure values. The right conclusion is that DaoDB should adopt a Specter-like traversal model over declared view capabilities.
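
One way to picture traversal over declared capabilities is a path interpreter that refuses any step the view has not declared. The sketch below uses plain Clojure rather than Specter itself; the step names :entity, :attr, and :ref, the view layout, and the sample data are assumptions for illustration.

```clojure
;; A hypothetical view: data plus the set of traversal steps it declares.
(def entity-view
  {:kind  :associative
   :steps #{:entity :attr :ref}
   :data  {1 {:user/name "Alice" :user/manager 2}
           2 {:user/name "Bob"}}})

(defn step
  "Apply one [op arg] step to the current position in the view.
   Throws if the view has not declared the step."
  [{:keys [steps data]} pos [op arg]]
  (when-not (contains? steps op)
    (throw (ex-info "Step not declared by this view" {:op op})))
  (case op
    :entity (get data arg)      ; key lookup by entity id
    :attr   (get pos arg)       ; read an attribute of the current entity
    :ref    (get data pos)))    ; treat the current value as a reference

(defn traverse
  "Run a path (a vector of steps) against a view, starting from nothing."
  [view path]
  (reduce (partial step view) nil path))

(traverse entity-view
          [[:entity 1] [:attr :user/manager] [:ref] [:attr :user/name]])
;; => "Bob"
```

The distinction between :attr and :ref is the point: following a reference is a different declared step from a key lookup, even though both are a get at the host-data level.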

Named Views, Not Infinite Magic

At this point an important boundary appears. A structural view API is not a general way to navigate an infinite space of all imaginable materializations. It is a uniform way to navigate the sanctioned subset that a DaoDB implementation declares.

In other words, the API should answer questions like:

  • Which views are available?
  • Which views can be derived on demand?
  • Is a view relational, graph-like, tree-like, associative, sequential, or mixed?
  • What are the legal traversal steps for that view?
  • What are the historical semantics of traversing it under as-of or since?
  • What is materialized, and what is compiled or derived lazily?

This keeps the system honest. It preserves explicit causality and explicit capability. Nothing magical happens implicitly. A query can trigger on-demand view derivation, but only through a declared interpreter with known semantics.
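
A declared catalog could be as simple as a map that answers those questions as data. The shape below is a hypothetical sketch: the keys :shape, :status, :steps, and :history, the view names, and the helper functions are all assumptions, not an existing DaoDB API.

```clojure
;; Hypothetical capability catalog for one DaoDB implementation.
(def catalog
  {:avet     {:shape :relational  :status :materialized
              :steps #{:lookup :range}     :history #{:as-of}}
   :entities {:shape :associative :status :materialized
              :steps #{:entity :attr :ref} :history #{:as-of :since}}
   :follows  {:shape :graph       :status :derivable
              :steps #{:out :in}           :history #{:as-of}}})

(defn views-with
  "Names of the views whose declaration has value v under key k."
  [catalog k v]
  (set (for [[view-name decl] catalog
             :when (= v (get decl k))]
         view-name)))

(defn legal-step?
  "True if the named view declares the traversal step op."
  [catalog view-name op]
  (contains? (get-in catalog [view-name :steps]) op))

(views-with catalog :status :materialized) ;; => #{:avet :entities}
(views-with catalog :status :derivable)    ;; => #{:follows}
(legal-step? catalog :avet :ref)           ;; => false
```

Because the catalog is plain data, a planner can inspect it before committing to any access path, and a view that is not in the catalog simply does not exist as far as queries are concerned.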

From Querying to Query Compilation

The central question is not "What is DaoDB's one true query language?" but rather "What is DaoDB's query compiler pipeline?"

That shift matters. It moves us away from the idea of sending a finished query to a fixed engine. Instead, the user expresses intent, the system chooses an access strategy based on available views, and execution runs as a compiled plan.

The proposed split gives each library one job:

  • Meander as optimizer and front-end: inspect query shape, inspect available view capabilities, rewrite intent into a logical and physical plan.
  • Specter as instruction set and back-end: execute the compiled traversal path with low runtime overhead.

In this framing, Meander is the planner. Specter is the executor. Querying becomes compilation.

A Compiler Pipeline for DaoDB

The full pipeline can be sketched like this:

user intent
  -> logical rewrite
  -> capability-aware optimization
  -> physical path plan
  -> execution against materialized or derived views

Or, in DaoDB terms:

  1. The caller states what they want, not how to fetch it.
  2. The planner inspects the DaoDB implementation's declared views and capabilities.
  3. Rewrite rules choose an efficient access strategy.
  4. The strategy lowers into a Specter-like path IR.
  5. An executor runs that plan against a materialized view or triggers an on-demand derivation.

This gives us the right separation of concerns:

  • intent is declarative
  • optimization is explicit
  • execution is structural
  • view choice is capability-aware
  • different DaoDB implementations can share the compiler while exposing different physical views
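
The stages can be sketched end to end in plain Clojure. This is a toy under stated assumptions: it stands in for Meander and Specter with ordinary functions, knows a single strategy, and invents the stage names, the path IR ops (:view, :where, :keys), and the shape of db.

```clojure
;; Hypothetical database exposing one materialized view.
(def db
  {:views {:entities {1 {:user/email "alice@example.com"}
                      2 {:user/email "bob@example.com"}}}})

(defn rewrite
  "Logical rewrite: normalize the intent into one constraint map."
  [{[[_ attr value]] :where}]
  {:attr attr :value value})

(defn optimize
  "Capability-aware choice. This sketch only knows one strategy."
  [logical db]
  (if (contains? (:views db) :entities)
    (assoc logical :strategy :entity-scan)
    (throw (ex-info "No usable view" {:logical logical}))))

(defn lower
  "Lower the chosen strategy into a path IR: a vector of [op & args]."
  [{:keys [strategy attr value]}]
  (case strategy
    :entity-scan [[:view :entities] [:where attr value] [:keys]]))

(defn execute
  "Run the path IR against the database's views, one op at a time."
  [db path]
  (reduce (fn [pos [op a b]]
            (case op
              :view  (get-in db [:views a])
              :where (into {} (filter (fn [[_ ent]] (= b (get ent a)))) pos)
              :keys  (set (keys pos))))
          nil
          path))

(defn run [db intent]
  (->> (optimize (rewrite intent) db) lower (execute db)))

(run db {:find [:entity]
         :where [[:attr= :user/email "alice@example.com"]]})
;; => #{1}
```

The value of the split is that lower produces inspectable data. The plan can be logged, cached, or compared across implementations before anything touches a view.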

A Concrete Example

Suppose a caller wants to find a user by email. In the old mental model, that immediately sounded like Datalog over AVET. In the new model, it becomes an optimization problem.

{:find [:entity]
 :where [[:attr= :user/email "alice@example.com"]]}

The planner now asks what kind of DaoDB it is dealing with.

  • If the database exposes a covered AVET view, compile to a direct indexed lookup.
  • If it exposes an entity projection but no AVET, compile to a scan over the relevant entities.
  • If it exposes a specialized identity or content-address view, compile to that instead.
  • If no view exists yet, derive one on demand if the capability contract allows it.

Same intent. Different compiled access plans. Same family. Different database.
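
The planner's decision can be sketched as a function from declared views to a plan. The view names, the plan shapes, and especially the priority order among the branches are assumptions for illustration; a real planner would presumably weigh declared costs rather than follow a fixed order.

```clojure
(defn plan-lookup
  "Pick an access plan for an attribute-equality intent.
   `views` maps a view name to :materialized or :derivable."
  [views attr value]
  (cond
    (= :materialized (:avet views))
    {:plan :index-lookup :view :avet :key [attr value]}

    (= :materialized (:identity views))
    {:plan :index-lookup :view :identity :key [attr value]}

    (= :materialized (:entities views))
    {:plan :scan :view :entities :pred [attr value]}

    (= :derivable (:avet views))
    {:plan :derive-then-lookup :view :avet :key [attr value]}

    :else
    {:plan :unsupported}))

;; Three hypothetical DaoDB instances, one intent.
(def relational {:avet :materialized :entities :materialized})
(def document   {:entities :materialized})
(def lazy       {:avet :derivable})

(:plan (plan-lookup relational :user/email "alice@example.com"))
;; => :index-lookup
(:plan (plan-lookup document :user/email "alice@example.com"))
;; => :scan
(:plan (plan-lookup lazy :user/email "alice@example.com"))
;; => :derive-then-lookup
```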

This is why calling DaoDB a family of databases is not just rhetoric. It changes the meaning of query execution.

Why This Fits Datom.World

This redesign fits the existing philosophy unusually well.

  • Interpretation over abstraction: views are interpreters over the stream, not hidden objects pretending to be reality.
  • Restrictions are a feature: only declared views and declared traversals exist.
  • Explicit causality: historical semantics belong to the view contract, not to undocumented runtime behavior.
  • Everything is a stream: the stream remains canonical, while views are derived constructions, sometimes projections, sometimes higher-dimensional interpretations.
  • Do not assume graphs: graph structure becomes one possible materialized interpretation, not an implicit truth.

It also clarifies a category of implementation failures. Many bugs in index maintenance were not really "database corruption" in the deepest sense. They were materialized-view maintenance bugs: stale caches, missing historical rebuilds, wrong membership in secondary indexes. This reframing makes those failure modes easier to name and easier to isolate.
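
If indexes are views, a maintenance bug can be detected the way any stale view is detected: rebuild from the stream and compare. The sketch below is one hypothetical way to do that for an AVET-style view; the function names and sample data are assumptions.

```clojure
(require '[clojure.set :as set])

(defn avet-rebuild
  "Rebuild the AVET tuples from scratch by replaying the stream."
  [datoms]
  (into #{} (map (fn [[e a v t _]] [a v e t])) datoms))

(defn view-drift
  "Compare a maintained view with a fresh rebuild. Returns the tuples
   missing from, and the tuples extra in, the maintained view."
  [maintained datoms]
  (let [fresh (avet-rebuild datoms)]
    {:missing (set/difference fresh maintained)
     :extra   (set/difference maintained fresh)}))

;; Hypothetical case: the second transaction never reached the index.
(def stream
  [[1 :user/email "alice@example.com" 100 nil]
   [2 :user/email "bob@example.com"   101 nil]])

(def maintained #{[:user/email "alice@example.com" 1 100]})

(view-drift maintained stream)
;; => {:missing #{[:user/email "bob@example.com" 2 101]}, :extra #{}}
```

The stream is never in question in this check. Only the derived structure can be wrong, which is exactly what separates a view-maintenance bug from corruption of the history itself.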

What Stays the Same

The redesign does not discard what was already good.

  • The default DaoDB can still ship with Datomic-style covered indexes.
  • Datalog remains a powerful relational front-end where those views exist.
  • Snapshot semantics, transaction semantics, and bounded history remain central.
  • The in-memory DaoDB remains a useful reference implementation.

What changes is the level at which the system is defined. DaoDB stops being defined by one query language and one index family. It is instead defined by the shared stream substrate and the contract for producing and traversing views.

The New Center of Gravity

So the center of gravity moves:

  • from fixed indexes to optional materialized views
  • from one database to a family of databases
  • from mandatory Datalog to optional relational capability
  • from runtime query interpretation to compiled access plans
  • from hidden storage assumptions to explicit view capabilities

That is a healthier architecture. It is more general without becoming vague. It preserves the stream as the invariant and lets structure emerge through constrained interpretation.

DaoDB does not need one universal physical layout. It needs a canonical stream, a disciplined view contract, and a compiler that can lower intent into executable plans against whatever structures a particular DaoDB chooses to expose.

That is the redesign: not a retreat from DaoDB, but a sharper statement of what it has been trying to be all along.