Code as Entropy: Why Good Software Is Evolved, Not Designed
What makes code "good"? A friend once told me he doesn't know what good code is, but he knows bad code: it's the code other developers write. The joke is funny because it's true, and it's true because every developer has a different mental model. What's obvious to you is cryptic to me. What's elegant in your head is spaghetti in mine.
Most developers will list qualities like readability, maintainability, or elegance. But these are artifacts of good code, not descriptions of how to achieve it. They emerge when you follow the right evolutionary process, but they're subjective, dependent on those differing mental models.
The deeper, more fundamental question is: what makes code capable of change? Because requirements always change, and code that can't adapt is dead code.
It took me decades of working with bad code, and writing plenty of bad code myself, to distill the simple principles I present here. The answer lies in physics. Specifically, in entropy, dependency graphs, and evolution. Good code isn't designed, it evolves through a dance with entropy.
Here's the nuance: all code increases in entropy over time. This mirrors the second law of thermodynamics as an analogy: without sustained work, systems drift toward disorder. Biological cells bound complexity by locally decreasing entropy (maintaining internal order) while pushing entropy outside their boundaries into the environment*. Good code does the same: it minimizes entropy within each module while increasing entropy of the whole system.
This is the key insight of biology and code: distribute complexity outward (more modules, more namespaces) while keeping each module internally coherent. The system becomes more entropic, but each piece stays organized. This is what makes biology and code adaptable.
The Only Objective Standard: Malleability
Code quality is subjective. What's "clean" to one team is obtuse to another. Shared mental models help (Domain Driven Design attempts this), but teams still disagree.
There is, however, one objective standard: good code is code that adapts to changing requirements with low cost. Bad code is rigid: small changes cascade through the system, forcing rewrites instead of refinements.
This isn't about aesthetics. It's about adaptability and survival. Software that can't change dies when the world around it shifts. Malleability is the only fitness function that matters. (We'll look at how to measure this via dependency graphs later.)
High Cohesion and Loose Coupling: The Core Principle
Kent Beck spent 17 years learning to explain cohesion and coupling†. The principle is simple:
- High cohesion: things that change together should live together
- Loose coupling: things that change independently should be isolated from each other
When modules are tightly coupled, a change in one module ripples through all its dependents. You can't modify behavior without understanding the entire dependency graph. The cost of change becomes prohibitive.
When modules are loosely coupled but internally cohesive, changes are local. You modify what needs to change without touching anything else. The system adapts without rewriting.
This is measurable. Draw your dependency graph. If it looks like spaghetti, your code is rigid. If it looks like cleanly separated clusters with minimal cross-connections, your code can evolve.
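To make that measurement concrete, here's a toy sketch: represent the dependency graph as a map from namespace to the set of namespaces it requires, and count the edges. The namespace names and the metric itself are illustrative, not a standard measure.

```clojure
;; Toy coupling metric: total dependency edges in a namespace graph.
;; The namespace names and the metric are illustrative only.
(defn coupling-score [deps]
  (reduce + (map count (vals deps))))

;; Clean clusters: few cross-module edges.
(coupling-score '{myapp.orders    #{myapp.payments}
                  myapp.payments  #{}
                  myapp.inventory #{}})
;; => 1

;; Spaghetti: everything depends on everything else.
(coupling-score '{myapp.orders    #{myapp.payments myapp.inventory}
                  myapp.payments  #{myapp.orders myapp.inventory}
                  myapp.inventory #{myapp.orders myapp.payments}})
;; => 6
```

A real tool would also weight cycles and fan-in, but even this crude edge count separates clusters from tangles.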
Bounded Complexity: The Goal of Good Design
If cohesion and coupling are the principles, bounded complexity is the goal. Bounding complexity is what makes good code good.
Bounded complexity allows a programmer to quickly learn the code. When you bound the complexity of a module, you create a sanctuary for the developer, a finite space where they can hold all the necessary context in their head at once. You don't need to understand the entire universe to fix a bug in the validation module.
Crucially, bounded complexity means changes stay local. If the complexity of a module is strictly bounded, the impact of any change remains local. The ripple effects are contained, preventing the dreaded cascade of breakage that defines bad code.
Why Spaghetti Code is the Default State
By analogy to statistical mechanics: there are more ways to write tightly coupled code than loosely coupled code.
Every function you write creates potential dependencies. Without deliberate effort, those dependencies accumulate randomly. The probability space of "spaghetti architectures" vastly exceeds the probability space of "clean architectures."
This is entropy at work. In information theory, entropy measures how many bits of information you need to describe a state. A high-entropy system has many degrees of freedom, many possible configurations, requiring more information to specify exactly what is happening.
In code, this translates directly to cognitive load. High entropy means a programmer needs more bits in their head to understand the system. Tight coupling (spaghetti) has vastly more possible configurations than loose coupling (clean layers). Without deliberate effort, systems drift toward high entropy because there are simply more ways to be messy than organized.
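To ground the "bits in their head" reading, here's a small sketch of Shannon entropy in bits. The function is the standard textbook formula, nothing specific to code; it just makes concrete how more possible configurations demand more bits.

```clojure
;; Shannon entropy (in bits) of a discrete probability distribution:
;; the average number of bits needed to describe one outcome.
(defn shannon-entropy [probs]
  (- (reduce + (map (fn [p] (* p (/ (Math/log p) (Math/log 2))))
                    probs))))

;; A fair coin: one bit per outcome.
(shannon-entropy [0.5 0.5])
;; => 1.0

;; Eight equally likely configurations need three bits; more possible
;; configurations means more bits, which is the cognitive-load analogy.
(shannon-entropy (repeat 8 0.125))
```

A system with few, predictable configurations (clean layers) sits at the low-entropy end; one where anything can connect to anything sits at the high end.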
Additionally, premature naming constrains your design. When you name a module before you understand its boundaries, you lock in assumptions. Structure should guide naming, not the other way around.
Example: You create a user-service namespace early in the project. Later you realize some functions deal with authentication, others with profiles, others with permissions. But the name user-service doesn't reflect these distinctions. Now you face a choice:
- Refactor into auth, profiles, and permissions (correct but requires renaming and moving code)
- Keep everything in user-service (easy but hides the real structure)
The premature name created resistance to the right structure. If you'd started with core or users and let the cohesive clusters emerge, the structure would have guided you toward auth, profiles, and permissions naturally.
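Here's a hedged sketch of where that evolution lands. The function bodies are invented placeholders; only the namespace split, auth, profiles, and permissions emerging as separate concerns, is the point.

```clojure
;; Sketch of the emergent structure. Function bodies are invented
;; placeholders; only the namespace boundaries matter here.
(ns myapp.auth)
(defn valid-password? [pw]
  (>= (count pw) 8))          ;; toy stand-in for real auth logic

(ns myapp.profiles)
(defn display-name [user]
  (or (:nickname user) (:name user)))

(ns myapp.permissions)
(defn can? [user action]
  (contains? (:grants user #{}) action))

;; Callers now depend only on the concern they actually use:
(myapp.permissions/can? {:grants #{:read}} :read)
;; => true
```

A change to password rules now touches myapp.auth alone; the other two namespaces never hear about it.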
Frameworks like Ruby on Rails make this worse by forcing everything into MVC: models, views, controllers. This is solution-domain vocabulary, not problem-domain vocabulary. Your e-commerce app doesn't naturally think in "controllers", it thinks in orders, payments, inventory, shipping. But Rails forces you to organize by technical pattern instead of business concept.
The result: OrdersController, PaymentsController, InventoryController, where the suffix "Controller" adds no information and the real domain concepts (orders, payments, inventory) are buried in a solution-domain structure. You've let the framework name your modules before you understand the problem.
Entropy in Software: Distribute to Organize
Here's the concrete measure of entropy in a codebase: how many files must you open to understand or modify a feature?
If you need to read 10 scattered files, that's high entropy, you need a lot of context (information) to reconstruct what's happening. If one cohesive module contains everything, the information is localized and the entropy is low.
Two scales matter here: system entropy (how many global configurations exist) and local entropy (how much context one change requires). Good architecture accepts higher system entropy while reducing local entropy.
The counterintuitive consequence is: localized order requires global distribution.
As a system grows, a single "simple" module eventually becomes an entropic mess of internal configurations. The biological solution is to increase system entropy while decreasing module entropy.
You start with a single namespace. As it grows, you identify cohesive clusters and extract them into separate namespaces. This increases the entropy of the whole system (more namespaces, more connections) while decreasing the entropy of each module (each remains internally coherent and bounded).
This is evolution in action: the system becomes more complex (higher system entropy) so that the individual pieces can stay organized (lower module entropy).
;; Phase 1: Everything in one namespace
(ns myapp.core)
(defn fetch-user [id] ...)
(defn save-user [user] ...)
(defn validate-email [email] ...)
(defn send-notification [user] ...)
;; Phase 2: Cohesion emerges, extract clusters
(ns myapp.users)
(defn fetch [id] ...)
(defn save [user] ...)
(ns myapp.validation)
(defn email [email] ...)
(ns myapp.notifications)
(defn send [user] ...)

Notice: we didn't start with a grand architecture. We let the code tell us where the boundaries are. This is evolution, not intelligent design.
Evolution Requires Selection Pressure: Three Driving Forces
In biology, evolution works because natural selection kills unfit organisms. In software, evolution works because three selection pressures kill unfit code:
- Automated tests: code that breaks contracts doesn't survive. Tests verify correctness.
- Developer cognitive load: code that takes too long to understand or modify doesn't survive. Bounded complexity is the antidote to cognitive load. If it takes developers forever to learn the codebase, it's bad code because it makes change expensive. High cognitive burden is a selection pressure against that code.
- Business economics: code that's too expensive to maintain doesn't survive. Here's the irony: well-funded projects can sustain high entropy for a long time, allowing code rot to accumulate unchecked. Under-funded projects face immediate death if they don't adhere to good principles; they're forced to maintain low module entropy to survive. Economic constraint is a selection pressure that well-funded teams often lack until it's too late.
All three pressures select for the same property: malleability. Code with high cohesion and loose coupling passes all tests: it's easy to verify, easy to understand, and cheap to change.
Without these pressures, you have no feedback loop. Changes accumulate with no verification. Code drifts toward rigidity because there's nothing selecting for malleability.
So how do you create these pressures in practice? The Clojure REPL provides the ideal environment: low-cost experimentation lets you try variations quickly, observe their behavior, discover what works, and formalize the winning patterns as tests. This is adaptive evolution, not waterfall planning, you're creating selection pressure in real-time.
;; In the REPL: experiment
(comment
(let [user {:email "invalid"}]
(validate-email (:email user)))
;; => false, good
(let [user {:email "valid@example.com"}]
(validate-email (:email user)))
;; => true, good
;; Formalize as test
(deftest email-validation
(is (false? (validate-email "invalid")))
(is (true? (validate-email "valid@example.com"))))
)

These three pressures work together. Code that breaks tests doesn't survive. Code that's too cognitively expensive to modify doesn't survive. Code that's too expensive to maintain doesn't survive. Code that passes all three tests (correct, understandable, and cheap to evolve) survives and reproduces (gets reused).
The Brittleness Paradox: When Tests Become Technical Debt
Tests are selection pressure, but only when they guard promises. When they guard internals, they increase coupling and slow evolution.
Test the Contract, Not the Implementation
If Module A depends on Module B, the boundary between them is the contract. Test that boundary. Internal refactors should stay free as long as boundary behavior holds.
;; Stable: test the public contract
(deftest parse-api-contract
(is (= {:tokens ["foo" "bar"]}
(parse "foo bar"))))
;; Temporary scaffold during development
(deftest parse-token-scaffold
(is (= [:token "foo"]
(#'myapp.core/internal-parse-token "foo"))))

The goal is to keep dependencies explicit and cheap to change: contracts are stable, internals are flexible.
When to Test Internals (Temporarily)
Sometimes a module is complex enough that you need temporary confidence in internal behavior while building it. In those cases, write scaffolding tests to explore edge cases, then delete or demote them once the public boundary stabilizes.
If you must reach into a namespace to test a private function (defined with defn-), Clojure lets you access the var directly using the #' reader macro. This makes boundary crossing explicit without promoting internals to public API.
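A minimal sketch of that technique (the namespace and function names are illustrative):

```clojure
;; The helper is private to myapp.core; a test namespace reaches it
;; through the var itself rather than promoting it to public API.
(ns myapp.core)

(defn- internal-parse-token
  "Private: not part of the public contract."
  [s]
  [:token s])

(ns myapp.core-test)

;; A direct call to myapp.core/internal-parse-token would fail to
;; compile because the var is private; going through #' makes the
;; boundary crossing explicit and easy to grep for when you later
;; delete or demote the scaffolding test.
(#'myapp.core/internal-parse-token "foo")
;; => [:token "foo"]
```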
Exploration Knowledge: Keep It While True
REPL (comment ...) blocks are useful for local reasoning and handoff. Treat them as ephemeral: keep them while accurate, delete them when they become misleading.
Rule of thumb: automate what must remain true in CI, and remove stale exploratory notes aggressively.
The Core Promise
The measure of success isn't "how many tests do you have." It's this: how cheaply can you change without breaking promises?
Test the contracts between modules. Preserve exploration knowledge in comment blocks. Let internals evolve freely. The system stays malleable because tests guard promises, not implementation.
Practical Workflow: From Chaos to Structure
Here's the process distilled:
- Start messy: Write everything in one namespace. Don't prematurely organize.
- Experiment in the REPL: Use (comment ...) blocks to explore behavior and discover edge cases.
- Observe cohesion: As code grows, notice which functions change together.
- Extract clusters: Move cohesive groups into separate namespaces.
- Minimize coupling: Reduce dependencies between namespaces.
- Test interfaces: Write tests for the contracts between modules, not internal implementation details.
- Commit comment blocks: Preserve REPL experiments as runnable documentation.
- Delete ruthlessly: If a comment block becomes a lie, delete it immediately. Don't let stale knowledge rot.
- Iterate: Repeat as requirements change.
This isn't top-down design. It's bottom-up evolution guided by feedback loops. Structure emerges from observing the system's natural boundaries.
Dependency Graphs: Visualizing the Fitness Function
If malleability is the North Star, how do we measure our progress?
Here's a remarkable property of dependency graphs: you can assess architectural coupling health by visual inspection alone, without understanding a single line of code.
A good dependency graph looks layered: clean separation between modules, minimal cross-connections, a clear flow from low-level primitives to high-level features. You can see the structural quality at a glance.
A bad dependency graph looks like spaghetti: dense tangles, circular dependencies, everything connected to everything. The visual chaos reflects the architectural decay.
This is objective structural data. You don't need to read a line of code. The graph shape tells you:
- Spaghetti (densely tangled) → changes cascade everywhere, rigid code, high coupling
- Layered (clean strata) → changes are local, malleable code, loose coupling
- Clustered (islands with bridges) → cohesive modules with minimal inter-module dependencies
Tools like lein-ns-dep-graph or clj-depend (Clojure), madge (JavaScript), or pydeps (Python) can generate these visualizations. The graph doesn't lie: tight coupling is visible as tangled edges, loose coupling as clean separation.
Use this as your architectural fitness metric. If the graph is improving (clearer layers, fewer tangles), your code is evolving toward health. If it's degrading (more tangles, circular dependencies), you're accumulating technical debt.
The beauty: you can review a PR's structural impact without understanding the domain logic. Does it add more tangles or clean up the structure? The visual tells the structural story.
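As a sketch of what these tools do under the hood, you can extract a namespace's dependencies from its ns declaration. To keep the example self-contained it takes a quoted form; a real tool would read the first form of each source file (the helper name and example namespaces are invented).

```clojure
;; Extract required namespace symbols from a quoted ns declaration.
;; A real graph tool would read this form from each file on disk,
;; then render the adjacency as a graph.
(defn deps-of [ns-decl]
  (for [clause (filter seq? ns-decl)          ;; find (:require ...) clauses
        :when (= :require (first clause))
        libspec (rest clause)]                ;; each libspec names one dep
    (if (vector? libspec) (first libspec) libspec)))

(deps-of '(ns myapp.orders
            (:require [myapp.payments :as pay]
                      [myapp.inventory :as inv])))
;; => (myapp.payments myapp.inventory)
```

Run over every file, this yields exactly the namespace → dependencies map whose shape (layers vs. tangles) is the fitness metric.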
Why This Matters for Datom.World
Datom.World's architecture isn't just inspired by the entropy management philosophy, it reifies these principles as infrastructure. Where traditional codebases fight entropy through discipline and testing, datom.world embeds entropy management into the substrate itself. This realizes the core evolutionary strategy: by fixing the fundamental constraints, the invariant 5-tuple datom, we create the stable environment where behavior can evolve through stream history and open interpretation.
Quick orientation for readers new to datom.world: DaoDB is the database engine, Yin.VM is the evaluator, DaoFlow is the UI/runtime layer, and Shibi is the capability token system. The details differ, but each component interprets the same datom stream.
Immutable Streams: Entropy Distribution by Design
The central insight above is: distribute complexity outward (more modules) while keeping each piece internally coherent. Datom streams are the architectural embodiment of this principle:
- System entropy increases naturally: append-only streams grow over time, accumulating more datoms and more historical information. This is entropy distribution at its ultimate scale.
- Module-level entropy stays minimal: the datom (data atom) is the ultimate invariant of information in datom.world. Its fixed 5-tuple structure [e a v t m] eliminates structural variability. By restricting the physical substrate to five positions that never change, we reduce internal module entropy to the minimum while preserving system-level expressive power.
- Loose coupling via interpretation: producers and consumers interact purely through the invariant datom shape. Since there are no schema contracts at the physical layer, consumers extract different meanings from the same stream without coordination.
- Minimal API surface: inspired by Plan 9's "everything is a file," datom.world has streams as the only API with just two functions: read and write. Fewer operations mean fewer physical coupling points.
- Evolution becomes queryable: time-travel via streams means architectural evolution isn't hidden in git history, it's a Datalog query.
This inverts the typical problem. In traditional code, you fight entropy accumulation. In datom.world, entropy accumulation is the feature, the stream wants to grow, and that growth is harnessed rather than resisted.
Streams achieve ultimate loose coupling: producers and consumers know nothing about each other except through the invariant 5-tuple datom. The producer writes datoms without knowing who will read them or how they'll be interpreted. The consumer reads datoms without knowing who wrote them or why. They interact purely through the substrate, with no contracts beyond the datom shape itself at the physical layer.
This allows semantic contracts to be defined purely as queries. This is fundamentally different from traditional APIs where the producer (server) and consumer (client) are tightly bound through physical interface signatures. Change the record shape, and you break every consumer. In datom.world, interpretation creates meaning. The query is the promise, decoupled from how the data is stored.
- One consumer sees user activity events for analytics
- Another sees the same datoms as an audit log
- A third interprets them as training data for ML models
- All from the same stream, no coordination required
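As a hedged sketch of that multiplicity (the attribute names and stream contents are invented; datom.world's real encoding will differ), here are two consumers extracting different meaning from one stream of 5-tuple datoms:

```clojure
;; One stream of [e a v t m] datoms; attribute names are illustrative.
(def stream
  [[1 :user/login "alice"    100 nil]
   [1 :user/click "checkout" 101 nil]
   [2 :user/login "bob"      102 nil]])

(defn analytics-view [datoms]
  ;; Interpretation 1: event counts per attribute.
  (frequencies (map second datoms)))

(defn audit-view [datoms]
  ;; Interpretation 2: a chronological log, ordered by the t slot.
  (sort-by #(nth % 3) datoms))

(analytics-view stream)
;; => {:user/login 2, :user/click 1}
```

Neither view required the producer to know it exists; adding a third interpreter is just another function over the same datoms.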
This pattern is familiar in Clojure: the reader interprets character streams as data, then evaluation interprets data as executable behavior. Datom.world extends the same layered interpretation pattern across the entire stack, from language to storage to runtime.
Streams of datoms are the universal substrate. DaoDB interprets the stream as indexes. Yin.VM interprets it as execution. DaoFlow interprets it as UI. AI agents interpret it as training data. Interpretation layers extract meaning, and no single consumer owns the semantics.
Like Plan 9's uniform file interface that made diverse resources composable through a tiny API (open, read, write, close), datom.world's two-function stream interface makes everything composable while minimizing the surface where tight coupling can form. But streams go further: they decouple not just the mechanism of access but the meaning extracted from it.
In evolutionary terms, this makes adaptation cheap: new interpreters can compete on usefulness without forcing rewrites in producers.
Continuations: Loose Coupling as a Runtime Guarantee
We argued earlier that "things that change independently should be isolated." Continuations enforce this architecturally, not culturally:
- No shared mutable state: continuations can share data through streams, but they cannot share mutable state. All communication happens through immutable datoms, making dependencies explicit and side effects visible.
- Mobile computation: Yin.VM's CESK machine unifies functions, closures, continuations, and eval into a single evaluator. This means continuations are first-class values that can migrate between nodes without coupling to the host environment; computation moves to data, not the other way around.
- Capability-based security: access is controlled through Shibi, datom.world's capability token system. If a continuation holds a Shibi token for a stream, it can access that stream; otherwise it cannot. Stream access = dependency, and the coupling is visible (you can inspect which Shibi tokens a continuation holds) and controllable (tokens can be granted, revoked, or delegated).
Continuations can share data, they just do it through immutable streams rather than shared memory. This makes coupling explicit: if two continuations need to coordinate, they exchange datoms through a known stream. You can't accidentally create tight coupling when all communication must flow through data, not mutable references. The architecture enforces the dependency hygiene that most codebases achieve only through code review discipline.
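A heavily hedged sketch of coupling-through-data: here the stream is modeled as an append-only log in an atom, which is nothing like datom.world's real streams, Shibi tokens, or continuations. The point it illustrates is only that producer and consumer touch immutable datoms, never each other.

```clojure
;; Toy append-only log standing in for a stream. Datoms are immutable
;; values; the only mutation anywhere is appending to the log.
(def stream (atom []))

(defn write! [datom]
  (swap! stream conj datom))

;; Producer appends without knowing who will read:
(write! [1 :order/placed {:sku "A1"} 100 nil])
(write! [2 :order/placed {:sku "B2"} 101 nil])

;; Consumer interprets without knowing who wrote:
(count (filter #(= :order/placed (second %)) @stream))
;; => 2
```

Because coordination can only happen through datoms in a known stream, any dependency between the two sides is visible in the data itself.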
This is runtime selection pressure: continuations that preserve explicit boundaries remain composable, while hidden coupling paths are structurally harder to create.
Single Source of Truth: Solving the Entropy Crisis
Earlier we described spaghetti code as having "too many possible configurations." The four-way translation problem (database ↔ API ↔ backend ↔ frontend) is exactly this entropy crisis:
- High entropy: need to track state across four independent representations
- Tight coupling: a change in one layer ripples through all four, requiring synchronized updates
- Coordination overhead: every change demands translation code, validation, and keeping representations coherent
Datom.world's solution: one canonical representation as immutable streams. The database is the API is the backend state is the frontend model, just different interpretations of the same stream. This collapses four sources of entropy into one, radically reducing system complexity.
That collapse changes how systems evolve: adaptation happens by changing interpreters and queries, not by synchronizing four drifting representations.
Queryable Evolution: Understanding Emergence as Data
The evolutionary workflow described above assumes you can refactor freely because tests provide safety. But what if you need to understand how the code evolved to its current state? Traditional systems require git archaeology, reading commit messages, diffing files, reconstructing mental models.
Datom.world makes evolution first-class data:
- Time-travel via streams: query "what was the dependency graph at transaction T?" directly
- Datalog over code: ask "which functions changed together?" to discover cohesion clusters empirically
- Replayable evolution: understanding emergence isn't archaeology, it's running a query
This principle, "observe cohesion, extract clusters", becomes mechanized. You don't manually notice which functions change together, you query the stream history and discover it.
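As a toy sketch of that query (attribute names invented; a real system would express this as Datalog over the stream): entities touched in the same transaction co-changed, so grouping by the t slot surfaces cohesion clusters.

```clojure
;; Invented change history: each datom records that a function's code
;; changed in some transaction t.
(def history
  [[:fn/save     :code/changed true 10 nil]
   [:fn/fetch    :code/changed true 10 nil]
   [:fn/validate :code/changed true 11 nil]])

(defn co-change-clusters [datoms]
  (->> datoms
       (group-by #(nth % 3))   ;; group entities by transaction t
       (sort-by key)           ;; chronological order
       (map (fn [[_t entries]] (set (map first entries))))))

(co-change-clusters history)
;; => (#{:fn/save :fn/fetch} #{:fn/validate})
```

Functions that keep landing in the same cluster are candidates for living in the same namespace.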
Schema-on-Read: Decoupling Physics from Semantics
Our fitness function is: "how cheaply can you change without breaking promises?"
Datom.world's schema-on-read takes this to its limit by decoupling the physical substrate from the semantic interpretation.
- No physical contracts: the datom shape [e a v t m] is the only physical invariant. There are no table schemas or fixed records to coordinate. Adding new data costs zero.
- Semantic contracts via queries: the "promise" in datom.world is the query interface. You can reinterpret old datoms in new ways without migrating the database.
- Additive evolution: existing interpreters continue working while new ones extract new meaning from the same stream. Functionality grows without coordination.
This is evolutionary adaptation in data form: fitness improves by adding better interpretations over time, not by rewriting historical facts.
Traditional databases force you to migrate the physical world forward (ALTER TABLE). Datom.world lets you reinterpret the past without rewriting it.
This is evolution through interpretation: your analytics interpreter from 2024 still works in 2025, extracting the metrics it always did. But a new ML interpreter can now extract training features from the same historical stream that the analytics interpreter is reading. Both coexist, both work, neither requires the other to change. Functionality grows without physical coordination. Here, the primary objective, malleability, extends beyond "can we change without breaking tests?" to "can we extract new meaning from old data without touching it?"
The Meta-Point: Principles as Infrastructure
Most systems treat entropy management as a developer discipline, write clean code, follow SOLID principles, refactor diligently. Datom.world treats it as substrate design:
- Entropy distribution isn't a guideline, it's how streams work globally
- Loose coupling isn't a code review checklist, it's what continuations and interpretations enforce
- Single source of truth isn't an architecture decision, it's the only option
- Malleability isn't just an aspirational quality metric, it's the built-in operational property of schema-on-read
We still apply the evolutionary workflow to our own codebase: start simple, let structure emerge through REPL exploration, test interfaces not implementations, preserve knowledge in comment blocks, refactor toward malleability. But the platform itself is built to make these practices natural rather than effortful. The system resists decay because the architecture wants to push complexity outward and keep internal order high.
This is the deeper point: good software principles shouldn't be things developers remember to do. They should be things the infrastructure makes easy.
And here's the full-circle irony: datom.world itself is a self-funded solo developer project. The economic selection pressure we described earlier, where under-funded projects are forced to maintain low module entropy to survive, applies directly to this codebase. These principles aren't just philosophical for datom.world, they're survival requirements. Without the luxury of a large team or abundant funding to sustain code rot, the architecture must keep internal complexity bounded by following these principles. Datom.world exists because it practices what it preaches.
Conclusion: Code That Breathes
Good code isn't a static artifact. It's a living system that evolves in response to changing requirements. The measure of goodness is malleability: can it adapt with low cost?
High cohesion and loose coupling provide the structure. Entropy guides the distribution. Three selection pressures (automated tests, developer cognitive load, and economic constraints) kill unfit code, forcing evolution toward malleability. Interface tests guard promises, not implementations. Comment blocks preserve ephemeral knowledge, valuable while true, deleted when they lie. Dependency graphs reveal quality at a glance: layered structure vs. tangled spaghetti, visible without understanding the code.
The result is code that breathes: expanding when needed, contracting when simplified, always adapting to the environment around it.
This is software as evolution, not intelligent design. And evolution, given enough time and the right pressures, produces remarkably resilient systems.
Further reading:
- Kent Beck: Software Design is an Exercise in Human Relationships (InfoQ): he spent 17 years learning how to explain cohesion in software design
- Simple Made Easy: Rich Hickey's foundational talk on simplicity and complexity
- A New Thermodynamics Theory of the Origin of Life (Quanta Magazine): on how systems maintain low entropy by increasing entropy in their environment
- What is Good Code?: the statistical mechanics argument and dependency graphs