Beyond LSP: Queryable AST as the Universal Language Server

The LSP Model: Text‑Based RPC

The Language Server Protocol, introduced by Microsoft in 2016, brought a welcome simplification: a single protocol for many IDEs to talk to many languages. Instead of each IDE writing a Python plugin, a Java plugin, a C++ plugin, they could all speak LSP to language‑specific servers.

LSP is a JSON‑RPC protocol. The IDE sends requests like:

{"jsonrpc": "2.0",
 "id": 1,
 "method": "textDocument/definition",
 "params": {
   "textDocument": {"uri": "file:///src/main.py"},
   "position": {"line": 10, "character": 5}
 }}

The language server parses the file, resolves the symbol at that position, and replies with a location: a URI plus a line/character range, often in another file. The IDE then jumps there.
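To make the wire format concrete, here is a minimal Python sketch of how such a request is framed. LSP's base protocol prefixes each JSON-RPC 2.0 message with a Content-Length header; the helper names encode_lsp_message and decode_lsp_message are made up for this illustration and do not come from any LSP library.

```python
import json

def encode_lsp_message(method: str, params: dict, request_id: int) -> bytes:
    """Frame a JSON-RPC 2.0 request the way LSP's base protocol expects."""
    body = json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    ).encode("utf-8")
    # Content-Length counts bytes of the body; a blank line ends the header.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body

def decode_lsp_message(raw: bytes) -> dict:
    """Split one framed message back into its JSON payload."""
    header, _, rest = raw.partition(b"\r\n\r\n")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    return json.loads(rest[:length].decode("utf-8"))

# The same go-to-definition request as above, ready to write to the server.
request = encode_lsp_message(
    "textDocument/definition",
    {"textDocument": {"uri": "file:///src/main.py"},
     "position": {"line": 10, "character": 5}},
    request_id=1,
)
```

Note that everything the server learns about the cursor is a URI and two integers. That is the "code is text" assumption in its purest form.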

This works. It's better than the pre‑LSP chaos. But LSP has fundamental limitations that stem from its core assumptions:

  • Code is text: operations are line/column positions, not semantic nodes
  • Per‑language servers: each language needs its own parser, its own index, its own cache
  • Request‑response: features are RPC calls, not queries over a shared database
  • No cross‑language awareness: a Python server doesn't know about the Java code calling it
  • No history: LSP sees the current snapshot, not how the code evolved

LSP treats the symptom (many IDE‑language combinations) but not the disease: code is not text.

Yin.vm's Inversion: AST as Canonical, Text as View

As explored in "When the IDE Edits AST, Not Text", Yin.vm makes the Universal AST the canonical representation of code. Text is just a rendering: a view. The AST is stored as datom streams in a Datalog database (DaoDB).

This changes everything. Instead of an IDE asking a language server "where is the definition?", the IDE runs a Datalog query:

[:find ?def-node
 :where
 [?ref-node :ast/type :variable-reference]
 [?ref-node :ast/name "calculateTotal"]
 [?def-node :ast/type :function-definition]
 [?def-node :ast/name "calculateTotal"]]

The query returns the AST node ID of the definition. The IDE can then render that node in any syntax (Python, Java, Clojure) or jump directly to its location in the source view.
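For readers who haven't used Datalog, the following Python sketch shows roughly what evaluating such a query involves: each where-clause is a pattern unified against (entity, attribute, value) datoms, and shared variables join the patterns. It is a toy nested-loop matcher over an in-memory list, not DaoDB's engine. The function names are invented for the example, and the shared ?name variable is a small variation on the query above that ties the reference to its definition.

```python
def is_var(term):
    """Query variables are strings that start with '?', as in Datalog."""
    return isinstance(term, str) and term.startswith("?")

def match_pattern(pattern, datom, binding):
    """Unify one (entity, attribute, value) pattern with one datom."""
    extended = dict(binding)
    for term, actual in zip(pattern, datom):
        if is_var(term):
            if extended.setdefault(term, actual) != actual:
                return None  # variable already bound to a different value
        elif term != actual:
            return None      # constant in the pattern does not match
    return extended

def query(datoms, find, where):
    """Naive join: extend every binding with every datom, pattern by pattern."""
    bindings = [{}]
    for pattern in where:
        bindings = [
            extended
            for binding in bindings
            for datom in datoms
            if (extended := match_pattern(pattern, datom, binding)) is not None
        ]
    return {tuple(binding[var] for var in find) for binding in bindings}

def go_to_definition(datoms, ref_id):
    """Resolve the reference node `ref_id` to function definitions by name."""
    rows = query(
        datoms,
        find=["?def"],
        where=[
            (ref_id, ":ast/type", ":variable-reference"),
            (ref_id, ":ast/name", "?name"),
            ("?def", ":ast/type", ":function-definition"),
            ("?def", ":ast/name", "?name"),
        ],
    )
    return {def_id for (def_id,) in rows}

# Hypothetical AST datoms: node 1 defines calculateTotal, node 2 references it.
datoms = [
    (1, ":ast/type", ":function-definition"),
    (1, ":ast/name", "calculateTotal"),
    (2, ":ast/type", ":variable-reference"),
    (2, ":ast/name", "calculateTotal"),
    (3, ":ast/type", ":variable-reference"),
    (3, ":ast/name", "total"),
]
```

Calling go_to_definition(datoms, 2) yields the node ID of the definition. The IDE decides separately how to render or locate that node.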

This is not a faster LSP. It's a different architectural layer:

| Aspect | LSP | Yin.vm AST Database |
|---|---|---|
| Data Model | Text files (lines/columns) | Datom streams (AST nodes) |
| Protocol | JSON‑RPC (request‑response) | Datalog queries (declarative) |
| Server | Per‑language (Python, Java, …) | Single DaoDB (all languages) |
| Cross‑Language | None (separate servers) | Native (same AST representation) |
| History | None (current snapshot only) | Full versioning (every edit is a transaction) |
| Collaboration | None (per‑editor state) | Built‑in (datom streams merge) |
| Tooling | Language‑specific plugins | Universal queries (work on any language) |

From RPC Calls to Datalog Queries

Let's translate common IDE features from LSP requests to Datalog queries:

1. Go to Definition

LSP: textDocument/definition with line/column.

Yin.vm:

;; Find the function definition a variable references
[:find ?def
 :where
 [?ref :ast/type :variable-reference]
 [?ref :ast/name "calculateTotal"]
 [?def :ast/type :function-definition]
 [?def :ast/name "calculateTotal"]]

2. Find All References

LSP: textDocument/references.

Yin.vm:

;; Find all variable references to this function
[:find ?ref
 :where
 [?def :ast/type :function-definition]
 [?def :ast/name "calculateTotal"]
 [?ref :ast/type :variable-reference]
 [?ref :ast/name "calculateTotal"]]

3. Rename Symbol

LSP: textDocument/rename, returns text edits.

Yin.vm: A transaction that updates all relevant datoms:

;; Find every node that carries the old name
[:find ?node
 :where
 [?node :ast/name "oldName"]]

;; Update them (simplified); `nodes` holds the ?node results of the query above
(doseq [node nodes]
  (transact! [node :ast/name "newName"]))

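Because the AST is canonical, renaming is semantic, not textual. A variable named user is a different kind of node from the string literal "user", and the query tells them apart by node type and attribute rather than by matching characters. A real rename would also restrict the match to the binding being renamed, so that unrelated symbols sharing the name are left alone.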
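Here is a small Python sketch of that idea, under stated assumptions: datoms are plain (entity, attribute, value) tuples, a string literal keeps its text under a hypothetical :ast/value attribute, and the :db/add / :db/retract operation names are borrowed from Datomic-style conventions rather than taken from DaoDB.

```python
# Node types whose :ast/name is a symbol that a rename may touch (assumed set).
RENAMEABLE = {":function-definition", ":variable-reference"}

def rename_tx(datoms, old, new):
    """Build one transaction that renames every symbol node called `old`."""
    types = {e: v for e, a, v in datoms if a == ":ast/type"}
    tx = []
    for e, a, v in sorted(datoms):
        if a == ":ast/name" and v == old and types.get(e) in RENAMEABLE:
            tx.append((":db/retract", e, a, old))
            tx.append((":db/add", e, a, new))
    return tx

def apply_tx(datoms, tx):
    """Return a new datom set with the transaction applied."""
    result = set(datoms)
    for op, e, a, v in tx:
        if op == ":db/retract":
            result.discard((e, a, v))
        else:
            result.add((e, a, v))
    return result

# Two references to the variable `user`, plus a string literal "user".
datoms = {
    (1, ":ast/type", ":variable-reference"),
    (1, ":ast/name", "user"),
    (2, ":ast/type", ":string-literal"),
    (2, ":ast/value", "user"),
    (3, ":ast/type", ":variable-reference"),
    (3, ":ast/name", "user"),
}

renamed = apply_tx(datoms, rename_tx(datoms, "user", "account"))
```

The string literal survives untouched because nothing in the transaction is derived from its characters. A text-based search and replace would have to rediscover that distinction with a parser.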

4. Hover (Show Type Information)

LSP: textDocument/hover, returns markdown.

Yin.vm: Query type certainty metadata on the AST node:

[:find ?type ?certainty
 :where
 [?node :ast/type :variable-reference]
 [?node :ast/name "total"]
 [?node :ast/type-annotation ?type]
 [?node :ast/type-certainty ?certainty]]

The type system is unified as certainty levels (:static, :dynamic, :unknown), as explained in "Yin.vm: Chinese Characters for Programming Languages".

5. Code Lens (References Count)

LSP: textDocument/codeLens.

Yin.vm: Just count the results of the references query.

[:find (count ?ref)
 :where
 [?def :ast/type :function-definition]
 [?def :ast/name "calculateTotal"]
 [?ref :ast/type :variable-reference]
 [?ref :ast/name "calculateTotal"]]

Every IDE feature becomes a query over the AST database. No language‑specific plugins. No RPC overhead.

Cross‑Language Tooling: The Killer Feature

LSP servers are siloed. A Python server knows nothing about Java. A Java server knows nothing about C++. This mirrors how we write software: in separate language silos, with fragile glue code.

Yin.vm's Universal AST breaks these silos. Because all languages compile to the same AST representation, you can write queries that span languages:

;; Find all functions that read from a network socket
;; and write to a file, regardless of language
[:find ?fn ?lang
 :where
 [?fn :ast/type :function-definition]
 [?fn :ast/language ?lang]
 [?fn :ast/body ?body]
 [?body :ast/contains ?socket-read]
 [?socket-read :ast/type :socket-read]
 [?body :ast/contains ?file-write]
 [?file-write :ast/type :file-write]]

Or refactor across language boundaries:

;; Replace deprecated API calls in Python, Java, and C++
[:find ?call ?file ?lang
 :where
 [?call :ast/type :function-call]
 [?call :ast/function-name "oldDeprecatedAPI"]
 [?call :ast/location ?file]
 [?file :file/language ?lang]]

;; Update all to "newRecommendedAPI" (`calls` holds the ?call results above)
(doseq [call calls]
  (transact! [call :ast/function-name "newRecommendedAPI"]))

LSP offers no standard way to do this. Each language server owns its own index, so there's no shared database to query across languages.
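As a rough Python illustration of why a shared store matters, the sketch below finds calls to a deprecated API across languages in a single pass. It assumes the attribute names used in the query above, and hypothetical entity IDs where files are entities 100 and 101 and calls are entities 1 to 4. None of this is DaoDB's real schema.

```python
from collections import defaultdict

def attr_map(datoms, attribute):
    """Map entity -> value for a single-valued attribute."""
    return {e: v for e, a, v in datoms if a == attribute}

def deprecated_calls_by_language(datoms, api_name):
    """Group call-site node IDs by the language of the file they live in."""
    types = attr_map(datoms, ":ast/type")
    names = attr_map(datoms, ":ast/function-name")
    files = attr_map(datoms, ":ast/location")
    langs = attr_map(datoms, ":file/language")
    found = defaultdict(list)
    for call, name in names.items():
        if types.get(call) == ":function-call" and name == api_name:
            found[langs.get(files.get(call))].append(call)
    return {lang: sorted(ids) for lang, ids in found.items()}

datoms = [
    (100, ":file/language", ":python"),
    (101, ":file/language", ":java"),
    (1, ":ast/type", ":function-call"),
    (1, ":ast/function-name", "oldDeprecatedAPI"),
    (1, ":ast/location", 100),
    (2, ":ast/type", ":function-call"),
    (2, ":ast/function-name", "oldDeprecatedAPI"),
    (2, ":ast/location", 101),
    (3, ":ast/type", ":function-call"),
    (3, ":ast/function-name", "otherAPI"),
    (3, ":ast/location", 100),
    (4, ":ast/type", ":function-call"),
    (4, ":ast/function-name", "oldDeprecatedAPI"),
    (4, ":ast/location", 100),
]

hits = deprecated_calls_by_language(datoms, "oldDeprecatedAPI")
```

The function never branches on the language. Language is just another attribute to group by, which is the point of a shared representation.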

Time‑Travel and Collaborative Editing Built‑In

LSP sees only the current snapshot. Yin.vm's datom streams preserve every edit as a transaction. This enables features LSP can't touch:

1. Time‑Travel Debugging for Code

Query the AST as of any transaction:

;; What did this function look like 3 hours ago?
[:find ?node ?body
 :where
 [?node :ast/type :function-definition]
 [?node :ast/name "calculateTotal"]
 [?node :ast/body ?body]
 :at-tx 1500]   ;; transaction ID 1500
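One way to picture an as-of query is as a replay of the transaction log up to the chosen transaction. The Python sketch below assumes each log entry is (entity, attribute, value, tx, added), where added is False for a retraction. That layout follows common datom-store conventions and is an assumption about DaoDB, not a description of it.

```python
def as_of(log, tx):
    """Rebuild the set of (e, a, v) facts that held after transaction `tx`."""
    state = set()
    for e, a, v, t, added in sorted(log, key=lambda d: d[3]):
        if t > tx:
            break  # log is ordered by tx, so nothing later applies
        if added:
            state.add((e, a, v))
        else:
            state.discard((e, a, v))
    return state

def value_at(log, entity, attribute, tx):
    """Look up one attribute of one entity as of transaction `tx`."""
    for e, a, v in as_of(log, tx):
        if e == entity and a == attribute:
            return v
    return None

# The body of calculateTotal is replaced in transaction 1800.
log = [
    (1, ":ast/type", ":function-definition", 1000, True),
    (1, ":ast/name", "calculateTotal", 1000, True),
    (1, ":ast/body", "body-v1", 1000, True),
    (1, ":ast/body", "body-v1", 1800, False),
    (1, ":ast/body", "body-v2", 1800, True),
]
```

Here value_at(log, 1, ":ast/body", 1500) sees the old body and value_at(log, 1, ":ast/body", 2000) sees the new one. A real store would answer from indexes instead of replaying the log.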

2. Semantic Diff

Not just which lines changed, but which semantic structures changed:

;; What functions changed between tx 1000 and 2000?
[:find ?fn ?old-body ?new-body
 :where
 [?fn :ast/type :function-definition]
 [?fn :ast/body ?old-body :at-tx 1000]
 [?fn :ast/body ?new-body :at-tx 2000]
 [(not= ?old-body ?new-body)]]
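The same comparison can be sketched in Python by taking two snapshots of one attribute and keeping the entities whose values differ. The log layout (entity, attribute, value, tx, added) is an assumption, and for brevity the sketch treats every entity with an :ast/body as a function instead of also checking its :ast/type.

```python
def snapshot(log, tx, attribute):
    """Map entity -> value of `attribute` as of transaction `tx`."""
    values = {}
    for e, a, v, t, added in sorted(log, key=lambda d: d[3]):
        if t > tx or a != attribute:
            continue
        if added:
            values[e] = v
        elif values.get(e) == v:
            del values[e]  # retraction of the value currently held
    return values

def changed_bodies(log, tx_old, tx_new):
    """Entities whose :ast/body differs between the two transactions."""
    old = snapshot(log, tx_old, ":ast/body")
    new = snapshot(log, tx_new, ":ast/body")
    return {
        e: (old[e], new[e])
        for e in old.keys() & new.keys()
        if old[e] != new[e]
    }

# Function 1 is edited in tx 1800; function 2 is never touched.
log = [
    (2, ":ast/body", "body-x", 900, True),
    (1, ":ast/body", "body-v1", 1000, True),
    (1, ":ast/body", "body-v1", 1800, False),
    (1, ":ast/body", "body-v2", 1800, True),
]

diff = changed_bodies(log, 1000, 2000)
```

The result names the function that changed and gives both versions of its body, regardless of how many lines of rendered text the edit happened to touch.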

3. Collaborative Editing as Datom Merging

Two developers edit the same function. With LSP, you'd need operational transforms or CRDTs at the text level. With Yin.vm, edits are datom transactions that merge semantically:

;; Developer A adds a parameter
[fn-1 :ast/params [param-1 param-2 param-3] tx-1001]

;; Developer B renames the function
[fn-1 :ast/name "calculateTotal" tx-1002]

;; Both transactions succeed: different parts of the AST

Conflict resolution happens at the semantic level, not the text level: two edits collide only when they touch the same attribute of the same node. As shown in the earlier blog post, this makes collaboration natural.
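A minimal Python sketch of one possible merge policy follows. The post does not spell out DaoDB's actual merge rules, so treat this as an assumption: two transactions merge cleanly unless they assign different values to the same (entity, attribute) pair.

```python
def merge(tx_a, tx_b):
    """Merge two transactions of (entity, attribute, value) writes.

    Returns (merged_writes, conflicts). On conflict, merged_writes is None
    and conflicts lists the (entity, attribute) pairs both sides changed.
    """
    writes_a = {(e, a): v for e, a, v in tx_a}
    writes_b = {(e, a): v for e, a, v in tx_b}
    conflicts = sorted(
        key
        for key in writes_a.keys() & writes_b.keys()
        if writes_a[key] != writes_b[key]
    )
    if conflicts:
        return None, conflicts
    return sorted({**writes_a, **writes_b}.items()), []

# Developer A adds a parameter; Developer B renames the function.
tx_a = [("fn-1", ":ast/params", ("param-1", "param-2", "param-3"))]
tx_b = [("fn-1", ":ast/name", "calculateTotal")]
merged, conflicts = merge(tx_a, tx_b)

# Two competing renames of the same function do conflict.
tx_c = [("fn-1", ":ast/name", "computeTotal")]
rejected, rename_conflicts = merge(tx_b, tx_c)
```

The first merge succeeds because the edits touch different attributes of fn-1. The second is flagged on ("fn-1", ":ast/name"), where a human or a policy has to choose a name.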

No More Language Servers

Today, each language needs:

  • A parser (often several, for different IDE features)
  • An index (in‑memory or on‑disk)
  • A cache (for performance)
  • A language‑specific plugin (for edge cases)

With Yin.vm, each language still needs one parser (to convert source text to Universal AST), but all languages share one database. The parser's output is datoms, which go into DaoDB. From there, every IDE feature works the same way, regardless of language.

This reduces complexity dramatically:

| Component | LSP World | Yin.vm World |
|---|---|---|
| Parser | Per language, often multiple (syntax highlighting, AST, refactoring) | One per language, outputting Universal AST datoms |
| Index | Per language server, in‑memory, rebuilt on change | Single DaoDB with persistent EAVT/AEVT/AVET/VAET indexes |
| Query Engine | Language‑specific heuristics | Datalog (same for all languages) |
| Cache | Memory‑resident, lost on restart | Persistent datom store, always available |
| Cross‑Language | None (separate servers) | Native (same database) |
| Versioning | None (current snapshot) | Built‑in (transaction log) |
| Collaboration | None (per‑editor) | Built‑in (stream merging) |

The language server disappears. In its place: a database of AST datoms and a query engine.

The IDE as Database Client

With Yin.vm, the IDE doesn't talk to a language server. It talks to DaoDB. It can:

  • Subscribe to datom streams (live updates as code changes)
  • Run Datalog queries (for navigation, refactoring, analysis)
  • Submit transactions (for edits)
  • Open historical views (time‑travel)
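The client-side shape of this can be sketched in a few lines of Python. DatomStore below is a hypothetical in-memory stand-in, not DaoDB's API: it only shows how submitting transactions and subscribing to a datom stream fit together from the IDE's point of view.

```python
class DatomStore:
    """Toy append-only store with per-attribute subscriptions."""

    def __init__(self):
        self.log = []        # every datom ever transacted, as (e, a, v, tx)
        self.next_tx = 1
        self.listeners = []  # (attribute, callback) pairs

    def subscribe(self, attribute, callback):
        """Call `callback(e, a, v, tx)` for each new datom with `attribute`."""
        self.listeners.append((attribute, callback))

    def transact(self, datoms):
        """Append (e, a, v) datoms under a fresh transaction ID."""
        tx = self.next_tx
        self.next_tx += 1
        for e, a, v in datoms:
            self.log.append((e, a, v, tx))
            for attribute, callback in self.listeners:
                if attribute == a:
                    callback(e, a, v, tx)
        return tx

# An IDE pane that refreshes its outline whenever a name changes.
store = DatomStore()
outline_updates = []
store.subscribe(
    ":ast/name", lambda e, a, v, tx: outline_updates.append((e, v, tx))
)

store.transact([
    (1, ":ast/type", ":function-definition"),
    (1, ":ast/name", "calculateTotal"),
])
store.transact([(1, ":ast/name", "computeTotal")])
```

The outline pane receives both name changes with their transaction IDs and never sees the :ast/type datom it didn't ask for.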

This architecture enables entirely new kinds of IDE features:

Semantic Search

Not "find string 'foo'", but "find functions that take a network socket and return a sorted list".

Automated Refactoring Across Languages

Convert Python dictionaries to Java classes, updating all call sites in both languages.

Live Type Migration

Change a type from dynamic to static and see all affected code across the entire polyglot codebase.

Architecture Enforcement

Define rules like "UI components must not import database modules" and query for violations in real‑time.
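As a sketch of what such a rule could look like, the Python below checks a layering constraint over module datoms. The :module/layer and :module/imports attributes are invented for this example; the post doesn't define how Yin.vm models modules.

```python
def layer_violations(datoms, from_layer, to_layer):
    """Return (importer, imported) pairs that cross a forbidden boundary."""
    layers = {e: v for e, a, v in datoms if a == ":module/layer"}
    return sorted(
        (e, v)
        for e, a, v in datoms
        if a == ":module/imports"
        and layers.get(e) == from_layer
        and layers.get(v) == to_layer
    )

# Module 1 is UI, module 2 is database, module 3 is a service layer.
datoms = [
    (1, ":module/layer", ":ui"),
    (2, ":module/layer", ":database"),
    (3, ":module/layer", ":service"),
    (1, ":module/imports", 3),  # allowed: UI -> service
    (1, ":module/imports", 2),  # forbidden: UI -> database
    (3, ":module/imports", 2),  # allowed: service -> database
]

violations = layer_violations(datoms, ":ui", ":database")
```

Run against a live datom stream, the same check could flag the offending import as soon as it's typed.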

These aren't fantasies. They're straightforward Datalog queries over the AST database.

What About Performance?

LSP servers have a reputation for heavy memory use and sluggishness. Many parse the whole codebase on startup and keep the resulting ASTs in memory. Yin.vm's DaoDB is a persistent, indexed database. It doesn't need to reparse on startup: the datoms are already there, indexed for fast queries.

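DaoDB's EAVT/AEVT/AVET/VAET covering indexes turn a query like "find all references" into an index lookup: O(log n) to locate the first match plus time proportional to the number of results, rather than an O(n) scan. The database is updated incrementally as datoms arrive, not rebuilt from scratch.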
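To see where the logarithm comes from, here is a Python sketch of an AVET-style index: datoms sorted by (attribute, value, entity) and searched with binary search. A sorted list stands in for whatever tree structure DaoDB actually uses, which the post doesn't specify, and the sketch assumes integer entity IDs.

```python
from bisect import bisect_left, bisect_right

def build_avet(datoms):
    """Sort (e, a, v) datoms into (attribute, value, entity) order."""
    return sorted((a, v, e) for e, a, v in datoms)

def lookup(avet, attribute, value):
    """Entities with `attribute` == `value`: two binary searches and a slice."""
    lo = bisect_left(avet, (attribute, value))
    # float("inf") sorts after every integer entity ID with this prefix.
    hi = bisect_right(avet, (attribute, value, float("inf")))
    return [e for _, _, e in avet[lo:hi]]

datoms = [
    (1, ":ast/type", ":function-definition"),
    (1, ":ast/name", "calculateTotal"),
    (2, ":ast/type", ":variable-reference"),
    (2, ":ast/name", "calculateTotal"),
    (3, ":ast/type", ":variable-reference"),
    (3, ":ast/name", "total"),
    (4, ":ast/type", ":variable-reference"),
    (4, ":ast/name", "calculateTotal"),
]

avet = build_avet(datoms)
named = lookup(avet, ":ast/name", "calculateTotal")
```

The nodes named calculateTotal sit next to each other in the index, so finding them costs two binary searches plus a slice, however many unrelated datoms the store holds.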

For massive codebases, you can shard by project, module, or namespace. Datalog queries can run distributed. This is already how DaoDB scales for data; the same architecture works for code.

The Universal Language Server Is a Database

LSP's great insight was standardization: one protocol for many IDEs and languages. Yin.vm's insight is deeper: the language server shouldn't be a server at all. It should be a database.

A database that:

  • Stores code as semantic ASTs, not text
  • Indexes everything for fast queries
  • Preserves full history automatically
  • Merges collaborative edits naturally
  • Works across all languages in the same way

The IDE becomes a database client. Language servers disappear. Language‑specific tooling disappears. What remains is a universal query interface over the semantic structure of code.

This is the next step after LSP. Not a better protocol for talking to language servers, but eliminating the need for language servers altogether.

Getting There

The path from today's LSP‑based world to Yin.vm's query‑based world requires:

  1. Universal AST adoption: languages need parsers that emit Yin.vm's AST (like the Yang compiler for Clojure)
  2. IDE integration: IDEs need to speak Datalog to DaoDB instead of JSON‑RPC to LSP servers
  3. Incremental migration: tools that convert existing codebases to AST datoms

But the payoff is immense: tooling that works across languages, understands code semantically, and preserves history automatically.

LSP showed us the power of standardization. Yin.vm shows us the power of semantic representation. When code is stored as queryable AST datoms, the language server protocol becomes obsolete.

Learn more: