# Findings — Logseq: block-graph semantics on plain Markdown files

Date: 2026-06-14

Source kind: **modern shipped product** — an open-source, local-first outliner; a *candidate shard* occupying the point none of the prior tools do: **block-graph + file-over-app**

Lens: shard-wiki — the convergence of block-level addressing with git-diffable files, a derived query index over canonical files, and a live file→DB storage migration

> Why Logseq matters distinctly. The modern-tool dives staked out the corners: **Roam**
> = block-graph but client-DB / in-app-API / store-minted UUID; **Obsidian** =
> file-over-app but page-level / path-addressed (block `^id` opt-in); **Notion** =
> block-graph but closed hosted DB / external-API. **Logseq sits in the middle they
> leave empty: block-graph *semantics* (UUID-addressable, embeddable, queryable blocks)
> stored as *plain Markdown/Org files on disk you own*, with a DataScript index
> **derived** from those files.** It is the system that proves you do not have to choose
> between Roam's fine-grained addressing and Obsidian's file sovereignty — you can have
> block-level addressing *that is also git-diffable text*. That resolves a tension the
> synthesis flagged (addressing spectrum) and is the reason it earns a memo rather than
> a footnote.

Contrast set: Roam (block-DB, in-app), Obsidian (file, page-level), Notion (hosted DB), Joplin (SQLite-local, files-on-sync). Logseq = **block-graph, files-canonical, index-derived** — and now *also* migrating toward SQLite (§5), a live UC-43 case.

---

## 1. Core architecture — files canonical, DataScript derived

- **Storage:** plain-text **Markdown or Org-mode files on disk** — `pages/.md` and `journals/.md` — fully usable even if the app disappears (local-first, open source, ClojureScript).
- **Index:** a **DataScript** in-memory graph database, **derived** from the files: an `mldoc` parser turns Markdown into an AST, extracts references/properties, and transforms it into **DataScript entities** with tree-position attributes. Files are the source of truth; the graph DB is a rebuildable projection. (In current versions a **DB Worker** manages **both DataScript and SQLite** connections — see §5.)
- This is the **files-canonical / index-derived** architecture (Obsidian's MetadataCache, Git's working tree) — but here the derived index is a *full Datalog-queryable graph*, not just a metadata cache. Logseq is the strongest evidence that **a file-backed shard can support rich structured queries via a derived index** (UC-52, UC-63).

---

## 2. The block-graph, in the file text

Everything is an **atomic block** (a bullet), individually referenceable, embeddable, and queryable — Roam's model — but the addressing lives **in the Markdown**:

- **Block IDs are in-file properties.** When a block is referenced, Logseq writes an `id:: ` property line into the Markdown. So the block's stable address is **git-diffable text that survives a file copy**, not a DB-minted hidden key. This is the **sweet spot of the addressing spectrum**: block-level (like Roam) *and* in-file/portable (like Obsidian's `^id`), without choosing.
- **References:** `[[page]]`, **`((block-uuid))`**, `#tag` — all extracted into the graph; block embeds `{{embed ((uuid))}}` are transclusion at block granularity (UC-32).
- **Properties:** `key:: value` lines attach typed metadata to blocks (block properties) and pages (first-block/page properties) — **git-diffable structured data at block granularity** (UC-34), queried by Datalog.
- **Outline tree:** a page is a **nested tree of blocks** (indent = structure), with outliner operations — indent/outdent, move subtree, **zoom/narrow** into a block. The page is not flat prose; it is an addressable, reorderable block tree (UC-63).

---

## 3. Queries — Datalog over the derived graph

Logseq exposes **`{{query}}`** (simple) and **advanced Datalog queries** over the DataScript graph (and via the plugin API, `logseq.DB.datascriptQuery`). Because the graph is derived from files, **query-defined pages** (UC-54) and structured aggregation (tasks, tagged blocks) run over a *file-backed* store.

Key lesson: query capability here is **neither native-DB (Notion) nor a third-party plugin (Obsidian Dataview)** — it is **built into the tool over its own files**, demonstrating that *a derived query index is a viable adapter/orchestrator capability for file shards* (UC-63).

---

## 4. Extension surface — plugin API + marketplace

Logseq has a JS **plugin API** (marketplace ~486 plugins, `logseq/marketplace`):

- `logseq.Editor` — block/page CRUD, insert/move/update/delete, get current block/page.
- `logseq.DB` — **`datascriptQuery`** (Datalog), plus DB change subscriptions.
- `logseq.App` — app-level ops/events; `logseq.UI`; `logseq.Git` (git ops on the graph); `logseq.Assets`; `logseq.FileStorage`.
- Slash commands, block/page context menus, `provideModel`/`provideStyle`, hooks/events.

So, like Obsidian, Logseq is **dual-attachable**: (a) **file-store direct** — read the `pages/`+`journals/` Markdown with a **Logseq format profile** (parse `id::`, `((uuid))`, `key::`, the outline tree); (b) **in-app plugin host** — `logseq.Editor` write + `logseq.DB` query + change events for live write-through.

A notable extra: a built-in **`logseq.Git`** surface — the tool treats git as a first-class companion to the file graph (validating the coordination journal).

---

## 5. The file→DB migration — a live UC-43

Logseq is **migrating its storage model from Markdown-files to a SQLite "DB graph"** (the DB Worker already manages SQLite alongside DataScript; the plugin API has a distinct "DB graph" mode with tags/classes/typed properties).
This is a real-world instance of **UC-43 (backend-swap under stable identity)**: the *same tool and graph identity* moving from a file substrate to a DB substrate, trading git-diffability for richer typed structure (toward the Notion/XWiki end). For shard-wiki it is both a caution (a shard's substrate can change under it) and a confirmation that the **addressing/structure/history spectra** are trajectories tools actually travel — an adapter keyed to a fixed substrate will break.

---

## 6. Logseq as a shard — capability profile

| Capability | Logseq (MD-file graph) | Notes for the adapter contract |
|------------|------------------------|--------------------------------|
| Read | **yes** | file-store direct (pages/journals MD) or `logseq.Editor`/`logseq.DB` in-app |
| Write | **yes** | direct file write (format-aware) or plugin; block-level |
| Write granularity | **per-block** (in a per-file page) | finer than Obsidian (per-file), like Roam — but on files |
| Identity / addressing | **in-file block `id:: uuid` + `((uuid))`** | block-level **and** git-diffable — the addressing sweet spot (UC-51) |
| Structure | `key:: value` block/page properties; outline tree | git-diffable structured data at block granularity (UC-34) |
| History | none native; **`logseq.Git`** + files = git-native | git-friendly out of the box (adopt, not supplement) |
| Native query | **Datalog over derived DataScript** | derived index over files → delegate or rebuild (UC-52, UC-63) |
| Transclusion | **block embeds `{{embed ((uuid))}}`** | in-file, addressable (UC-32) |
| Backlinks | linked + unlinked references | derived (UC-05/18) |
| Content types | Markdown/Org + **whiteboards** (tldraw JSON) | non-Markdown spatial content (UC-55) |
| Substrate | **files now, SQLite "DB graph" emerging** | live backend-swap (UC-43) |
| Attach modes | file-store direct (format profile) · in-app plugin | dual, per-binding (UC-62) |

Verdict: **the most shard-wiki-friendly block tool** — block-graph power with
file-over-app sovereignty and git-diffable addressing/structure. Best attach: **file-store direct with a Logseq format profile** (offline, git-native), with the plugin for live write-through. Watch the **DB-graph migration** (UC-43).

---

## 7. Mapping to shard-wiki INTENT (compare, do not equate)

### 7.1 Reinforcements

- **Resolves the addressing tension.** Block-level addressing *can* be git-diffable in-file (`id::`), not forced to choose Roam-DB vs Obsidian-page (UC-51, synthesis §2).
- **Confirms files-canonical / index-derived at full power** — a Datalog graph derived from files, not just a metadata cache (UC-52, UC-63).
- **Structure as git-diffable text** (`key::`) reinforces "prefer in-text structure" (UC-34, synthesis through-line).
- **Outliner block tree** validates the page-as-addressable-block-tree demand (UC-50, UC-63) on a *file* backend.

### 7.2 Deliberate divergences (design bugs if conflated)

1. **One graph = one shard.** Local-first single graph; not the federation layer.
2. **The MD files carry Logseq-specific syntax** (`id::`, `((uuid))`, `key::`, outline semantics) — a **format profile**, not plain CommonMark. A naïve Markdown reader will mangle block IDs/properties (cf. UC-42; lighter than Notion's lossy case).
3. **The substrate is moving (file→SQLite).** Don't hard-code the file model; gate on the substrate capability and tolerate the swap (UC-43).
4. **Whiteboards are not Markdown** — typed/opaque assets, not flattened (UC-55).

### 7.3 What Logseq teaches that shard-wiki should keep

- **In-file block addressing is the target shape** for a portable span address where the backend cooperates — adopt `id::`-style schemes; they are git-diffable and survive copy.
- **A derived query index over files is a first-class capability** — shard-wiki can build one over a file shard (UC-63) when the backend exposes none, or delegate when it does.
- **Expect substrate migration** — bind to capabilities, not to "it's files."

---

## 8. Use-case seeds → catalog (promoted 2026-06-14)

Last existing UC is **UC-61**. New UCs **UC-62, UC-63** added; existing UCs enriched.

| Seed | Catalog action |
|------|----------------|
| **Attach a block-graph-on-plain-files shard** (Logseq-style): file-over-app Markdown carrying in-file block IDs (`id::`), block refs, and `key::` properties, with the outline tree and a derivable query index | **UC-62 (new)** |
| **Serve structured queries over a file-backed shard via a derived index** the orchestrator/adapter builds (Logseq DataScript-over-files) when the backend exposes none | **UC-63 (new)** |
| In-file block `id::` = block-level **and** git-diffable addressing (the spectrum sweet spot) | **enriches UC-51** |
| Live **file→SQLite substrate migration** under stable graph identity | **enriches UC-43** |
| Block-graph that is **files, not a DB** — the file-store variant of the block tool family | **enriches UC-50** |
| Datalog over a **derived** index built from files | **enriches UC-52** |
| `key:: value` block/page properties in-text | **enriches UC-34** |
| Whiteboards (tldraw JSON) = non-Markdown content | **enriches UC-55** |
| Block embeds `{{embed ((uuid))}}` = in-file transclusion | links UC-32 |

---

## 9. Architecture notes for SHARD-WP-0002 (no UC)

- **Format-aware file-store profiles** (already flagged by Joplin, UC-60) gain a strong case here: a Logseq profile parses `id::`/`((uuid))`/`key::`/outline from otherwise-plain Markdown. The contract should let a file-store adapter declare its **format profile** (plain MD / Obsidian / Logseq / Joplin-item / Foswiki-PlainFile). (T11/T14.)
- **Derived-query-index as an orchestrator capability** (UC-63): when a file shard has no native query engine, build a DataScript-like index over the projection; when it does (Roam/Notion/XWiki), delegate (UC-52). Both belong in T16's navigation layer + T11.
- **Substrate-migration tolerance** (UC-43, Logseq file→SQLite): T14 binding should treat substrate as a capability that can change under a live attachment, preserving identity.
- **In-file block addressing** is the concrete realization of T16's span-address thread — prefer `id::`-style git-diffable IDs over DB-minted where the backend allows.

---

## 10. Open questions (for spec / workplans)

1. When the same tool offers **file** and **DB** substrates (Logseq now), does shard-wiki prefer the file graph (git-diffable, UC-62) or the DB graph (richer typed structure), and can one binding follow the migration (UC-43)?
2. Is the **derived query index** (UC-63) built by the **adapter** (per-shard) or the **core orchestrator** (over the union), and is it persisted or rebuilt?
3. How much **Logseq outline semantics** (zoom, subtree move) must shard-wiki preserve vs. present as a flat page (UC-63 vs. Markdown-first page model)?
4. Does the **Logseq format profile** round-trip overlays (write `id::`/`key::` back) or only read them? (cf. UC-42 round-trip question.)

---

## 11. Sources

| Source | Used for |
|--------|----------|
| DeepWiki — logseq/logseq (https://deepwiki.com/logseq/logseq) | DataScript+Datalog graph, mldoc AST parse, block entities/tree, DB Worker managing DataScript **and** SQLite |
| logseq/docs (https://deepwiki.com/logseq/docs) | Plain-text MD/Org files; pages/journals; references ([[ ]], (( )), #tag); properties |
| Logseq Plugin API docs (https://plugins-doc.logseq.com/) | `logseq.App/Editor/DB/Git/UI/Assets/FileStorage`; `DB.datascriptQuery` |
| logseq/marketplace (https://github.com/logseq/marketplace) | Plugin distribution; ~486 plugins |
| kerim/logseq-db-plugin-api-skill (https://github.com/kerim/logseq-db-plugin-api-skill) | DB-graph version: tags/classes, typed properties, file→DB migration |
| pangea.app glossary — Logseq (https://pangea.app/glossary/logseq) | Local-first framing, outliner, plain-text control |

Cross-references: `research/260614-roam-deep-dive/findings.md` (block-DB contrast), `research/260614-obsidian-deep-dive/findings.md` (file-over-app, derived index), `research/260614-joplin-deep-dive/findings.md` (format-aware file-store profiles), `research/260614-shard-spectrum-synthesis/findings.md` (addressing spectrum this resolves), `spec/UseCaseCatalog.md` (UC-32, UC-34, UC-43, UC-50, UC-51, UC-52, UC-55), `workplans/SHARD-WP-0002-federation-architecture.md` (T11, T14, T16).

---

## 12. Traceability

- New UCs: **UC-62, UC-63** → `spec/UseCaseCatalog.md`.
- Enriched UCs: **UC-32, UC-34, UC-43, UC-50, UC-51, UC-52, UC-55**.
- Architecture (no UC): format-aware file-store profiles (Logseq profile); derived-query-index capability; substrate-migration tolerance; in-file block addressing as the T16 span-address target → `SHARD-WP-0002` (T11, T14, T16).
- Boundary recorded: Logseq is **one block-graph-on-files candidate shard** (the addressing sweet spot), best attached file-store-direct with a format profile; not the federation layer; substrate is migrating file→SQLite (UC-43).
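
Illustrative sketch (not from the sources): the Logseq format profile named in §4 and §9, together with the files-canonical / index-derived idea from §1, can be pictured as a small reader that parses the outline tree, `key:: value` properties, and `((uuid))` references out of page text and then derives a rebuildable index from the result. Everything here is an assumption for illustration: the names `Block`, `parse_page`, and `build_index` are hypothetical, indentation is taken to be two spaces per level (tabs are expanded to two spaces), and property lines are assumed to sit directly under their bullet. It is not Logseq's `mldoc` parser or any part of its plugin API.

```python
import re
from dataclasses import dataclass, field

# Patterns for the Logseq-specific syntax the memo lists (assumed shapes).
BULLET = re.compile(r"^(\s*)- (.*)$")                      # outline bullet
PROPERTY = re.compile(r"^\s*([A-Za-z][\w-]*):: (.*)$")     # key:: value
BLOCK_REF = re.compile(r"\(\(([0-9a-fA-F-]{36})\)\)")      # ((uuid))
PAGE_REF = re.compile(r"\[\[([^\[\]]+)\]\]")               # [[page]]


@dataclass
class Block:
    """One outline block (hypothetical type, not a Logseq entity)."""
    text: str
    depth: int
    properties: dict = field(default_factory=dict)
    children: list = field(default_factory=list)


def parse_page(markdown: str) -> list:
    """Parse one page's text into a list of root blocks."""
    roots, stack = [], []
    for raw in markdown.splitlines():
        line = raw.expandtabs(2)  # assumption: one tab equals one level
        bullet = BULLET.match(line)
        if bullet:
            depth = len(bullet.group(1)) // 2
            block = Block(text=bullet.group(2), depth=depth)
            # Pop back to the nearest ancestor shallower than this block.
            while stack and stack[-1].depth >= depth:
                stack.pop()
            (stack[-1].children if stack else roots).append(block)
            stack.append(block)
            continue
        prop = PROPERTY.match(line)
        if prop and stack:
            # Property lines attach to the most recent bullet.
            stack[-1].properties[prop.group(1)] = prop.group(2).strip()
        elif line.strip() and stack:
            # Any other non-empty line continues the current block's text.
            stack[-1].text += "\n" + line.strip()
    return roots


def build_index(roots: list) -> tuple:
    """Derive (blocks by id, backlinks by target id) from parsed blocks."""
    by_id, backlinks = {}, {}

    def visit(block: Block) -> None:
        if "id" in block.properties:
            by_id[block.properties["id"]] = block
        for target in BLOCK_REF.findall(block.text):
            backlinks.setdefault(target, []).append(block)
        for child in block.children:
            visit(child)

    for root in roots:
        visit(root)
    return by_id, backlinks


SAMPLE = """\
- Project notes
  id:: 11111111-2222-3333-4444-555555555555
  status:: draft
  - First task for [[Alice]]
  - See ((11111111-2222-3333-4444-555555555555)) for context
"""

if __name__ == "__main__":
    by_id, backlinks = build_index(parse_page(SAMPLE))
    for uid, block in by_id.items():
        print(uid, block.text, len(backlinks.get(uid, [])))
```

Because the index is computed purely from the page text, it can be discarded and rebuilt at any time, which is the property the memo relies on when it calls the files canonical and the graph a projection. A real profile would also need to cover Org-mode files, page-level properties, and writing `id::` lines back, all of which this sketch leaves out.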