generated from coulomb/repo-seed
Occupies the point the other modern tools leave empty: block-graph semantics (UUID-addressable, embeddable, queryable blocks) stored as plain Markdown/Org files on disk, with a DataScript graph derived from the files (files canonical, index derived). The bridge between Roam (block-DB) and Obsidian (file-over-app). Headline finding: Logseq resolves the addressing-spectrum tension — block-level addressing that is also git-diffable in-file text (id:: property) — and proves a file-backed shard can serve rich Datalog queries via a derived index. Also: file->SQLite "DB graph" migration is a live UC-43 (substrate swap under stable identity); whiteboards = non-Markdown content; dual-attachable (file-store direct with a Logseq format profile, or in-app plugin). Added UC-62 (attach block-graph-on-plain-files shard), UC-63 (serve structured queries over a file shard via a derived index shard-wiki builds — converse of UC-52); enriched UC-32/34/43/50/51/52/55. Catalog now 63 UCs. Architecture for SHARD-WP-0002 T11/T14/T16: Logseq format profile, derived-query-index capability, substrate-migration tolerance, in-file block addressing as the T16 span-address target. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
247 lines
14 KiB
Markdown
247 lines
14 KiB
Markdown
# Findings — Logseq: block-graph semantics on plain Markdown files
|
|
|
|
Date: 2026-06-14
|
|
Source kind: **modern shipped product** — an open-source, local-first outliner; a
|
|
*candidate shard* occupying the point none of the prior tools do: **block-graph + file-
|
|
over-app**
|
|
Lens: shard-wiki — the convergence of block-level addressing with git-diffable files, a
|
|
derived query index over canonical files, and a live file→DB storage migration
|
|
|
|
> Why Logseq matters distinctly. The modern-tool dives staked out the corners: **Roam**
|
|
> = block-graph but client-DB / in-app-API / store-minted UUID; **Obsidian** =
|
|
> file-over-app but page-level / path-addressed (block `^id` opt-in); **Notion** =
|
|
> block-graph but closed hosted DB / external-API. **Logseq sits in the middle they
|
|
> leave empty: block-graph *semantics* (UUID-addressable, embeddable, queryable blocks)
|
|
> stored as *plain Markdown/Org files on disk you own*, with a DataScript index
|
|
> **derived** from those files.** It is the system that proves you do not have to choose
|
|
> between Roam's fine-grained addressing and Obsidian's file sovereignty — you can have
|
|
> block-level addressing *that is also git-diffable text*. That resolves a tension the
|
|
> synthesis flagged (addressing spectrum) and is the reason it earns a memo rather than
|
|
> a footnote.
|
|
|
|
Contrast set: Roam (block-DB, in-app), Obsidian (file, page-level), Notion (hosted DB),
|
|
Joplin (SQLite-local, files-on-sync). Logseq = **block-graph, files-canonical,
|
|
index-derived** — and now *also* migrating toward SQLite (§5), a live UC-43 case.
|
|
|
|
---
|
|
|
|
## 1. Core architecture — files canonical, DataScript derived
|
|
|
|
- **Storage:** plain-text **Markdown or Org-mode files on disk** — `pages/<Page>.md` and
|
|
`journals/<date>.md` — fully usable even if the app disappears (local-first, open
|
|
source, ClojureScript).
|
|
- **Index:** a **DataScript** in-memory graph database, **derived** from the files: an
|
|
`mldoc` parser turns Markdown into an AST, extracts references/properties, and
|
|
transforms it into **DataScript entities** with tree-position attributes. Files are the
|
|
source of truth; the graph DB is a rebuildable projection. (In current versions a **DB
|
|
Worker** manages **both DataScript and SQLite** connections — see §5.)
|
|
- This is the **files-canonical / index-derived** architecture (Obsidian's MetadataCache,
|
|
Git's working tree) — but here the derived index is a *full Datalog-queryable graph*,
|
|
not just a metadata cache. Logseq is the strongest evidence that **a file-backed shard
|
|
can support rich structured queries via a derived index** (UC-52, UC-63).
|
|
|
|
---
|
|
|
|
## 2. The block-graph, in the file text
|
|
|
|
Everything is an **atomic block** (a bullet), individually referenceable, embeddable, and
|
|
queryable — Roam's model — but the addressing lives **in the Markdown**:
|
|
|
|
- **Block IDs are in-file properties.** When a block is referenced, Logseq writes an
|
|
`id:: <uuid>` property line into the Markdown. So the block's stable address is
|
|
**git-diffable text that survives a file copy**, not a DB-minted hidden key. This is the
|
|
**sweet spot of the addressing spectrum**: block-level (like Roam) *and* in-file/
|
|
portable (like Obsidian's `^id`), without choosing.
|
|
- **References:** `[[page]]`, **`((block-uuid))`**, `#tag` — all extracted into the graph;
|
|
block embeds `{{embed ((uuid))}}` are transclusion at block granularity (UC-32).
|
|
- **Properties:** `key:: value` lines attach typed metadata to blocks (block properties)
|
|
and pages (first-block/page properties) — **git-diffable structured data at block
|
|
granularity** (UC-34), queried by Datalog.
|
|
- **Outline tree:** a page is a **nested tree of blocks** (indent = structure), with
|
|
outliner operations — indent/outdent, move subtree, **zoom/narrow** into a block. The
|
|
page is not flat prose; it is an addressable, reorderable block tree (UC-63).
|
|
|
|
---
|
|
|
|
## 3. Queries — Datalog over the derived graph
|
|
|
|
Logseq exposes **`{{query}}`** (simple) and **advanced Datalog queries** over the
|
|
DataScript graph (and via the plugin API, `logseq.DB.datascriptQuery`). Because the graph
|
|
is derived from files, **query-defined pages** (UC-54) and structured aggregation (tasks,
|
|
tagged blocks) run over a *file-backed* store. Key lesson: query capability here is
|
|
**neither native-DB (Notion) nor a third-party plugin (Obsidian Dataview)** — it is
|
|
**built into the tool over its own files**, demonstrating that *a derived query index is a
|
|
viable adapter/orchestrator capability for file shards* (UC-63).
|
|
|
|
---
|
|
|
|
## 4. Extension surface — plugin API + marketplace
|
|
|
|
Logseq has a JS **plugin API** (marketplace ~486 plugins, `logseq/marketplace`):
|
|
|
|
- `logseq.Editor` — block/page CRUD, insert/move/update/delete, get current block/page.
|
|
- `logseq.DB` — **`datascriptQuery`** (Datalog), plus DB change subscriptions.
|
|
- `logseq.App` — app-level ops/events; `logseq.UI`; `logseq.Git` (git ops on the graph);
|
|
`logseq.Assets`; `logseq.FileStorage`.
|
|
- Slash commands, block/page context menus, `provideModel`/`provideStyle`, hooks/events.
|
|
|
|
So, like Obsidian, Logseq is **dual-attachable**: (a) **file-store direct** — read the
|
|
`pages/`+`journals/` Markdown with a **Logseq format profile** (parse `id::`, `((uuid))`,
|
|
`key::`, the outline tree); (b) **in-app plugin host** — `logseq.Editor` write +
|
|
`logseq.DB` query + change events for live write-through. A notable extra: a built-in
|
|
**`logseq.Git`** surface — the tool treats git as a first-class companion to the file
|
|
graph (validating the coordination journal).
|
|
|
|
---
|
|
|
|
## 5. The file→DB migration — a live UC-43
|
|
|
|
Logseq is **migrating its storage model from Markdown-files to a SQLite "DB graph"**
|
|
(the DB Worker already manages SQLite alongside DataScript; the plugin API has a distinct
|
|
"DB graph" mode with tags/classes/typed properties). This is a real-world instance of
|
|
**UC-43 (backend-swap under stable identity)**: the *same tool and graph identity* moving
|
|
from a file substrate to a DB substrate, trading git-diffability for richer typed
|
|
structure (toward the Notion/XWiki end). For shard-wiki it is both a caution (a shard's
|
|
substrate can change under it) and a confirmation that the **addressing/structure/history
|
|
spectra** are trajectories tools actually travel — an adapter keyed to a fixed substrate
|
|
will break.
|
|
|
|
---
|
|
|
|
## 6. Logseq as a shard — capability profile
|
|
|
|
| Capability | Logseq (MD-file graph) | Notes for the adapter contract |
|
|
|------------|------------------------|--------------------------------|
|
|
| Read | **yes** | file-store direct (pages/journals MD) or `logseq.Editor`/`logseq.DB` in-app |
|
|
| Write | **yes** | direct file write (format-aware) or plugin; block-level |
|
|
| Write granularity | **per-block** (in a per-file page) | finer than Obsidian (per-file), like Roam — but on files |
|
|
| Identity / addressing | **in-file block `id:: uuid` + `((uuid))`** | block-level **and** git-diffable — the addressing sweet spot (UC-51) |
|
|
| Structure | `key:: value` block/page properties; outline tree | git-diffable structured data at block granularity (UC-34) |
|
|
| History | none native; **`logseq.Git`** + files = git-native | git-friendly out of the box (adopt, not supplement) |
|
|
| Native query | **Datalog over derived DataScript** | derived index over files → delegate or rebuild (UC-52, UC-63) |
|
|
| Transclusion | **block embeds `{{embed ((uuid))}}`** | in-file, addressable (UC-32) |
|
|
| Backlinks | linked + unlinked references | derived (UC-05/18) |
|
|
| Content types | Markdown/Org + **whiteboards** (tldraw JSON) | non-Markdown spatial content (UC-55) |
|
|
| Substrate | **files now, SQLite "DB graph" emerging** | live backend-swap (UC-43) |
|
|
| Attach modes | file-store direct (format profile) · in-app plugin | dual, per-binding (UC-62) |
|
|
|
|
Verdict: **the most shard-wiki-friendly block tool** — block-graph power with file-over-
|
|
app sovereignty and git-diffable addressing/structure. Best attach: **file-store direct
|
|
with a Logseq format profile** (offline, git-native), with the plugin for live write-
|
|
through. Watch the **DB-graph migration** (UC-43).
|
|
|
|
---
|
|
|
|
## 7. Mapping to shard-wiki INTENT (compare, do not equate)
|
|
|
|
### 7.1 Reinforcements
|
|
|
|
- **Resolves the addressing tension.** Block-level addressing *can* be git-diffable
|
|
in-file (`id::`), not forced to choose Roam-DB vs Obsidian-page (UC-51, synthesis §2).
|
|
- **Confirms files-canonical / index-derived at full power** — a Datalog graph derived
|
|
from files, not just a metadata cache (UC-52, UC-63).
|
|
- **Structure as git-diffable text** (`key::`) reinforces "prefer in-text structure"
|
|
(UC-34, synthesis through-line).
|
|
- **Outliner block tree** validates the page-as-addressable-block-tree demand (UC-50,
|
|
UC-63) on a *file* backend.
|
|
|
|
### 7.2 Deliberate divergences (design bugs if conflated)
|
|
|
|
1. **One graph = one shard.** Local-first single graph; not the federation layer.
|
|
2. **The MD files carry Logseq-specific syntax** (`id::`, `((uuid))`, `key::`, outline
|
|
semantics) — a **format profile**, not plain CommonMark. A naïve Markdown reader will
|
|
mangle block IDs/properties (cf. UC-42; lighter than Notion's lossy case).
|
|
3. **The substrate is moving (file→SQLite).** Don't hard-code the file model; gate on the
|
|
substrate capability and tolerate the swap (UC-43).
|
|
4. **Whiteboards are not Markdown** — typed/opaque assets, not flattened (UC-55).
|
|
|
|
### 7.3 What Logseq teaches that shard-wiki should keep
|
|
|
|
- **In-file block addressing is the target shape** for a portable span address where the
|
|
backend cooperates — adopt `id::`-style schemes; they are git-diffable and survive copy.
|
|
- **A derived query index over files is a first-class capability** — shard-wiki can build
|
|
one over a file shard (UC-63) when the backend exposes none, or delegate when it does.
|
|
- **Expect substrate migration** — bind to capabilities, not to "it's files."
|
|
|
|
---
|
|
|
|
## 8. Use-case seeds → catalog (promoted 2026-06-14)
|
|
|
|
Last existing UC is **UC-61**. New UCs **UC-62, UC-63** added; existing UCs enriched.
|
|
|
|
| Seed | Catalog action |
|
|
|------|----------------|
|
|
| **Attach a block-graph-on-plain-files shard** (Logseq-style): file-over-app Markdown carrying in-file block IDs (`id::`), block refs, and `key::` properties, with the outline tree and a derivable query index | **UC-62 (new)** |
|
|
| **Serve structured queries over a file-backed shard via a derived index** the orchestrator/adapter builds (Logseq DataScript-over-files) when the backend exposes none | **UC-63 (new)** |
|
|
| In-file block `id::` = block-level **and** git-diffable addressing (the spectrum sweet spot) | **enriches UC-51** |
|
|
| Live **file→SQLite substrate migration** under stable graph identity | **enriches UC-43** |
|
|
| Block-graph that is **files, not a DB** — the file-store variant of the block tool family | **enriches UC-50** |
|
|
| Datalog over a **derived** index built from files | **enriches UC-52** |
|
|
| `key:: value` block/page properties in-text | **enriches UC-34** |
|
|
| Whiteboards (tldraw JSON) = non-Markdown content | **enriches UC-55** |
|
|
| Block embeds `{{embed ((uuid))}}` = in-file transclusion | links UC-32 |
|
|
|
|
---
|
|
|
|
## 9. Architecture notes for SHARD-WP-0002 (no UC)
|
|
|
|
- **Format-aware file-store profiles** (already flagged by Joplin, UC-60) gain a strong
|
|
case here: a Logseq profile parses `id::`/`((uuid))`/`key::`/outline from otherwise-
|
|
plain Markdown. The contract should let a file-store adapter declare its **format
|
|
profile** (plain MD / Obsidian / Logseq / Joplin-item / Foswiki-PlainFile). (T11/T14.)
|
|
- **Derived-query-index as an orchestrator capability** (UC-63): when a file shard has no
|
|
native query engine, build a DataScript-like index over the projection; when it does
|
|
(Roam/Notion/XWiki), delegate (UC-52). Both belong in T16's navigation layer + T11.
|
|
- **Substrate-migration tolerance** (UC-43, Logseq file→SQLite): T14 binding should treat
|
|
substrate as a capability that can change under a live attachment, preserving identity.
|
|
- **In-file block addressing** is the concrete realization of T16's span-address thread —
|
|
prefer `id::`-style git-diffable IDs over DB-minted where the backend allows.
|
|
|
|
---
|
|
|
|
## 10. Open questions (for spec / workplans)
|
|
|
|
1. When the same tool offers **file** and **DB** substrates (Logseq now), does shard-wiki
|
|
prefer the file graph (git-diffable, UC-62) or the DB graph (richer typed structure),
|
|
and can one binding follow the migration (UC-43)?
|
|
2. Is the **derived query index** (UC-63) built by the **adapter** (per-shard) or the
|
|
**core orchestrator** (over the union), and is it persisted or rebuilt?
|
|
3. How much **Logseq outline semantics** (zoom, subtree move) must shard-wiki preserve vs.
|
|
present as a flat page (UC-63 vs. Markdown-first page model)?
|
|
4. Does the **Logseq format profile** round-trip overlays (write `id::`/`key::` back) or
|
|
only read them? (cf. UC-42 round-trip question.)
|
|
|
|
---
|
|
|
|
## 11. Sources
|
|
|
|
| Source | Used for |
|
|
|--------|----------|
|
|
| DeepWiki — logseq/logseq (https://deepwiki.com/logseq/logseq) | DataScript+Datalog graph, mldoc AST parse, block entities/tree, DB Worker managing DataScript **and** SQLite |
|
|
| logseq/docs (https://deepwiki.com/logseq/docs) | Plain-text MD/Org files; pages/journals; references ([[ ]], (( )), #tag); properties |
|
|
| Logseq Plugin API docs (https://plugins-doc.logseq.com/) | `logseq.App/Editor/DB/Git/UI/Assets/FileStorage`; `DB.datascriptQuery` |
|
|
| logseq/marketplace (https://github.com/logseq/marketplace) | Plugin distribution; ~486 plugins |
|
|
| kerim/logseq-db-plugin-api-skill (https://github.com/kerim/logseq-db-plugin-api-skill) | DB-graph version: tags/classes, typed properties, file→DB migration |
|
|
| pangea.app glossary — Logseq (https://pangea.app/glossary/logseq) | Local-first framing, outliner, plain-text control |
|
|
|
|
Cross-references: `research/260614-roam-deep-dive/findings.md` (block-DB contrast),
|
|
`research/260614-obsidian-deep-dive/findings.md` (file-over-app, derived index),
|
|
`research/260614-joplin-deep-dive/findings.md` (format-aware file-store profiles),
|
|
`research/260614-shard-spectrum-synthesis/findings.md` (addressing spectrum this
|
|
resolves), `spec/UseCaseCatalog.md` (UC-32, UC-34, UC-43, UC-50, UC-51, UC-52, UC-55),
|
|
`workplans/SHARD-WP-0002-federation-architecture.md` (T11, T14, T16).
|
|
|
|
---
|
|
|
|
## 12. Traceability
|
|
|
|
- New UCs: **UC-62, UC-63** → `spec/UseCaseCatalog.md`.
|
|
- Enriched UCs: **UC-32, UC-34, UC-43, UC-50, UC-51, UC-52, UC-55**.
|
|
- Architecture (no UC): format-aware file-store profiles (Logseq profile); derived-query-
|
|
index capability; substrate-migration tolerance; in-file block addressing as the T16
|
|
span-address target → `SHARD-WP-0002` (T11, T14, T16).
|
|
- Boundary recorded: Logseq is **one block-graph-on-files candidate shard** (the
|
|
addressing sweet spot), best attached file-store-direct with a format profile; not the
|
|
federation layer; substrate is migrating file→SQLite (UC-43).
|
|
</content>
|