Document semantic attractors concept
@@ -104,6 +104,32 @@ edges are intentionally shortest and most elastic; deployment-to-repo edges are
longer and looser so infrastructure placement does not collapse into the repo
node.

## Semantic Attractor Modes

Semantic attractors are view-only topic poles that can pull graph entities
toward conceptual neighborhoods in spring-based layouts. For repository maps,
an operator might choose attractors such as `security`, `development`, and
`operations`; Fabric can then score each repository's semantic closeness to
those attractors from repo-owned `SCOPE.md` evidence and map the score to
layout strength.

Attractors are not domain edges and do not change Fabric graph data. They may
be materialized as synthetic display-only nodes and `semantic_attraction`
edges, or carried as top-level view metadata that the renderer turns into
layout forces. Attraction scores should remain inspectable, with source
references and confidence, so the operator can understand why a repository was
pulled toward a topic.

Unlike zones, attractors may overlap. A repository can be close to both
`development` and `operations`, and the layout should place it between those
poles. Zone resolvers, boundary diagnostics, dependency queries, blast-radius
queries, and collapsed-zone boundary edges should ignore semantic attraction
edges unless a host explicitly promotes an attractor relation into canonical
graph data.

See `docs/semantic-attractors.md` for the concept model, scoring semantics,
payload direction, and implementation path.

## Display State Ownership

The contract allows either the host service or the engine to evaluate display
340
docs/semantic-attractors.md
Normal file
@@ -0,0 +1,340 @@
# Semantic Attractors

## Intent

Semantic attractors are view entities that help an operator orient inside a
medium or large graph. An attractor represents a topic, concern, capability
area, operating mode, or other conceptual pole such as `security`,
`development`, `operations`, `identity`, `data`, or `delivery`.

The graph explorer can place attractors on the canvas and connect graph
entities to them with view-only relationship strength. The stronger an
entity's semantic closeness to an attractor, the more that attractor should
pull the entity in force-directed or spring-based layouts.

The first motivating use case is repository orientation. Given a set of
repositories, the operator defines attractors such as `security`,
`development`, and `operations`. Railiance Fabric reads each repository's
`SCOPE.md`, estimates semantic closeness to those attractors, and maps that
score to layout force. The resulting map becomes a navigational surface: repos
with similar purpose drift toward the same conceptual pole without replacing
the underlying dependency or responsibility graph.

## What Attractors Are

An attractor is not a fabric node in the source graph. It is a graph-view
artifact with these responsibilities:

- name a topic or concern that is useful for orientation;
- define how closeness to that topic is measured;
- expose a score for each eligible entity;
- translate that score into layout hints and optional visual edges;
- keep the scoring evidence inspectable so the map does not become mysterious.

Attractors should be saved as view/profile configuration, operator presets, or
host-provided explorer configuration. They should not mutate repo-owned Fabric
declarations, and they should not imply that a repository provides or consumes
a capability.

## Why This Helps

Dependency edges answer "what depends on what?" Ownership and deployment
metadata answer "who owns this?" and "where does this run?" Those questions are
necessary, but they can still leave a large repo collection hard to scan.

Attractors answer a softer question: "what is this near, conceptually?"

This gives operators a fast way to discover clusters such as:

- repos that are security-heavy but not obvious from their names;
- operations tooling that depends on development systems;
- application repos that are unexpectedly close to platform/runtime concerns;
- thin adapter repos that sit between two conceptual poles;
- orphaned or ambiguous repos that have weak attraction to every known topic.
## Core Model

An attractor definition should be serializable and stable:

```yaml
id: security
label: Security
description: Identity, authorization, secrets, MFA, audit, policy, and trust boundaries.
applies_to:
  layers: [repository]
evidence:
  sources:
    - type: scope_markdown
      path: SCOPE.md
scoring:
  method: lexical_semantic_profile
  anchors:
    - security
    - identity
    - authorization
    - secrets
    - audit
    - policy
    - mfa
  negative_anchors:
    - unrelated
normalization:
  mode: per_entity_softmax
layout:
  min_score: 0.15
  max_score: 1.0
  strength_scale: 0.8
  ideal_length:
    min: 80
    max: 420
presentation:
  color: "#be123c"
  edge_style: dashed
```

The exact schema can evolve, but the responsibilities should remain separate:

- `applies_to` chooses which graph elements can be scored.
- `evidence` declares which text or metadata is used.
- `scoring` defines the semantic metric.
- `normalization` turns raw scores into comparable view weights.
- `layout` maps weights to graph layout hints.
- `presentation` controls the optional visual attractor node and edges.
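The example names a `per_entity_softmax` normalization mode without defining it. One plausible reading, offered here only as an assumption, is a softmax over a single entity's raw scores across all attractors. The function name, the `temperature` parameter, and the input shape are hypothetical. A softmax makes the weights for one entity sum to 1, so they read as relative shares across attractors rather than independent closeness values.

```python
import math


def per_entity_softmax(raw_scores: dict[str, float], temperature: float = 1.0) -> dict[str, float]:
    """Turn one entity's raw attractor scores into weights that sum to 1."""
    if not raw_scores:
        return {}
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(raw_scores.values())
    exps = {aid: math.exp((raw - peak) / temperature) for aid, raw in raw_scores.items()}
    total = sum(exps.values())
    return {aid: value / total for aid, value in exps.items()}


# Hypothetical raw scores for one repository.
weights = per_entity_softmax({"security": 3.0, "development": 1.0, "operations": 1.0})
```

A higher `temperature` flattens the distribution, which may suit repositories that are expected to sit between several poles.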

## Scoring From SCOPE.md

`SCOPE.md` is a useful first evidence source because it is intentionally short,
repo-owned, and written to explain when a repository is relevant. For repository
attraction, the scorer should use sections such as:

- `One-liner`
- `Core Idea`
- `In Scope`
- `Relevant When`
- `Provided Capabilities`
- `Related / Overlapping Repositories`
- `Terminology`

Sections such as `Out of Scope` and `Not Relevant When` should be used
carefully. They can reduce false positives, but they should not erase a topic
just because the repo mentions a boundary. For example, a repo can say it is
not an authorization engine while still being semantically near security
because it models secrets, policy, or trust boundaries.

The first implementation can use a transparent lexical profile:

1. Parse `SCOPE.md` into sections.
2. Tokenize section text and provided capability keywords.
3. Weight section matches, with `One-liner`, `Core Idea`, `In Scope`, and
   capability keywords carrying more weight than incidental notes.
4. Score each attractor by matching configured anchors and related terms.
5. Normalize scores per entity so one verbose `SCOPE.md` does not dominate.
6. Store the score, confidence, and top evidence snippets in the view payload.

Later implementations can replace or augment lexical scoring with embeddings,
LLM-assisted classification, or operator-reviewed labels. The contract should
not depend on a particular scorer.
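The transparent lexical profile described above could be sketched roughly as follows. This is a minimal illustration, not the Fabric implementation: the section weights, the `SATURATION` constant, the confidence heuristic, and both function names are assumptions. Boundary sections are simply ignored here, which is one cautious way to avoid erasing a topic. Only single-word anchors are matched.

```python
import re

# Hypothetical section weights; headings not listed fall back to DEFAULT_WEIGHT.
SECTION_WEIGHTS = {
    "one-liner": 3.0,
    "core idea": 3.0,
    "in scope": 2.0,
    "provided capabilities": 2.0,
    # Assumption: boundary sections are ignored instead of counted against a topic.
    "out of scope": 0.0,
    "not relevant when": 0.0,
}
DEFAULT_WEIGHT = 1.0
SATURATION = 4.0  # hypothetical constant: how quickly scores approach 1


def parse_sections(markdown: str) -> dict[str, str]:
    """Split Markdown into {heading: body}; text before any heading is 'preamble'."""
    sections: dict[str, list[str]] = {"preamble": []}
    current = "preamble"
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections.setdefault(current, [])
        else:
            sections[current].append(line)
    return {name: "\n".join(body) for name, body in sections.items()}


def score_attractor(markdown: str, anchors: list[str]) -> dict:
    """Score one SCOPE.md against one attractor's anchor terms."""
    anchor_set = {anchor.lower() for anchor in anchors}
    raw = 0.0
    token_total = 0
    evidence = []
    for name, text in parse_sections(markdown).items():
        tokens = re.findall(r"[a-z0-9]+", text.lower())
        token_total += len(tokens)
        weight = SECTION_WEIGHTS.get(name.lower(), DEFAULT_WEIGHT)
        hits = sorted(anchor_set & set(tokens))
        if not hits or weight == 0.0:
            continue
        # Count distinct terms, not occurrences, so verbose files do not dominate.
        raw += weight * len(hits)
        evidence.append({"source": "SCOPE.md", "section": name, "terms": hits})
    return {
        "score": round(raw / (raw + SATURATION), 3),  # squashed into [0, 1)
        # Hypothetical confidence: thin files earn less trust.
        "confidence": round(min(1.0, token_total / 150), 3),
        "method": "lexical_semantic_profile",
        "evidence": evidence,
    }
```

The returned dictionary mirrors the fields a score entry needs (score, confidence, method, evidence), so a different scorer could be swapped in without changing the payload.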

## Score Semantics

Attractor scores should be continuous values in `[0, 1]`.

Suggested interpretation:

| Score | Meaning |
|-------|---------|
| `0.00` | no useful evidence of semantic closeness |
| `0.10` to `0.30` | weak signal; useful only as a faint layout hint |
| `0.30` to `0.60` | moderate closeness; entity should visibly lean toward the attractor |
| `0.60` to `0.85` | strong closeness; entity likely belongs near the attractor cluster |
| `0.85` to `1.00` | primary semantic identity or explicit operator label |

Every score should carry a confidence separate from closeness. A repo with a
thin or missing `SCOPE.md` may have low confidence even if a few terms match.

Attractors should also support multi-attraction. A repository can be close to
both `development` and `operations`; the layout should then place it between
those poles instead of forcing a single category. This is the main difference
from zones: zones preserve a single-surface invariant, while attractors are
allowed to overlap because they are layout forces, not containers.

## Layout Mapping

Attraction scores become layout hints. They should not become domain edges.

A graph explorer can map scores to synthetic view edges:

```json
{
  "data": {
    "id": "attractor:security->repo:flex-auth",
    "source": "attractor:security",
    "target": "repo:flex-auth",
    "edgeType": "semantic_attraction",
    "displayOnly": true,
    "score": 0.82,
    "confidence": 0.74,
    "strength": "strong",
    "layoutAffinity": 0.82,
    "layoutIdealLength": 110,
    "layoutElasticity": 0.9,
    "sourceReferences": [
      {
        "type": "scope_markdown",
        "path": "SCOPE.md",
        "section": "In Scope"
      }
    ]
  },
  "classes": "semantic-attraction"
}
```

For force-directed layouts:

- stronger scores should increase spring strength or edge weight;
- stronger scores should shorten ideal length;
- weak scores may be hidden visually while still applying a small force;
- edges below a configured threshold should not affect layout;
- display-only attraction edges should be excluded from dependency, boundary,
  blast-radius, and zone-connectivity diagnostics.

Attractor nodes can be pinned, arranged on a ring, placed by the operator, or
computed from the current profile. For first use, a stable radial placement is
usually enough: place three to eight attractors around the graph, then let
repositories find their balance.
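The mapping rules and the radial placement above could be sketched like this. The defaults reuse the `layout` values from the example definition, but the linear interpolation, the band labels other than `strong`, the treatment of shared band boundaries, and both function names are assumptions. In particular, this interpolation does not reproduce the exact `layoutAffinity` and `layoutIdealLength` numbers shown in the example edge.

```python
import math


def strength_band(score: float) -> str:
    """Map a score to a band label; shared boundaries go to the higher band."""
    if score >= 0.85:
        return "primary"
    if score >= 0.60:
        return "strong"
    if score >= 0.30:
        return "moderate"
    if score >= 0.10:
        return "weak"
    return "none"


def layout_hints(score, *, min_score=0.15, max_score=1.0, strength_scale=0.8,
                 length_min=80.0, length_max=420.0):
    """Return layout hints for one attraction, or None below the threshold."""
    if score < min_score:
        return None  # below threshold: the edge should not affect layout
    # Position of the score inside [min_score, max_score], clamped to 1.
    t = (min(score, max_score) - min_score) / (max_score - min_score)
    return {
        "strength": strength_band(score),
        # Stronger scores pull harder ...
        "layoutAffinity": round(strength_scale * t, 3),
        # ... and shorten the ideal spring length.
        "layoutIdealLength": round(length_max - t * (length_max - length_min)),
    }


def ring_positions(attractor_ids, radius=400.0, center=(0.0, 0.0)):
    """Place attractors evenly on a ring; sorted ids keep the layout stable."""
    ordered = sorted(attractor_ids)
    positions = {}
    for index, attractor_id in enumerate(ordered):
        # Start at the top of the canvas (screen y grows downward).
        angle = 2 * math.pi * index / len(ordered) - math.pi / 2
        positions[attractor_id] = (
            center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle),
        )
    return positions
```

Sorting the ids is one simple way to get the "stable" placement the text asks for: the same definition set always yields the same ring, regardless of input order.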

## View Payload Shape

The graph explorer payload should be able to carry attractor metadata without
changing the canonical Fabric graph.

Recommended top-level view extension:

```json
{
  "view": {
    "attractors": {
      "enabled": true,
      "definitionSet": "repo-concerns-v1",
      "definitions": [
        {
          "id": "security",
          "label": "Security",
          "description": "Identity, authorization, secrets, audit, and policy.",
          "color": "#be123c"
        }
      ],
      "scores": [
        {
          "attractor_id": "security",
          "element_id": "repo:flex-auth",
          "score": 0.82,
          "confidence": 0.74,
          "method": "lexical_semantic_profile",
          "evidence": [
            {
              "source": "SCOPE.md",
              "section": "Core Idea",
              "terms": ["authorization", "policy"]
            }
          ]
        }
      ]
    }
  }
}
```

The renderer may choose to materialize these into synthetic nodes and edges at
runtime. A host may also emit synthetic display-only elements directly if that
is easier for the current engine.
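As an illustration of the runtime materialization mentioned above, the following sketch turns the `view.attractors` block into synthetic display-only nodes and edges. The field names on the input come from the payload example; the function name, the `semantic-attractor` node class, and the `min_score` default are assumptions.

```python
def materialize(view: dict, min_score: float = 0.15) -> list[dict]:
    """Build display-only attractor nodes and attraction edges from view metadata."""
    block = view.get("attractors") or {}
    if not block.get("enabled"):
        return []
    elements = []
    known_ids = set()
    for definition in block.get("definitions", []):
        known_ids.add(definition["id"])
        elements.append({
            "data": {
                "id": f"attractor:{definition['id']}",
                "label": definition["label"],
                "color": definition.get("color"),
                "displayOnly": True,
            },
            "classes": "semantic-attractor",  # hypothetical class name
        })
    for entry in block.get("scores", []):
        # Skip scores for unknown attractors and scores below the threshold.
        if entry["attractor_id"] not in known_ids or entry["score"] < min_score:
            continue
        source = f"attractor:{entry['attractor_id']}"
        elements.append({
            "data": {
                "id": f"{source}->{entry['element_id']}",
                "source": source,
                "target": entry["element_id"],
                "edgeType": "semantic_attraction",
                "displayOnly": True,
                "score": entry["score"],
                "confidence": entry.get("confidence"),
                # Keep evidence attached so the edge stays inspectable.
                "sourceReferences": entry.get("evidence", []),
            },
            "classes": "semantic-attraction",
        })
    return elements
```

Because the function only reads view metadata and returns new elements, the canonical graph elements are never touched; the caller decides whether to append the result to the rendered element list.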

## Operator Workflow

A useful attractor workflow should feel like mapmaking:

1. Choose a preset such as `Security / Development / Operations`.
2. Review the generated scores and evidence for a few known repos.
3. Hide or pin attractors that are not useful for the current question.
4. Save the attractor definition set in the graph profile.
5. Use the resulting layout to discover ambiguous, central, or misplaced repos.

The UI should expose:

- a toggle for semantic attractors;
- a definition-set selector;
- score threshold controls;
- optional visual attraction edges;
- pinned/unpinned attractor placement;
- detail panels explaining why a repo is close to an attractor;
- diagnostics for missing evidence, low confidence, and overly broad
  attractors.
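The three diagnostics in the last item could be derived from the score entries along these lines. This is a sketch under assumptions: the thresholds, the function name, and the definition of "overly broad" (an attractor that most elements lean toward) are all hypothetical.

```python
from collections import defaultdict


def diagnose(scores, element_ids, *, lean_threshold=0.30,
             low_confidence=0.40, broad_share=0.80):
    """Flag missing evidence, low-confidence scores, and overly broad attractors."""
    scored = set()
    leaning = defaultdict(set)  # attractor id -> elements that lean toward it
    uncertain = []
    for entry in scores:
        attractor_id, element_id = entry["attractor_id"], entry["element_id"]
        scored.add(element_id)
        if entry["score"] >= lean_threshold:
            leaning[attractor_id].add(element_id)
        if entry["confidence"] < low_confidence:
            uncertain.append((attractor_id, element_id))
    total = len(element_ids)
    return {
        # Elements with no score entry at all, e.g. no usable SCOPE.md.
        "missing_evidence": sorted(set(element_ids) - scored),
        "low_confidence": sorted(uncertain),
        # Attractors that pull most of the graph say little about any one repo.
        "broad_attractors": sorted(
            attractor_id for attractor_id, members in leaning.items()
            if total and len(members) / total >= broad_share
        ),
    }
```

A detail panel could show these lists next to the evidence snippets so the operator can decide whether to refine anchors or hide an attractor.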

## Relationship To Zones

Zones and attractors solve different orientation problems.

Zones are bounded drawing surfaces. A visible node belongs to zero or one zone
in a given view. They are useful for deployment environments, access zones,
ownership surfaces, and other container-like questions.

Attractors are semantic force points. A visible node can be pulled by multiple
attractors at once. They are useful for topical orientation, concern mapping,
and discovering conceptual neighborhoods.

The two concepts can combine cleanly:

- zones can show where entities run;
- attractors can pull repos inside or outside those zones based on semantic
  concern;
- zone diagnostics should ignore semantic attraction edges unless explicitly
  configured otherwise;
- attractor scores can be summarized inside zone details.
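One way to honor the "diagnostics should ignore semantic attraction edges" rule is to filter display-only edges before any graph query runs. The sketch below assumes elements shaped like the edge example earlier (a `data` object with `source`, `target`, `edgeType`, and `displayOnly`); the function names and the `depends_on` edge type in the usage are hypothetical.

```python
from collections import deque


def canonical_edges(elements):
    """Return (source, target) pairs, skipping nodes and display-only edges."""
    edges = []
    for element in elements:
        data = element.get("data", {})
        if "source" not in data or "target" not in data:
            continue  # nodes carry no source/target
        if data.get("displayOnly") or data.get("edgeType") == "semantic_attraction":
            continue  # view-only attraction must not influence diagnostics
        edges.append((data["source"], data["target"]))
    return edges


def reachable(start, elements):
    """Breadth-first reachability over canonical edges only."""
    adjacency = {}
    for source, target in canonical_edges(elements):
        adjacency.setdefault(source, []).append(target)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    seen.discard(start)
    return seen


elements = [
    {"data": {"id": "repo:web"}},
    {"data": {"source": "repo:web", "target": "repo:flex-auth", "edgeType": "depends_on"}},
    {"data": {"source": "attractor:security", "target": "repo:flex-auth",
              "edgeType": "semantic_attraction", "displayOnly": True}},
]
```

Filtering at a single choke point like `canonical_edges` keeps dependency, boundary, and blast-radius queries consistent without each of them needing to know about attractors.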

## Initial Presets

A first repository-orientation preset should keep the set small:

| Attractor | Topic Signal |
|-----------|--------------|
| `security` | identity, secrets, authorization, policy, audit, MFA, trust boundaries |
| `development` | source code, build, CI/CD, package publishing, scaffolding, developer workflows |
| `operations` | deployment, runtime, monitoring, backups, incidents, infrastructure lifecycle |

Useful follow-up presets:

- `data`, `identity`, `delivery`, `governance`
- `platform`, `application`, `tooling`
- `financial`, `runtime`, `coordination`

Attractors should start as operator-chosen presets rather than global truth.
The same repository can be viewed through different conceptual lenses.

## Implementation Path

The concept can be implemented incrementally:

1. Add an attractor definition format for graph explorer profiles.
2. Parse repo `SCOPE.md` files during registry sync or graph export.
3. Compute transparent lexical scores for repositories.
4. Include attractor scores and evidence in the graph explorer payload.
5. Add synthetic attractor nodes and display-only attraction edges in the UI.
6. Map attraction scores to layout hints for the force-directed layout.
7. Add detail-panel evidence and low-confidence diagnostics.
8. Support saved attractor presets and operator score overrides.

This keeps attractors as a view concern until the scoring model proves useful.
If a semantic relation becomes durable domain knowledge, it can later be
promoted into a proper Fabric declaration with separate evidence and review.

## Open Questions

- Should attractor definitions live in graph profiles, repo config, or a shared
  registry preset file?
- Should scoring run during registry sync, export, or entirely in the browser?
- How much operator override should be allowed before scores become maintained
  labels rather than computed evidence?
- What is the right default for missing or stale `SCOPE.md` evidence?
- Should the first implementation use only lexical scoring, or should it also
  prepare a pluggable embedding scorer interface?