# Semantic Attractors

## Intent

Semantic attractors are view entities that help an operator orient inside a
medium or large graph. An attractor represents a topic, concern, capability
area, operating mode, or other conceptual pole such as `security`,
`development`, `operations`, `identity`, `data`, or `delivery`.

The graph explorer can place attractors on the canvas and connect graph
entities to them through view-only relationships that carry a strength. The
stronger an entity's semantic closeness to an attractor, the more that
attractor should pull the entity in force-directed or spring-based layouts.

The first motivating use case is repository orientation. Given a set of
repositories, the operator defines attractors such as `security`,
`development`, and `operations`. Railiance Fabric reads each repository's
`SCOPE.md`, estimates semantic closeness to those attractors, and maps that
score to layout force. The resulting map becomes a navigational surface: repos
with similar purpose drift toward the same conceptual pole without replacing
the underlying dependency or responsibility graph.

## What Attractors Are

An attractor is not a fabric node in the source graph. It is a graph-view
artifact with these responsibilities:

- name a topic or concern that is useful for orientation;
- define how closeness to that topic is measured;
- expose a score for each eligible entity;
- translate that score into layout hints and optional visual edges;
- keep the scoring evidence inspectable so the map does not become mysterious.

Attractors should be saved as view/profile configuration, operator presets, or
host-provided explorer configuration. They should not mutate repo-owned Fabric
declarations, and they should not imply that a repository provides or consumes
a capability.

## Why This Helps

Dependency edges answer "what depends on what?" Ownership and deployment
metadata answer "who owns this?" and "where does this run?" Those questions are
necessary, but they can still leave a large repo collection hard to scan.

Attractors answer a softer question: "what is this near, conceptually?"

This gives operators a fast way to discover clusters such as:

- repos that are security-heavy but not obvious from their names;
- operations tooling that depends on development systems;
- application repos that are unexpectedly close to platform/runtime concerns;
- thin adapter repos that sit between two conceptual poles;
- orphaned or ambiguous repos that have weak attraction to every known topic.

## Core Model

An attractor definition should be serializable and stable:

```yaml
id: security
label: Security
description: Identity, authorization, secrets, MFA, audit, policy, and trust boundaries.
applies_to:
  layers: [repository]
evidence:
  sources:
    - type: scope_markdown
      path: SCOPE.md
scoring:
  method: lexical_semantic_profile
  anchors:
    - security
    - identity
    - authorization
    - secrets
    - audit
    - policy
    - mfa
  negative_anchors:
    - unrelated
normalization:
  mode: per_entity_softmax
layout:
  min_score: 0.15
  max_score: 1.0
  strength_scale: 0.8
  ideal_length:
    min: 80
    max: 420
presentation:
  color: "#be123c"
  edge_style: dashed
```

The exact schema can evolve, but the responsibilities should remain separate:

- `applies_to` chooses which graph elements can be scored.
- `evidence` declares which text or metadata is used.
- `scoring` defines the semantic metric.
- `normalization` turns raw scores into comparable view weights.
- `layout` maps weights to graph layout hints.
- `presentation` controls the optional visual attractor node and edges.
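
One way to keep those responsibilities separate in code is a small typed model whose fields mirror the YAML keys. The sketch below is illustrative only: the class names, the `attractor_from_dict` helper, and the flattened field names are assumptions, not an existing Railiance Fabric API. It expects a mapping that has already been parsed from YAML or JSON.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class LayoutConfig:
    # Defaults copied from the example definition; they are not normative.
    min_score: float = 0.15
    max_score: float = 1.0
    strength_scale: float = 0.8
    ideal_length_min: float = 80.0
    ideal_length_max: float = 420.0


@dataclass(frozen=True)
class AttractorDefinition:
    id: str
    label: str
    description: str = ""
    layers: tuple[str, ...] = ("repository",)         # applies_to
    evidence_sources: tuple[str, ...] = ()            # evidence
    scoring_method: str = "lexical_semantic_profile"  # scoring
    anchors: tuple[str, ...] = ()
    negative_anchors: tuple[str, ...] = ()
    normalization_mode: str = "per_entity_softmax"    # normalization
    layout: LayoutConfig = field(default_factory=LayoutConfig)
    color: str | None = None                          # presentation


def attractor_from_dict(raw: dict[str, Any]) -> AttractorDefinition:
    """Build a definition from an already-parsed mapping (hypothetical helper)."""
    scoring = raw.get("scoring", {})
    layout = raw.get("layout", {})
    ideal = layout.get("ideal_length", {})
    defaults = LayoutConfig()
    return AttractorDefinition(
        id=raw["id"],
        label=raw.get("label", raw["id"]),
        description=raw.get("description", ""),
        layers=tuple(raw.get("applies_to", {}).get("layers", ["repository"])),
        evidence_sources=tuple(
            src["type"] for src in raw.get("evidence", {}).get("sources", [])
        ),
        scoring_method=scoring.get("method", "lexical_semantic_profile"),
        anchors=tuple(scoring.get("anchors", [])),
        negative_anchors=tuple(scoring.get("negative_anchors", [])),
        normalization_mode=raw.get("normalization", {}).get("mode", "per_entity_softmax"),
        layout=LayoutConfig(
            min_score=layout.get("min_score", defaults.min_score),
            max_score=layout.get("max_score", defaults.max_score),
            strength_scale=layout.get("strength_scale", defaults.strength_scale),
            ideal_length_min=ideal.get("min", defaults.ideal_length_min),
            ideal_length_max=ideal.get("max", defaults.ideal_length_max),
        ),
        color=raw.get("presentation", {}).get("color"),
    )
```

Freezing the dataclasses keeps a loaded definition stable for the lifetime of a view, which matches the "serializable and stable" goal.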

## Scoring From SCOPE.md

`SCOPE.md` is a useful first evidence source because it is intentionally short,
repo-owned, and written to explain when a repository is relevant. For repository
attraction, the scorer should use sections such as:

- `One-liner`
- `Core Idea`
- `In Scope`
- `Relevant When`
- `Provided Capabilities`
- `Related / Overlapping Repositories`
- `Terminology`

Sections such as `Out of Scope` and `Not Relevant When` should be used
carefully. They can reduce false positives, but they should not erase a topic
just because the repo mentions a boundary. For example, a repo can say it is
not an authorization engine while still being semantically near security
because it models secrets, policy, or trust boundaries.

The first implementation can use a transparent lexical profile:

1. Parse `SCOPE.md` into sections.
2. Tokenize section text and provided capability keywords.
3. Weight section matches, with `One-liner`, `Core Idea`, `In Scope`, and
   capability keywords carrying more weight than incidental notes.
4. Score each attractor by matching configured anchors and related terms.
5. Normalize scores per entity so one verbose `SCOPE.md` does not dominate.
6. Store the score, confidence, and top evidence snippets in the view payload.
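
The steps above can be sketched in a few dozen lines of Python. Everything numeric here is an assumption made for illustration: the section weights, the `HALF_SATURATION` constant, and the length-based confidence are placeholders, not tuned values. The sketch also swaps in a simple saturating transform instead of the `per_entity_softmax` mode named in the example definition, so that a repository with no matches scores `0.0` on every attractor. It handles single-word anchors only.

```python
import re

# Hypothetical weights: headline sections count more than incidental notes.
SECTION_WEIGHTS = {
    "one-liner": 3.0,
    "core idea": 3.0,
    "in scope": 2.0,
    "provided capabilities": 2.0,
}
DEFAULT_SECTION_WEIGHT = 1.0
HALF_SATURATION = 4.0         # raw weight at which the score reaches 0.5 (assumption)
FULL_CONFIDENCE_TOKENS = 150  # body length treated as "enough evidence" (assumption)


def parse_sections(markdown):
    """Split markdown into {lowercased heading: body text}, in document order."""
    sections = {}
    current = "preamble"
    for line in markdown.splitlines():
        heading = re.match(r"^#{1,6}\s+(.+?)\s*$", line)
        if heading:
            current = heading.group(1).lower()
            sections.setdefault(current, [])
        else:
            sections.setdefault(current, []).append(line)
    return {name: "\n".join(lines) for name, lines in sections.items()}


def tokenize(text):
    """Lowercase alphanumeric tokens; multi-word anchors are not handled here."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_attractors(markdown, attractors):
    """Score one SCOPE.md against {attractor_id: [single-word anchors]}."""
    sections = parse_sections(markdown)
    token_count = sum(len(tokenize(body)) for body in sections.values())
    confidence = min(1.0, token_count / FULL_CONFIDENCE_TOKENS)
    results = {}
    for attractor_id, anchors in attractors.items():
        anchor_set = {anchor.lower() for anchor in anchors}
        raw = 0.0
        evidence = []
        for name, body in sections.items():
            # Distinct matches only, so repeating a term does not inflate the score.
            matched = sorted(set(tokenize(body)) & anchor_set)
            if matched:
                raw += SECTION_WEIGHTS.get(name, DEFAULT_SECTION_WEIGHT) * len(matched)
                evidence.append({"section": name, "terms": matched})
        results[attractor_id] = {
            "score": raw / (raw + HALF_SATURATION),  # saturates below 1.0
            "confidence": confidence,
            "evidence": evidence,
        }
    return results
```

Counting distinct terms per section and saturating the total are two cheap ways to stop a verbose `SCOPE.md` from dominating; a real scorer might prefer length normalization or related-term expansion instead.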

Later implementations can replace or augment lexical scoring with embeddings,
LLM-assisted classification, or operator-reviewed labels. The contract should
not depend on a particular scorer.

## Score Semantics

Attractor scores should be continuous values in `[0, 1]`.

Suggested interpretation:

| Score | Meaning |
|-------|---------|
| `0.00` to `0.10` | no useful evidence of semantic closeness |
| `0.10` to `0.30` | weak signal; useful only as a faint layout hint |
| `0.30` to `0.60` | moderate closeness; entity should visibly lean toward the attractor |
| `0.60` to `0.85` | strong closeness; entity likely belongs near the attractor cluster |
| `0.85` to `1.00` | primary semantic identity or explicit operator label |
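
A renderer that wants a coarse label (for example the `strength` field on a display edge) could bucket scores along the lines of this table. The following is a sketch under two assumptions: lower bounds are treated as inclusive where adjacent rows share a boundary, and the label names other than `strong` are made up for illustration.

```python
def strength_label(score: float) -> str:
    """Map a score in [0, 1] to a coarse band (label names are hypothetical)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be in [0, 1], got {score!r}")
    if score >= 0.85:
        return "primary"
    if score >= 0.60:
        return "strong"
    if score >= 0.30:
        return "moderate"
    if score >= 0.10:
        return "weak"
    return "none"
```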

Every score should carry a confidence separate from closeness. A repo with a
thin or missing `SCOPE.md` may have low confidence even if a few terms match.

Attractors should also support multi-attraction. A repository can be close to
both `development` and `operations`; the layout should then place it between
those poles instead of forcing a single category. This is the main difference
from zones: zones preserve a single-surface invariant (a visible node belongs
to at most one zone in a view), while attractors are allowed to overlap because
they are layout forces, not containers.
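
To build intuition for multi-attraction, consider an idealized case: if the only forces on an entity were zero-rest-length springs to each attractor, with stiffness proportional to the score, the entity would settle at the score-weighted centroid of the poles. The sketch below computes that point. The function name and coordinates are hypothetical, and a real force-directed layout will deviate because of other edges, ideal lengths, and node repulsion.

```python
def resting_point(pole_positions, scores):
    """Score-weighted centroid of the attractors pulling on one entity.

    pole_positions: {attractor_id: (x, y)}
    scores:         {attractor_id: score in [0, 1]}
    Returns None when no attractor pulls on the entity.
    """
    total = sum(scores.values())
    if total == 0:
        return None
    x = sum(pole_positions[a][0] * s for a, s in scores.items()) / total
    y = sum(pole_positions[a][1] * s for a, s in scores.items()) / total
    return (x, y)


poles = {"development": (-100.0, 0.0), "operations": (100.0, 0.0)}
balanced = resting_point(poles, {"development": 0.6, "operations": 0.6})
leaning = resting_point(poles, {"development": 0.75, "operations": 0.25})
```

With equal scores the entity rests midway between the two poles; with a 0.75 / 0.25 split it sits halfway from the midpoint toward `development`.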

## Layout Mapping

Attraction scores become layout hints. They should not become domain edges.

A graph explorer can map scores to synthetic view edges:

```json
{
  "data": {
    "id": "attractor:security->repo:flex-auth",
    "source": "attractor:security",
    "target": "repo:flex-auth",
    "edgeType": "semantic_attraction",
    "displayOnly": true,
    "score": 0.82,
    "confidence": 0.74,
    "strength": "strong",
    "layoutAffinity": 0.82,
    "layoutIdealLength": 110,
    "layoutElasticity": 0.9,
    "sourceReferences": [
      {
        "type": "scope_markdown",
        "path": "SCOPE.md",
        "section": "In Scope"
      }
    ]
  },
  "classes": "semantic-attraction"
}
```

For force-directed layouts:

- stronger scores should increase spring strength or edge weight;
- stronger scores should shorten ideal length;
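- weak scores may be hidden visually while still applying a small force, as
  long as they stay above the layout threshold;
- edges whose score falls below the configured layout threshold (for example
  `layout.min_score` in the definition above) should not affect layout;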
- display-only attraction edges should be excluded from dependency, boundary,
  blast-radius, and zone-connectivity diagnostics.
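
One possible mapping from score to layout hints is plain linear interpolation over the `layout` settings of the attractor definition. The sketch below assumes that mapping; the function name is hypothetical, and the numbers in the example edge above need not follow this exact formula.

```python
def layout_hints(score, *, min_score=0.15, max_score=1.0,
                 strength_scale=0.8, ideal_min=80.0, ideal_max=420.0):
    """Translate an attraction score into hints for a force-directed layout.

    Returns None when the score is below the layout threshold, meaning the
    edge should exert no force at all.
    """
    if max_score <= min_score:
        raise ValueError("max_score must be greater than min_score")
    if score < min_score:
        return None
    # Position of the score inside [min_score, max_score], clamped to 1.0.
    t = (min(score, max_score) - min_score) / (max_score - min_score)
    return {
        # Stronger score -> stronger spring.
        "layoutAffinity": strength_scale * t,
        # Stronger score -> shorter ideal length.
        "layoutIdealLength": ideal_max - t * (ideal_max - ideal_min),
    }
```

A score right at the threshold yields zero affinity and the longest ideal length, so the force fades in smoothly instead of jumping when an entity crosses `min_score`.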

Attractor nodes can be pinned, arranged on a ring, placed by the operator, or
computed from the current profile. For first use, a stable radial placement is
usually enough: place three to eight attractors around the graph, then let
repositories find their balance.
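
A stable radial placement can be as simple as spacing the attractors evenly on a circle in a deterministic order. This is a minimal sketch; the function name, default radius, and the choice to sort by id are assumptions, and it uses screen coordinates where `y` grows downward.

```python
import math


def ring_positions(attractor_ids, radius=600.0, center=(0.0, 0.0)):
    """Place attractors evenly on a ring, starting at the top of the canvas."""
    ordered = sorted(attractor_ids)  # deterministic order gives stable positions
    count = len(ordered)
    positions = {}
    for index, attractor_id in enumerate(ordered):
        angle = -math.pi / 2 + 2 * math.pi * index / count
        positions[attractor_id] = (
            center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle),
        )
    return positions
```

Sorting by id means the same definition set always produces the same map, regardless of the order in which definitions were loaded.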

## View Payload Shape

The graph explorer payload should be able to carry attractor metadata without
changing the canonical Fabric graph.

Recommended top-level view extension:

```json
{
  "view": {
    "attractors": {
      "enabled": true,
      "definitionSet": "repo-concerns-v1",
      "definitions": [
        {
          "id": "security",
          "label": "Security",
          "description": "Identity, authorization, secrets, audit, and policy.",
          "color": "#be123c"
        }
      ],
      "scores": [
        {
          "attractor_id": "security",
          "element_id": "repo:flex-auth",
          "score": 0.82,
          "confidence": 0.74,
          "method": "lexical_semantic_profile",
          "evidence": [
            {
              "source": "SCOPE.md",
              "section": "Core Idea",
              "terms": ["authorization", "policy"]
            }
          ]
        }
      ]
    }
  }
}
```

The renderer may choose to materialize these into synthetic nodes and edges at
runtime. A host may also emit synthetic display-only elements directly if that
is easier for the current engine.
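
Materializing the view extension into synthetic elements might look like the sketch below. It takes the `attractors` block from the payload above and emits display-only nodes and edges in the same shape as the earlier edge example. The function name, the `semantic-attractor` node class, and the default threshold are assumptions.

```python
def materialize_attractors(attractors_block, min_score=0.15):
    """Turn view.attractors into synthetic display-only nodes and edges."""
    if not attractors_block.get("enabled"):
        return []
    elements = []
    known_ids = set()
    for definition in attractors_block.get("definitions", []):
        known_ids.add(definition["id"])
        elements.append({
            "data": {
                "id": f"attractor:{definition['id']}",
                "label": definition.get("label", definition["id"]),
                "displayOnly": True,
            },
            "classes": "semantic-attractor",  # hypothetical class name
        })
    for entry in attractors_block.get("scores", []):
        # Skip scores for unknown attractors or below the layout threshold.
        if entry["attractor_id"] not in known_ids or entry["score"] < min_score:
            continue
        source = f"attractor:{entry['attractor_id']}"
        elements.append({
            "data": {
                "id": f"{source}->{entry['element_id']}",
                "source": source,
                "target": entry["element_id"],
                "edgeType": "semantic_attraction",
                "displayOnly": True,
                "score": entry["score"],
                "confidence": entry.get("confidence"),
            },
            "classes": "semantic-attraction",
        })
    return elements
```

Because every emitted element carries `displayOnly: true`, downstream diagnostics can filter them out without knowing anything about attractors.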

## Operator Workflow

A useful attractor workflow should feel like mapmaking:

1. Choose a preset such as `Security / Development / Operations`.
2. Review the generated scores and evidence for a few known repos.
3. Hide or pin attractors that are not useful for the current question.
4. Save the attractor definition set in the graph profile.
5. Use the resulting layout to discover ambiguous, central, or misplaced repos.

The UI should expose:

- a toggle for semantic attractors;
- a definition-set selector;
- score threshold controls;
- optional visual attraction edges;
- pinned/unpinned attractor placement;
- detail panels explaining why a repo is close to an attractor;
- diagnostics for missing evidence, low confidence, and overly broad
  attractors.
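
The three diagnostics in the last item can be derived directly from the `scores` list in the view payload. The sketch below shows one way to do that; the function name, the finding `kind` values, and all thresholds are hypothetical. Here "overly broad" means an attractor pulls a large share of all eligible elements at or above a moderate score.

```python
def attractor_diagnostics(element_ids, scores, *, low_confidence=0.4,
                          broad_score=0.3, broad_share=0.8):
    """Return findings for missing evidence, low confidence, and broad attractors."""
    findings = []
    all_elements = set(element_ids)

    # Elements that received no score at all have no usable evidence.
    scored = {entry["element_id"] for entry in scores}
    for element_id in sorted(all_elements - scored):
        findings.append({"kind": "missing_evidence", "element_id": element_id})

    pulled_by = {}
    for entry in scores:
        if entry.get("confidence", 0.0) < low_confidence:
            findings.append({
                "kind": "low_confidence",
                "element_id": entry["element_id"],
                "attractor_id": entry["attractor_id"],
            })
        if entry["score"] >= broad_score:
            pulled_by.setdefault(entry["attractor_id"], set()).add(entry["element_id"])

    # An attractor that pulls nearly everything does little to separate repos.
    for attractor_id, pulled in sorted(pulled_by.items()):
        if all_elements and len(pulled) / len(all_elements) >= broad_share:
            findings.append({"kind": "overly_broad", "attractor_id": attractor_id})
    return findings
```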

## Relationship To Zones

Zones and attractors solve different orientation problems.

Zones are bounded drawing surfaces. A visible node belongs to zero or one zone
in a given view. They are useful for deployment environments, access zones,
ownership surfaces, and other container-like questions.

Attractors are semantic force points. A visible node can be pulled by multiple
attractors at once. They are useful for topical orientation, concern mapping,
and discovering conceptual neighborhoods.

The two concepts can combine cleanly:

- zones can show where entities run;
- attractors can pull repos inside or outside those zones based on semantic
  concern;
- zone diagnostics should ignore semantic attraction edges unless explicitly
  configured otherwise;
- attractor scores can be summarized inside zone details.
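
Keeping zone diagnostics blind to attraction edges can be a single filtering step in front of the diagnostic code. This is a sketch that assumes elements follow the `data` / `edgeType` shape of the display edge shown earlier; the function name and the opt-in flag are hypothetical.

```python
def edges_for_zone_diagnostics(elements, include_semantic_attraction=False):
    """Select the edges that zone diagnostics should consider."""
    kept = []
    for element in elements:
        data = element.get("data", {})
        if "source" not in data or "target" not in data:
            continue  # nodes are not edges
        is_attraction = data.get("edgeType") == "semantic_attraction"
        if is_attraction and not include_semantic_attraction:
            continue  # ignored unless explicitly configured otherwise
        kept.append(element)
    return kept
```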

## Initial Presets

A first repository-orientation preset should keep the set small:

| Attractor | Topic Signal |
|-----------|--------------|
| `security` | identity, secrets, authorization, policy, audit, MFA, trust boundaries |
| `development` | source code, build, CI/CD, package publishing, scaffolding, developer workflows |
| `operations` | deployment, runtime, monitoring, backups, incidents, infrastructure lifecycle |

Useful follow-up presets:

- `data`, `identity`, `delivery`, `governance`
- `platform`, `application`, `tooling`
- `financial`, `runtime`, `coordination`

Attractors should start as operator-chosen presets rather than global truth.
The same repository can be viewed through different conceptual lenses.

## Implementation Path

The concept can be implemented incrementally:

1. Add an attractor definition format for graph explorer profiles.
2. Parse repo `SCOPE.md` files during registry sync or graph export.
3. Compute transparent lexical scores for repositories.
4. Include attractor scores and evidence in the graph explorer payload.
5. Add synthetic attractor nodes and display-only attraction edges in the UI.
6. Map attraction scores to layout hints for the force-directed layout.
7. Add detail-panel evidence and low-confidence diagnostics.
8. Support saved attractor presets and operator score overrides.

This keeps attractors as a view concern until the scoring model proves useful.
If a semantic relation becomes durable domain knowledge, it can later be
promoted into a proper Fabric declaration with separate evidence and review.

## Open Questions

- Should attractor definitions live in graph profiles, repo config, or a shared
  registry preset file?
- Should scoring run during registry sync, export, or entirely in the browser?
- How much operator override should be allowed before scores become maintained
  labels rather than computed evidence?
- What is the right default for missing or stale `SCOPE.md` evidence?
- Should the first implementation use only lexical scoring, or should it also
  prepare a pluggable embedding scorer interface?