# Repo Classification Standard
**Status:** Draft v1.0

**Scope:** Helix Forge-based and connected information and code repositories

**Purpose:** Provide a simple, stable, and practical classification model for clustering repositories by work category, intended market domain, capabilities, and business responsibility.

---
## 1. Intent
The Repo Classification Standard defines a compact metadata model for organizing repositories across exploratory, research, product, platform, and business work.
It is intended to support:
- repo discovery and clustering,
- product portfolio navigation,
- capability registry views,
- strategic planning,
- agentic coding workflows,
- product and business maturity reviews,
- prioritization across Coulomb, Helix Forge, Coulomb Social, and related efforts.
The standard separates four concerns that are often mixed together:
1. **Category** — what kind of work is this repo?
2. **Domain** — who is this primarily for?
3. **Capabilities** — what does this repo do or enable?
4. **Business stake** — which business responsibilities does this repo affect or support?
---
## 2. Core Classification Schema
Every classified repository SHOULD include the following metadata block.
```yaml
repo_classification:
  category: project
  domain: infotech
  secondary_domains: []
  capability_tags: []
  business_stake: []
  business_mechanics: []
```
A fuller example:
```yaml
repo_classification:
  category: product
  domain: communication
  secondary_domains:
    - financials
    - agents
  capability_tags:
    - social-network
    - marketplace
    - challenges
    - reputation
    - collaboration
  business_stake:
    - product
    - experience
    - sales
    - technology
    - automation
    - intelligence
  business_mechanics:
    - intention
    - coordination
    - operation
    - adaptation
```
---
## 3. Field Overview
| Field | Required | Cardinality | Purpose |
|---|---:|---:|---|
| `category` | yes | exactly 1 | Work mode, maturity, or organizational purpose of the repo. |
| `domain` | yes | exactly 1 | Primary intended customer, user, or market domain. |
| `secondary_domains` | no | 0..n | Other domains strongly affected or served. |
| `capability_tags` | no | 0..n | Functional or architectural capabilities provided by the repo. |
| `business_stake` | no | 0..n | Business responsibility areas that care about, sponsor, operate, or benefit from the repo. |
| `business_mechanics` | no | 0..n | Viable-business mechanics supported by the repo. |
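
The field table above can be mirrored in a small typed record. The following is a minimal Python sketch and not part of the standard: the class name `RepoClassification` and the method `to_metadata` are hypothetical names chosen for illustration. The two required fields have no default, and the four optional `0..n` fields default to empty lists.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class RepoClassification:
    """Illustrative container mirroring the field table (hypothetical name)."""

    # Required, exactly one value each.
    category: str
    domain: str
    # Optional, 0..n values each; default to empty lists.
    secondary_domains: list[str] = field(default_factory=list)
    capability_tags: list[str] = field(default_factory=list)
    business_stake: list[str] = field(default_factory=list)
    business_mechanics: list[str] = field(default_factory=list)

    def to_metadata(self) -> dict:
        """Return the nested mapping used by the metadata block."""
        return {"repo_classification": asdict(self)}


# Equivalent to the minimal metadata block in section 2.
example = RepoClassification(category="project", domain="infotech")
```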
---
## 4. Classification Principles
### 4.1 Use one primary category
Each repo MUST have exactly one `category`.
The category answers:

> What kind of work is this repo right now?

A repo may evolve from one category to another over time. For example, an `experimental` repo may become a `project`, then a `product`.
### 4.2 Use one primary domain
Each repo MUST have exactly one `domain`.
The domain answers:

> Who is this primarily for?

Classify by intended users, customers, or market context, not by internal implementation detail.
Example:
```yaml
# AI tool for hospitals
category: product
domain: health
secondary_domains:
- agents
```
The implementation uses AI, but the primary domain is `health` because the intended users are in healthcare.
### 4.3 Use secondary domains sparingly
Use `secondary_domains` only when the repo has a meaningful second market/user context.
Do not add every technically related domain. The list should improve navigation, not become noise.
Recommended maximum: **3 secondary domains**.
### 4.4 Use capability tags for what the repo does
Concepts such as `identity`, `knowledge`, `citations`, `platform`, `governance`, `marketplace`, and `coordination` describe what a repo does, so they SHOULD usually be capability tags, not domains.
### 4.5 Use business stake for organizational relevance
The `business_stake` field identifies which business perspectives are materially involved.
It should answer:
> Which business responsibility areas need to understand, fund, use, operate, govern, sell, or improve this repo?
---
## 5. Categories
Allowed values:
```yaml
category:
- experimental
- research
- project
- product
- business
```
### 5.1 `experimental`
Use for spikes, prototypes, playgrounds, speculative implementations, or early technical experiments.
Main question:

> Can this work?

Typical signs:
- unclear scope,
- unstable architecture,
- created to test an idea,
- may be abandoned without consequence,
- not yet intended for serious reuse.
Examples:
```yaml
category: experimental
```
### 5.2 `research`
Use for knowledge work, standards, canons, terminology, market research, technical exploration, architectural inquiry, and decision support.
Main question:

> What do we need to understand?

Typical signs:
- produces documents, taxonomies, standards, evaluations, or reference material,
- may seed later products or projects,
- not primarily a deployable product,
- often contains `INTENT.md`, research notes, standards, catalogs, or decision records.
Examples:
```yaml
category: research
```
### 5.3 `project`
Use for concrete implementation work with a bounded goal, but not yet productized as a repeatable offering.
Main question:

> Can we build this?

Typical signs:
- defined implementation goal,
- specific deliverables,
- may be internal or external,
- may become a product later,
- may depend on active development milestones.
Examples:
```yaml
category: project
```
### 5.4 `product`
Use for reusable, offerable products or product components with intended users or customers.
Main question:

> Can users or customers use and value this repeatedly?

Typical signs:
- stable user or customer intent,
- repeatable value proposition,
- product requirements exist or are emerging,
- may have pricing, onboarding, support, releases, documentation, UX, or roadmap concerns,
- intended to be used beyond a one-off project.
Examples:
```yaml
category: product
```
### 5.5 `business`
Use for repos organizing commercial, operational, legal, strategic, financial, or organizational activity.
Main question:

> Can this become or support a viable business?

Typical signs:
- company setup,
- business model design,
- legal/compliance work,
- finance planning,
- sales enablement,
- operating model,
- strategy,
- partner management,
- product portfolio control.
Examples:
```yaml
category: business
```
---
## 6. Domains
Allowed values:
```yaml
domain:
- infotech
- financials
- communication
- consumer
- health
- industrials
- energy
- utilities
- materials
- realestate
- crypto
- agents
- space
- government
```
### 6.1 Domain Selection Rule
Choose the primary domain by answering this question:

> If this repo becomes successful, which customer/user sector primarily benefits from it?

Do not classify by technology stack unless the technology stack itself is the product.
Examples:
```yaml
# Generic developer platform
domain: infotech

# AI-based feed editor for general users
domain: communication
secondary_domains:
- agents

# AI orchestration infrastructure for autonomous software work
domain: agents
secondary_domains:
- infotech

# Blockchain registry for property ownership
domain: realestate
secondary_domains:
- crypto
- financials
```
### 6.2 Domain Definitions
| Domain | Use for |
|---|---|
| `infotech` | Software, IT infrastructure, developer tools, identity, security, data, knowledge systems, platform engineering, DevOps, architecture tooling. |
| `financials` | Banking, accounting, payments, investing, insurance, pricing, monetization, bounties, economic control systems. |
| `communication` | Social platforms, messaging, publishing, content feeds, collaboration, marketing, content exchange, human coordination. |
| `consumer` | Personal tools, lifestyle, home, family, education, entertainment, non-specialized B2C services. |
| `health` | Healthcare, elderly care, medical workflows, wellbeing, care coordination, health-adjacent support systems. |
| `industrials` | Manufacturing, logistics, transport, rail, physical production, machinery, supply chains, industrial operations. |
| `energy` | Energy generation, storage, energy markets, fuels, batteries, grid-related energy products. |
| `utilities` | Water, waste, public utilities, municipal infrastructure, regulated local service systems. |
| `materials` | Chemicals, raw materials, advanced materials, mining, recycling, material science. |
| `realestate` | Housing, property ownership, rental models, facilities, buildings, property operations. |
| `crypto` | Blockchain, tokens, smart contracts, decentralized protocols, crypto-native economics. |
| `agents` | AI agents, LLM orchestration, autonomous workflows, model routing, agentic collaboration, AI-native systems. |
| `space` | Space industry, satellites, orbital systems, launch, space logistics, space simulation. |
| `government` | B2G, civic infrastructure, public administration, procurement, citizen services, regulation-facing systems. |
---
## 7. Capability Tags
The `capability_tags` field captures what the repo does or enables.
Capability tags are intentionally open-ended, but SHOULD follow these rules:
- use lowercase kebab-case,
- prefer nouns or noun phrases,
- avoid synonyms when a canonical tag exists,
- avoid business sectors already covered by `domain`,
- use tags to improve search, filtering, and clustering.
Examples:
```yaml
capability_tags:
- identity
- access-control
- governance
- knowledge
- citations
- evidence
- platform
- marketplace
- procurement
- coordination
- operation
- adaptation
- capability-registry
- feature-control
- pricing
- reputation
- social-network
- workflow
- orchestration
- automation
- observability
- security
- policy
- compliance
- product-development
```
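Of the rules above, only the lowercase kebab-case rule can be checked mechanically; the others need human judgment. A minimal Python sketch of that one check follows. The helper names `is_kebab_case` and `non_conforming` are hypothetical, and allowing digits inside a tag is an assumption the standard does not state.

```python
import re

# Lowercase words joined by single hyphens.
# Assumption: digits are allowed inside a word; the standard does not say.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_kebab_case(tag: str) -> bool:
    """Return True if the whole tag is lowercase kebab-case."""
    return KEBAB_CASE.fullmatch(tag) is not None


def non_conforming(tags: list[str]) -> list[str]:
    """Return the tags that break the kebab-case rule, in input order."""
    return [tag for tag in tags if not is_kebab_case(tag)]
```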
### 7.1 Recommended Capability Families
Capability tags may be grouped into families for navigation.
```yaml
capability_families:
  identity_and_access:
    - identity
    - authentication
    - authorization
    - access-control
    - user-management
    - tenancy
  knowledge_and_evidence:
    - knowledge
    - citations
    - evidence
    - source-management
    - traceability
    - documentation
  platform_and_operations:
    - platform
    - deployment
    - operations
    - observability
    - feature-control
    - configuration
    - orchestration
  market_and_coordination:
    - marketplace
    - pricing
    - reputation
    - challenges
    - bounties
    - collaboration
    - coordination
  governance_and_control:
    - governance
    - policy
    - compliance
    - risk
    - audit
    - control
```
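For navigation, the family listing can be inverted so that each tag points back to its family. The sketch below is illustrative only: the function `group_by_family` and the `unassigned` bucket for tags outside every family are assumptions, not part of the standard.

```python
# Families and tags copied from the listing above.
CAPABILITY_FAMILIES = {
    "identity_and_access": [
        "identity", "authentication", "authorization",
        "access-control", "user-management", "tenancy",
    ],
    "knowledge_and_evidence": [
        "knowledge", "citations", "evidence",
        "source-management", "traceability", "documentation",
    ],
    "platform_and_operations": [
        "platform", "deployment", "operations", "observability",
        "feature-control", "configuration", "orchestration",
    ],
    "market_and_coordination": [
        "marketplace", "pricing", "reputation", "challenges",
        "bounties", "collaboration", "coordination",
    ],
    "governance_and_control": [
        "governance", "policy", "compliance", "risk", "audit", "control",
    ],
}

# Reverse index: tag -> family. Each listed tag belongs to exactly one family.
TAG_TO_FAMILY = {
    tag: family
    for family, tags in CAPABILITY_FAMILIES.items()
    for tag in tags
}


def group_by_family(tags: list[str]) -> dict[str, list[str]]:
    """Group a repo's tags by family; unknown tags go to 'unassigned'."""
    grouped: dict[str, list[str]] = {}
    for tag in tags:
        family = TAG_TO_FAMILY.get(tag, "unassigned")  # hypothetical bucket name
        grouped.setdefault(family, []).append(tag)
    return grouped
```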
---
## 8. Business Stake
The `business_stake` field replaces the older `stakeholder_tags` concept.
Allowed values:
```yaml
business_stake:
- execution
- intelligence
- finance
- legal
- sales
- experience
- technology
- operations
- product
- people
- procurement
- sustainability
- automation
```
### 8.1 Business Stake Definitions
| Business stake | Meaning |
|---|---|
| `execution` | Delivery, follow-through, implementation, project progress, getting things done. |
| `intelligence` | Research, analysis, sensing, evidence, decision support, strategic insight. |
| `finance` | Revenue, cost, pricing, accounting, investment, financial stability, controlling. |
| `legal` | Legal structure, contracts, compliance, regulation, liability, governance obligations. |
| `sales` | Customer acquisition, pipeline, partner management, commercial offer communication. |
| `experience` | User experience, customer experience, service experience, adoption, usability. |
| `technology` | Architecture, software engineering, infrastructure, security, data, technical quality. |
| `operations` | Running services, process reliability, quality management, support, service delivery. |
| `product` | Product strategy, requirements, roadmap, positioning, value proposition, lifecycle. |
| `people` | Human resources, teams, roles, skills, contributors, community participants. |
| `procurement` | Suppliers, purchasing, sourcing, vendor relationships, supply chain. |
| `sustainability` | Ecological, social, and long-term responsibility, CSR, resource impact. |
| `automation` | Agentic work, process automation, AI-assisted execution, workflow acceleration. |
### 8.2 Business Stake Usage Rule
Use `business_stake` when a business area is materially relevant.
Do not add all values by default. A good classification usually has **2 to 6 business stake values**.
Examples:
```yaml
# A developer platform repo
business_stake:
- technology
- product
- operations
- automation

# A pricing product repo
business_stake:
- finance
- product
- sales
- intelligence

# A legal/company-setup repo
business_stake:
- legal
- finance
- execution
```
---
## 9. Business Mechanics
The `business_mechanics` field is optional. It captures the viable-business mechanics supported by the repo.
Allowed values:
```yaml
business_mechanics:
- intention
- control
- coordination
- operation
- adaptation
```
### 9.1 Business Mechanics Definitions
| Mechanic | Meaning |
|---|---|
| `intention` | Defines purpose, direction, offers, goals, or strategic intent. |
| `control` | Provides steering, governance, rules, constraints, thresholds, or decision authority. |
| `coordination` | Aligns actors, tasks, commitments, dependencies, communication, or collaboration. |
| `operation` | Supports repeated execution, delivery, runtime activity, service management, or production. |
| `adaptation` | Supports learning, feedback, change, sensing, improvement, or evolution. |
Use this field when the repo contributes to business viability beyond its technical capability.

---
## 10. Recommended Repo Metadata File
Each repo SHOULD contain a file named:
```text
.repo-classification.yaml
```
Recommended minimum content:
```yaml
repo_classification:
  category: project
  domain: infotech
  secondary_domains: []
  capability_tags: []
  business_stake: []
  business_mechanics: []
```
Recommended enriched content:
```yaml
repo_classification:
  standard: Repo Classification Standard
  version: "1.0"  # quoted so YAML parsers keep it as a string, not a float
  classified_at: 2026-06-22
  classified_by: human
  category: product
  domain: infotech
  secondary_domains:
    - agents
  capability_tags:
    - platform
    - capability-registry
    - coordination
    - product-development
  business_stake:
    - product
    - technology
    - execution
    - automation
    - intelligence
  business_mechanics:
    - intention
    - coordination
    - operation
    - adaptation
  notes: >
    Primary classification is based on intended users/customers, not implementation details.
```
---
## 11. Classification Decision Procedure
Use this procedure when classifying a repo manually or with an agent.
### Step 1: Identify the current category
Ask:
1. Is this mainly a spike or prototype? → `experimental`
2. Is this mainly knowledge, terminology, research, or standards? → `research`
3. Is this a bounded implementation effort? → `project`
4. Is this reusable and offerable to users/customers? → `product`
5. Is this about commercial, legal, financial, strategic, or organizational viability? → `business`
### Step 2: Identify the primary domain
Ask:
1. Who primarily benefits if this succeeds?
2. Who would pay, adopt, regulate, use, or depend on it?
3. Which market/user sector best describes that group?
Choose exactly one primary `domain`.
### Step 3: Add secondary domains
Ask:
1. Are there other strongly relevant markets?
2. Would omitting them make the repo hard to find?
3. Are they user/customer domains rather than implementation details?
Add only the strongest secondary domains.
### Step 4: Add capability tags
Ask:
1. What does this repo enable?
2. Which reusable capabilities does it provide?
3. Which architectural or functional areas does it belong to?
Use lowercase kebab-case.
### Step 5: Add business stake
Ask:
1. Which business areas need this repo?
2. Which business areas are affected by its success or failure?
3. Which business areas should review or govern it?
Add 2 to 6 values where possible.
### Step 6: Add business mechanics
Ask:
1. Does this repo define intention?
2. Does it provide control?
3. Does it coordinate actors or work?
4. Does it support operation?
5. Does it enable adaptation?
Add only the mechanics that materially apply.

---
## 12. Validation Checklist
A repo classification is valid when:
- [ ] `category` exists and has exactly one allowed value.
- [ ] `domain` exists and has exactly one allowed value.
- [ ] `secondary_domains` contains only allowed domain values.
- [ ] `secondary_domains` does not repeat the primary `domain`.
- [ ] `capability_tags` use lowercase kebab-case.
- [ ] `business_stake` contains only allowed values.
- [ ] `business_mechanics` contains only allowed values.
- [ ] The primary domain is based on intended users/customers, not implementation detail.
- [ ] The classification helps discovery and does not create unnecessary noise.
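
The first seven items of the checklist are mechanical and can be automated; the last two need human or agent judgment. The following Python sketch covers only the mechanical items. It assumes the input is the mapping found under `repo_classification`, already parsed from YAML. The function name `validate` and the message wording are hypothetical, and allowing digits in kebab-case tags is an assumption.

```python
import re

ALLOWED_CATEGORIES = {"experimental", "research", "project", "product", "business"}
ALLOWED_DOMAINS = {
    "infotech", "financials", "communication", "consumer", "health",
    "industrials", "energy", "utilities", "materials", "realestate",
    "crypto", "agents", "space", "government",
}
ALLOWED_STAKES = {
    "execution", "intelligence", "finance", "legal", "sales", "experience",
    "technology", "operations", "product", "people", "procurement",
    "sustainability", "automation",
}
ALLOWED_MECHANICS = {"intention", "control", "coordination", "operation", "adaptation"}
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def _is_allowed(value, allowed) -> bool:
    # A list or None here means the field is missing or not a single value.
    return isinstance(value, str) and value in allowed


def validate(classification: dict) -> list[str]:
    """Return a list of problems; an empty list means the mechanical checks pass."""
    problems = []
    domain = classification.get("domain")
    secondary = classification.get("secondary_domains") or []

    if not _is_allowed(classification.get("category"), ALLOWED_CATEGORIES):
        problems.append("category: missing or not an allowed value")
    if not _is_allowed(domain, ALLOWED_DOMAINS):
        problems.append("domain: missing or not an allowed value")
    for value in secondary:
        if not _is_allowed(value, ALLOWED_DOMAINS):
            problems.append(f"secondary_domains: {value!r} is not allowed")
    if domain in secondary:
        problems.append("secondary_domains: repeats the primary domain")
    for tag in classification.get("capability_tags") or []:
        if not (isinstance(tag, str) and KEBAB_CASE.fullmatch(tag)):
            problems.append(f"capability_tags: {tag!r} is not lowercase kebab-case")
    for field_name, allowed in (
        ("business_stake", ALLOWED_STAKES),
        ("business_mechanics", ALLOWED_MECHANICS),
    ):
        for value in classification.get(field_name) or []:
            if not _is_allowed(value, allowed):
                problems.append(f"{field_name}: {value!r} is not allowed")
    return problems
```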
---
## 13. Example Classifications
### 13.1 Helix Forge
```yaml
repo_classification:
  category: product
  domain: infotech
  secondary_domains:
    - agents
  capability_tags:
    - platform
    - capability-registry
    - coordination
    - knowledge
    - product-development
  business_stake:
    - product
    - technology
    - execution
    - automation
    - intelligence
  business_mechanics:
    - intention
    - coordination
    - operation
    - adaptation
```
### 13.2 Coulomb Social
```yaml
repo_classification:
  category: product
  domain: communication
  secondary_domains:
    - financials
    - agents
  capability_tags:
    - social-network
    - marketplace
    - challenges
    - reputation
    - collaboration
  business_stake:
    - product
    - experience
    - sales
    - technology
    - automation
    - intelligence
  business_mechanics:
    - intention
    - coordination
    - operation
    - adaptation
```
### 13.3 Identity Canon
```yaml
repo_classification:
  category: research
  domain: infotech
  secondary_domains:
    - government
  capability_tags:
    - identity
    - access-control
    - terminology
    - canon
    - governance
  business_stake:
    - technology
    - legal
    - operations
    - intelligence
  business_mechanics:
    - intention
    - control
    - adaptation
```
### 13.4 NetKingdom
```yaml
repo_classification:
  category: product
  domain: infotech
  secondary_domains: []
  capability_tags:
    - security
    - identity
    - platform
    - operations
    - access-control
  business_stake:
    - technology
    - operations
    - legal
    - automation
  business_mechanics:
    - control
    - operation
    - adaptation
```
### 13.5 Citation Evidence
```yaml
repo_classification:
  category: product
  domain: infotech
  secondary_domains:
    - communication
    - government
  capability_tags:
    - citations
    - evidence
    - knowledge
    - traceability
    - source-management
  business_stake:
    - intelligence
    - legal
    - product
    - technology
  business_mechanics:
    - control
    - coordination
    - adaptation
```
### 13.6 Adaptive Pricing
```yaml
repo_classification:
  category: product
  domain: financials
  secondary_domains:
    - infotech
    - agents
  capability_tags:
    - pricing
    - monetization
    - lifecycle
    - decision-support
    - product-development
  business_stake:
    - finance
    - product
    - sales
    - intelligence
    - automation
  business_mechanics:
    - intention
    - control
    - adaptation
```
### 13.7 Reuse Surface
```yaml
repo_classification:
  category: product
  domain: infotech
  secondary_domains:
    - agents
  capability_tags:
    - capability-registry
    - discovery
    - reuse
    - maturity
    - evidence
  business_stake:
    - technology
    - product
    - intelligence
    - automation
  business_mechanics:
    - intention
    - control
    - adaptation
```
### 13.8 Family Home
```yaml
repo_classification:
  category: business
  domain: realestate
  secondary_domains:
    - financials
  capability_tags:
    - rental-to-own
    - ownership
    - legal-structure
    - housing
  business_stake:
    - finance
    - legal
    - sales
    - operations
    - sustainability
  business_mechanics:
    - intention
    - control
    - operation
    - adaptation
```
### 13.9 Hallo Oma
```yaml
repo_classification:
  category: product
  domain: health
  secondary_domains:
    - communication
    - consumer
  capability_tags:
    - elderly-care
    - video-calling
    - family-support
    - emergency-checkin
  business_stake:
    - product
    - experience
    - operations
    - technology
    - sustainability
  business_mechanics:
    - coordination
    - operation
    - adaptation
```
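To illustrate the clustering use case named in section 1, the nine examples above can be grouped by primary domain or by category. This is a minimal sketch: the tuple layout and the helper `cluster` are hypothetical, and the data is copied from the `category` and `domain` values of the examples in this section.

```python
from collections import defaultdict

# (name, category, primary domain) from examples 13.1 to 13.9.
EXAMPLES = [
    ("Helix Forge", "product", "infotech"),
    ("Coulomb Social", "product", "communication"),
    ("Identity Canon", "research", "infotech"),
    ("NetKingdom", "product", "infotech"),
    ("Citation Evidence", "product", "infotech"),
    ("Adaptive Pricing", "product", "financials"),
    ("Reuse Surface", "product", "infotech"),
    ("Family Home", "business", "realestate"),
    ("Hallo Oma", "product", "health"),
]


def cluster(repos, key_index: int) -> dict[str, list[str]]:
    """Group repo names by the tuple element at key_index (1 or 2)."""
    clusters = defaultdict(list)
    for repo in repos:
        clusters[repo[key_index]].append(repo[0])
    return dict(clusters)


by_category = cluster(EXAMPLES, 1)
by_domain = cluster(EXAMPLES, 2)
```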
---
## 14. Anti-Patterns
### 14.1 Mixing category and domain
Do not use `research` as a domain or `health` as a category.
Bad:
```yaml
category: health
```
Good:
```yaml
category: research
domain: health
```
### 14.2 Classifying by implementation detail
Bad:
```yaml
# A healthcare scheduling AI
domain: agents
```
Good:
```yaml
domain: health
secondary_domains:
- agents
```
### 14.3 Overusing secondary domains
Bad:
```yaml
secondary_domains:
- infotech
- financials
- communication
- consumer
- agents
- government
```
Good:
```yaml
secondary_domains:
- agents
- government
```
### 14.4 Using vague capability tags
Bad:
```yaml
capability_tags:
- stuff
- misc
- tool
- important
```
Good:
```yaml
capability_tags:
- identity
- access-control
- audit
- policy
```
---
## 15. Migration Notes from Older Statehub / Coulomb Perspectives
Older labels such as `Identity`, `Knowledge`, `Citations`, `Capabilities`, `Governance`, `Platform`, `Communication`, and `Experimental` mixed several different classification concerns.
Recommended migration:
| Old perspective | New placement |
|---|---|
| `Experimental` | `category: experimental` |
| `Identity` | `capability_tags: [identity]` and usually `domain: infotech` |
| `Knowledge` | `capability_tags: [knowledge]` |
| `Citations` | `capability_tags: [citations, evidence]` |
| `Capabilities` | `capability_tags: [capability-registry]` or related tags |
| `Governance` | `capability_tags: [governance]`, often `business_stake: [legal, operations, technology]` |
| `Platform` | `capability_tags: [platform]`, often `domain: infotech` |
| `Communication` | `domain: communication` if it describes the intended market; otherwise `capability_tags: [communication]` |
| `Civic` | usually `domain: government` |
| `Procurement` | `business_stake: [procurement]`, and sometimes `capability_tags: [procurement]` |
| `Marketplace` | `capability_tags: [marketplace]`, often `domain: financials` or `communication` depending on user/customer context |
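
The unconditional parts of the migration table can be applied mechanically, leaving the "usually" and "often" hints for review. The sketch below is one possible reading, not a prescribed tool: the names `MIGRATION` and `migrate` are hypothetical, and `Communication` and `Civic` are deliberately left out because their placement depends on market context.

```python
# Firm placements only; context-dependent hints are left to a reviewer.
MIGRATION = {
    "Experimental": {"category": "experimental"},
    "Identity": {"capability_tags": ["identity"]},
    "Knowledge": {"capability_tags": ["knowledge"]},
    "Citations": {"capability_tags": ["citations", "evidence"]},
    "Capabilities": {"capability_tags": ["capability-registry"]},
    "Governance": {"capability_tags": ["governance"]},
    "Platform": {"capability_tags": ["platform"]},
    "Procurement": {"business_stake": ["procurement"]},
    "Marketplace": {"capability_tags": ["marketplace"]},
}


def migrate(old_labels: list[str]) -> tuple[dict, list[str]]:
    """Return (suggested partial classification, labels needing manual review)."""
    suggestion: dict = {}
    unresolved: list[str] = []
    for label in old_labels:
        placement = MIGRATION.get(label)
        if placement is None:
            unresolved.append(label)
            continue
        for field_name, value in placement.items():
            if isinstance(value, list):
                # Merge list fields without duplicating values.
                merged = suggestion.setdefault(field_name, [])
                merged.extend(v for v in value if v not in merged)
            else:
                suggestion[field_name] = value
    return suggestion, unresolved
```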
---
## 16. Agent Prompt for Repo Classification
Use the following prompt for agent-assisted classification.
```text
Classify this repository according to the Repo Classification Standard.
Return a YAML block with:
- category: one of experimental, research, project, product, business
- domain: one of infotech, financials, communication, consumer, health, industrials, energy, utilities, materials, realestate, crypto, agents, space, government
- secondary_domains: zero or more allowed domains, excluding the primary domain
- capability_tags: lowercase kebab-case tags describing what the repo does or enables
- business_stake: zero or more of execution, intelligence, finance, legal, sales, experience, technology, operations, product, people, procurement, sustainability, automation
- business_mechanics: zero or more of intention, control, coordination, operation, adaptation
Classify by intended users/customers rather than implementation detail.
Use secondary domains sparingly.
Provide a brief rationale after the YAML.
```
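If the allowed values change in a later version, a hand-copied prompt can drift out of sync. One way to avoid that is to build the prompt from the allowed-value lists. The sketch below is an assumption about tooling, not part of the standard; `ALLOWED` and `build_prompt` are hypothetical names, and the wording mirrors the prompt above.

```python
# Allowed values copied from sections 5, 6, 8, and 9.
ALLOWED = {
    "category": ["experimental", "research", "project", "product", "business"],
    "domain": [
        "infotech", "financials", "communication", "consumer", "health",
        "industrials", "energy", "utilities", "materials", "realestate",
        "crypto", "agents", "space", "government",
    ],
    "business_stake": [
        "execution", "intelligence", "finance", "legal", "sales", "experience",
        "technology", "operations", "product", "people", "procurement",
        "sustainability", "automation",
    ],
    "business_mechanics": [
        "intention", "control", "coordination", "operation", "adaptation",
    ],
}


def build_prompt() -> str:
    """Assemble the classification prompt from the allowed-value lists."""
    lines = [
        "Classify this repository according to the Repo Classification Standard.",
        "Return a YAML block with:",
        f"- category: one of {', '.join(ALLOWED['category'])}",
        f"- domain: one of {', '.join(ALLOWED['domain'])}",
        "- secondary_domains: zero or more allowed domains, excluding the primary domain",
        "- capability_tags: lowercase kebab-case tags describing what the repo does or enables",
        f"- business_stake: zero or more of {', '.join(ALLOWED['business_stake'])}",
        f"- business_mechanics: zero or more of {', '.join(ALLOWED['business_mechanics'])}",
        "Classify by intended users/customers rather than implementation detail.",
        "Use secondary domains sparingly.",
        "Provide a brief rationale after the YAML.",
    ]
    return "\n".join(lines)
```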
---
## 17. Versioning
This standard SHOULD be versioned semantically.
- Patch version: wording clarifications, examples, typo fixes.
- Minor version: new optional fields, new examples, non-breaking guidance.
- Major version: renamed fields, changed allowed values, changed classification semantics.
Current version:
```yaml
standard: Repo Classification Standard
version: "1.0"  # quoted so YAML parsers keep it as a string, not a float
```
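The three bump levels can be sketched as a small helper. This is illustrative only: the function name `bump` is hypothetical, and padding the two-part version `1.0` to `1.0.0` is an assumption, since the standard currently writes the version with two components.

```python
def bump(version: str, change: str) -> str:
    """Return the next version for a 'patch', 'minor', or 'major' change."""
    parts = [int(p) for p in version.split(".")]
    parts += [0] * (3 - len(parts))  # assumption: "1.0" is read as 1.0.0
    major, minor, patch = parts[:3]
    if change == "major":  # renamed fields, changed allowed values or semantics
        return f"{major + 1}.0.0"
    if change == "minor":  # new optional fields, non-breaking guidance
        return f"{major}.{minor + 1}.0"
    if change == "patch":  # wording clarifications, typo fixes
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change level: {change!r}")
```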
---
## 18. Minimal Template
```yaml
repo_classification:
  standard: Repo Classification Standard
  version: "1.0"  # quoted so YAML parsers keep it as a string, not a float
  category: project
  domain: infotech
  secondary_domains: []
  capability_tags: []
  business_stake: []
  business_mechanics: []
```
---
## 19. Compact Human Summary
Use this model as follows:
```text
Category tells what kind of work the repo is.
Domain tells who it is primarily for.
Capability tags tell what it does.
Business stake tells who in the business should care.
Business mechanics tell how it contributes to viable business behavior.
```