Added specification files

2026-03-11 21:29:59 +01:00
parent 06a5c8aef1
commit dc1422fcaa
4 changed files with 1456 additions and 0 deletions

208
wiki/OpsBridge.md Normal file

@@ -0,0 +1,208 @@
OpsBridge
*Operations access for humans and agents*
# OpsBridge
**Operations Access Bridges for Humans and Automation Agents**
Modern IT infrastructure is automated, declarative, and continuously deployed.
But when something breaks, real systems rarely behave exactly as expected.
Operators need to **inspect, diagnose, and repair the running system** — not the theoretical one described in infrastructure code.
**OpsBridge** provides a lightweight way to create **controlled operational access paths** between systems so humans and automation agents can investigate and resolve issues in live environments.
It is designed for the moment when **intent meets reality**.
---
# Why OpsBridge Exists
Infrastructure teams increasingly rely on:
* Infrastructure as Code
* GitOps pipelines
* Kubernetes and cloud orchestration
* automated remediation
* AI-assisted diagnostics
These systems define **how infrastructure should behave**.
But operators deal with **how it actually behaves**.
The gap between these two worlds creates practical problems:
* debugging access requires ad-hoc SSH commands
* operators rely on shell history or tribal knowledge
* automation agents struggle to navigate infrastructure
* incident response becomes slow and inconsistent
OpsBridge provides a **simple operational layer** that makes access paths explicit, observable, and reusable.
---
# What OpsBridge Does
OpsBridge manages **Access Bridges for Operations Tasks**.
An access bridge is a temporary and controlled connectivity path between systems used for operations work.
Example:
```
Remote diagnostic host
        │ HTTP request
        ▼
reverse SSH bridge
        │
        ▼
local control service
```
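As a rough illustration of the path above, the following sketch builds an OpenSSH command line for a reverse (remote-forward) tunnel. The `BridgeSpec` type, its field names, and the function name are hypothetical and not part of OpsBridge; only the `ssh` options themselves (`-N`, `-R`, `ExitOnForwardFailure`) are standard OpenSSH.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BridgeSpec:
    """Hypothetical description of one reverse SSH bridge."""
    remote_host: str              # host that needs to reach the local service
    remote_port: int              # port opened on the remote host
    local_port: int               # port of the local control service
    local_host: str = "localhost"


def ssh_reverse_command(spec: BridgeSpec) -> list[str]:
    """Build an OpenSSH argv for a reverse (remote-forward) tunnel."""
    forward = f"{spec.remote_port}:{spec.local_host}:{spec.local_port}"
    return [
        "ssh",
        "-N",                              # no remote command, forwarding only
        "-o", "ExitOnForwardFailure=yes",  # exit if the remote port cannot be bound
        "-R", forward,                     # remote_port -> local_host:local_port
        spec.remote_host,
    ]
```

With this shape, an HTTP request sent to `remote_port` on the remote host would be carried through the tunnel to the local control service.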
OpsBridge lets operators and agents:
* create bridges
* inspect active bridges
* reconnect bridges automatically
* associate bridges with actors
* track operational access events
All without introducing a VPN, overlay network, or heavy access platform.
---
# Built for Human Operators and AI Agents
OpsBridge treats **humans and automation as first-class actors**.
Modern operations increasingly involve:
* diagnostic agents
* automated remediation
* AI-assisted debugging
* ephemeral execution environments
OpsBridge makes it possible to safely give these systems the **temporary access they need to understand and repair infrastructure**.
Every bridge is associated with an actor, making operational activity observable and attributable.
---
# Introducing OpsCatalog
OpsBridge works even better when paired with **OpsCatalog**, a Git-based repository that captures the operational view of infrastructure.
Where DevOps tools describe **how infrastructure should exist**, OpsCatalog captures **how operators actually interact with it**.
OpsCatalog defines:
* operational domains
* infrastructure targets
* operational bridges
* debugging entry points
* operational notes and procedures
Together, OpsBridge and OpsCatalog provide a shared operational map that helps teams navigate real infrastructure.
---
# A New Layer in the Infrastructure Stack
OpsBridge fits between infrastructure automation and real-world operations.
```
Infrastructure as Code
        │ expected state
        ▼
OpsCatalog
        │ operations knowledge
        ▼
OpsBridge
        │ access bridges
        ▼
Live Infrastructure
```
This layer allows operators and automation systems to work with **the infrastructure that actually exists**, not just the one defined in configuration.
---
# Designed for Practical Operations
OpsBridge focuses on simplicity.
It is:
* lightweight
* CLI-driven
* infrastructure-agnostic
* automation-friendly
* identity-integrated
It integrates with existing systems such as identity providers without replacing them.
No new network layer.
No complex access gateway.
Just controlled operational access when you need it.
---
# Example Workflow
Start a bridge:
```
ob up hostA=hostB
```
Check active bridges:
```
ob status
```
Investigate infrastructure targets:
```
ob targets
```
Stop the bridge when finished:
```
ob down hostA=hostB
```
OpsBridge handles the lifecycle so operators can focus on solving the problem.
---
# The Philosophy Behind OpsBridge
Infrastructure teams succeed or fail based on how effectively they bridge the gaps between:
**the declared system**
and
**the experienced system**
and
**the needed system**
DevOps describes how systems should work.
Operations deals with how systems actually behave.
OpsBridge exists to make that gap manageable.
---
# OpsBridge in One Sentence
**OpsBridge is a lightweight operations access layer that helps humans and automation agents investigate, repair, and improve live infrastructure.**
xxx

388
wiki/OpsBridgeFrs.md Normal file

@@ -0,0 +1,388 @@
OpsBridgeFrs
*Functional requirements specification for OpsBridge*
# OpsBridge Functional Requirements Specification
*Operations Access Bridges for Humans and Automation Agents*
Version: **0.1**
Status: **Draft**
Date: **2026-03-11**
---
# 1. Definition
The **OpsBridge Functional Requirements Specification (FRS)** defines the externally observable behaviors and capabilities that the OpsBridge system must provide in order to support controlled operational access bridges between infrastructure components.
OpsBridge enables human operators and automation agents to establish, inspect, and manage temporary infrastructure access paths, typically realized through secure connectivity mechanisms such as reverse SSH tunnels.
This specification describes **system behavior from the perspective of users, external systems, and observable outputs**, without prescribing implementation methods or internal system design.
The FRS provides the functional contract that guides system design, development, verification, and operational validation.
---
# 2. Context
OpsBridge operates within infrastructure environments where controlled access between systems must be established dynamically for operational purposes such as diagnostics, maintenance, and remediation.
These environments may involve interactions between:
* human operators
* automation agents
* remote execution environments
* infrastructure control services
* identity management systems
The FRS translates the product intent defined in the OpsBridge PRD into **precise functional expectations** that describe how the system must behave when interacting with users, external services, and infrastructure components.
Within the system documentation hierarchy:
* **PRD** defines the product intent and scope
* **FRS** defines externally observable system behavior
* **design specifications** describe the internal architecture that realizes those behaviors
---
# 3. Core Concepts
## Bridge
A **Bridge** represents a controlled operational access path between two infrastructure contexts.
The bridge enables connectivity between:
* a remote host environment
* a local service or endpoint
Bridges are created, monitored, and terminated through OpsBridge system functions.
---
## Actor
An **Actor** represents an entity initiating a bridge operation.
Actors may include:
* human operators
* automation agents
* automated maintenance systems
Actor identity is used for operations attribution and auditability.
---
## Target
A **Target** represents an infrastructure component that can be accessed via a bridge.
Targets may include:
* hosts
* services
* containers
* Kubernetes workloads
* operations control interfaces
Targets provide a structured orientation model for infrastructure access.
---
## Bridge State
A **Bridge State** represents the externally observable operational status of a bridge.
Examples include:
* stopped
* starting
* connected
* reconnecting
* degraded
* failed
Bridge state information must be visible to users and external systems.
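One way to make these states checkable is a small state machine. The sketch below uses the states listed above plus the `reconnecting` state named in FR-14. The transition table is an assumption made for illustration: the specification names states but does not define which transitions are legal.

```python
from enum import Enum


class BridgeState(Enum):
    STOPPED = "stopped"
    STARTING = "starting"
    CONNECTED = "connected"
    RECONNECTING = "reconnecting"
    DEGRADED = "degraded"
    FAILED = "failed"


S = BridgeState

# Assumed transition table (not defined by the specification).
ALLOWED = {
    S.STOPPED: {S.STARTING},
    S.STARTING: {S.CONNECTED, S.FAILED, S.STOPPED},
    S.CONNECTED: {S.DEGRADED, S.RECONNECTING, S.STOPPED},
    S.RECONNECTING: {S.CONNECTED, S.FAILED, S.STOPPED},
    S.DEGRADED: {S.CONNECTED, S.RECONNECTING, S.STOPPED},
    S.FAILED: {S.STARTING, S.STOPPED},
}


def transition(current: BridgeState, target: BridgeState) -> BridgeState:
    """Return the new state, or raise if the move is not in the assumed table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Keeping the table as data makes it easy to review and to adjust once the design specification fixes the actual transitions.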
---
## Bridge Lifecycle Event
A **Bridge Lifecycle Event** represents a state transition or operational occurrence related to a bridge.
Examples include:
* bridge creation
* bridge connection established
* bridge disconnection
* health check failure
Lifecycle events must be observable through system outputs such as logs or status queries.
---
# 4. Scope and Non-Scope
## In Scope
This specification defines functional requirements for:
* creation and termination of bridges
* inspection of bridge state and lifecycle
* actor attribution for bridge operations
* health monitoring of bridged services
* visibility of reachable infrastructure targets
* interaction with external identity systems
* generation of operational audit information
The FRS focuses on **externally observable system behavior**.
---
## Out of Scope
The following aspects are intentionally excluded from this specification:
* technical implementation details
* internal system architecture
* specific algorithms or process models
* command-line interface layout or formatting
* performance or scalability characteristics unless functionally expressed
* security mechanisms beyond observable behavior
These aspects are defined in design and architecture specifications.
---
# 5. Functional Requirements
The following sections define the functional behavior required from the OpsBridge system.
Requirement statements are written in a declarative form suitable for verification.
---
## 5.1 Bridge Creation
### FR-1 — Bridge Initiation
The system shall allow an actor to initiate the creation of a bridge using a defined bridge identifier.
### FR-2 — Bridge Configuration Retrieval
Upon initiation of a bridge, the system shall retrieve the configuration associated with the specified bridge identifier.
### FR-3 — Bridge Establishment
The system shall establish an operational access bridge according to the retrieved configuration.
### FR-4 — Bridge State Notification
Upon successful establishment of a bridge, the system shall report the bridge state as **connected**.
---
## 5.2 Bridge Termination
### FR-5 — Bridge Termination Request
The system shall allow an actor to terminate an active bridge.
### FR-6 — Bridge Shutdown
Upon termination request, the system shall stop the active bridge.
### FR-7 — State Update After Termination
After termination, the system shall update the bridge state to **stopped**.
---
## 5.3 Bridge Restart
### FR-8 — Bridge Restart Request
The system shall allow an actor to request the restart of a bridge.
### FR-9 — Restart Execution
Upon receiving a restart request, the system shall terminate the active bridge and initiate a new bridge using the existing configuration.
---
## 5.4 Bridge Status Inspection
### FR-10 — Bridge Status Query
The system shall allow actors to query the operational status of bridges.
### FR-11 — Status Reporting
For each bridge, the system shall report:
* bridge identifier
* current bridge state
* associated actor
* remote host
* uptime or connection duration if available
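The fields required by FR-11 could be carried in a record such as the one sketched below. The class and field names are illustrative assumptions; the FRS only fixes which pieces of information must be reported, not their representation.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class BridgeStatus:
    """Hypothetical status record covering the FR-11 fields."""
    bridge_id: str
    state: str
    actor: str
    remote_host: str
    uptime_seconds: Optional[float] = None  # reported only "if available"


def to_report(status: BridgeStatus) -> dict:
    """Omit unavailable fields instead of reporting a placeholder value."""
    return {key: value for key, value in asdict(status).items() if value is not None}
```

Returning a plain dictionary keeps the sketch independent of any output format, which the FRS leaves out of scope.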
---
## 5.5 Bridge Lifecycle Monitoring
### FR-12 — Disconnection Detection
The system shall detect when an established bridge becomes disconnected.
### FR-13 — Automatic Reconnection
If a bridge disconnects unexpectedly, the system shall attempt to re-establish the bridge according to the bridge configuration.
### FR-14 — State Reporting During Reconnection
During reconnection attempts, the system shall report the bridge state as **reconnecting** or equivalent.
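FR-12 to FR-14 can be read as a retry loop that reports state while it runs. The sketch below assumes exponential backoff and a bounded number of attempts; neither is required by the specification, and the `connect` and `report` callables are hypothetical hooks supplied by the caller.

```python
import time
from typing import Callable


def reconnect(
    connect: Callable[[], bool],
    report: Callable[[str], None],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try to re-establish a bridge, reporting state before each attempt."""
    for attempt in range(max_attempts):
        report("reconnecting")
        if connect():
            report("connected")
            return True
        if attempt < max_attempts - 1:
            # Assumed policy: the delay doubles after every failed attempt.
            sleep(base_delay * 2 ** attempt)
    report("failed")
    return False
```

Passing `sleep` as a parameter lets tests run without waiting and lets callers substitute their own scheduling.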
---
## 5.6 Health Monitoring
### FR-15 — Health Check Execution
The system shall support optional health checks associated with a bridge.
### FR-16 — Health Status Reporting
The system shall report the result of health checks associated with a bridge.
### FR-17 — Degraded State
If a health check indicates failure while the bridge remains connected, the system shall report the bridge state as **degraded**.
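The rule in FR-17 can be sketched as a pure function over the link state and the latest health result. The TCP probe is only one possible kind of health check and is an assumption here; the specification does not say how health checks are performed.

```python
import socket
from typing import Optional


def tcp_health_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Assumed health check: succeed if a TCP connection can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def effective_state(link_state: str, health_ok: Optional[bool]) -> str:
    """Apply FR-17: a connected bridge with a failing check is degraded.

    health_ok is None when no health check is configured (FR-15 makes
    health checks optional), in which case the link state is reported as-is.
    """
    if link_state == "connected" and health_ok is False:
        return "degraded"
    return link_state
```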
---
## 5.7 Actor Attribution
### FR-18 — Actor Identification
The system shall associate each bridge with a defined actor.
### FR-19 — Actor Visibility
The system shall include actor identification information in bridge status reports.
### FR-20 — Actor Attribution in Events
The system shall include actor identity information in operations event records.
---
## 5.8 Infrastructure Target Discovery
### FR-21 — Target Catalog Query
The system shall allow actors to retrieve a list of defined infrastructure targets.
### FR-22 — Target Reachability Inspection
The system shall allow actors to inspect which bridges provide access to a given target.
### FR-23 — Infrastructure Orientation
The system shall provide a representation of infrastructure targets and their reachable access paths.
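FR-22 and FR-23 amount to an index from targets to the bridges that reach them. A minimal sketch follows, assuming each bridge definition lists the targets it exposes; that input shape is hypothetical and not taken from the specification.

```python
from collections import defaultdict


def bridges_by_target(bridge_targets: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert a bridge -> targets mapping into target -> bridges."""
    index: dict[str, list[str]] = defaultdict(list)
    for bridge_id, targets in bridge_targets.items():
        for target in targets:
            index[target].append(bridge_id)
    # Sort for stable, reviewable output.
    return {target: sorted(ids) for target, ids in sorted(index.items())}
```

A query such as "which bridges reach this target" then becomes a single dictionary lookup.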
---
## 5.9 Audit Logging
### FR-24 — Lifecycle Event Logging
The system shall record lifecycle events related to bridges.
### FR-25 — Actor Attribution in Logs
Audit records shall include actor identity information associated with bridge operations.
### FR-26 — Operations Event Visibility
Operations events shall be retrievable by actors for inspection.
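An audit record satisfying FR-24 and FR-25 might look like the JSON line produced below. The field names and the JSON-lines format are assumptions made for illustration; the specification requires only that lifecycle events are recorded with actor identity.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_record(
    event: str,
    bridge_id: str,
    actor: str,
    when: Optional[datetime] = None,
) -> str:
    """Serialize one lifecycle event as a single JSON line."""
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": when.isoformat(),
            "event": event,       # e.g. "bridge connection established"
            "bridge": bridge_id,
            "actor": actor,       # FR-25: actor identity in every record
        },
        sort_keys=True,
    )
```

Appending one such line per event to a log file would keep the records easy to filter by bridge or actor (FR-26).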
---
## 5.10 Identity Integration
### FR-27 — Identity Provider Interaction
The system shall support interaction with external identity systems to obtain credentials required for bridge establishment.
### FR-28 — Credential Use
The system shall use credentials obtained from external identity systems when establishing bridges.
### FR-29 — Identity Attribution
The system shall associate the identity of actors provided by external identity systems with bridge lifecycle events.
External identity systems may include:
* privacyIDEA
---
# 6. Functional Constraints
The following constraints influence system behavior.
### FC-1 — Configuration Dependency
Bridge operations depend on the existence of valid bridge configuration entries.
### FC-2 — External Connectivity
Bridge establishment requires network connectivity to the remote host defined in the configuration.
### FC-3 — Credential Availability
Bridge establishment requires valid credentials available through the configured identity integration mechanism.
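The three constraints can be checked together before a bridge is started. The following sketch assumes the caller has already determined reachability and credential availability; how those checks are performed is outside the scope of this document, and the function name is hypothetical.

```python
def preflight(
    bridge_id: str,
    configs: dict,
    host_reachable: bool,
    credentials_ok: bool,
) -> list[str]:
    """Return the identifiers of all constraints that are currently violated."""
    violations = []
    if bridge_id not in configs:
        violations.append("FC-1")  # no configuration entry for this bridge
    if not host_reachable:
        violations.append("FC-2")  # remote host cannot be reached
    if not credentials_ok:
        violations.append("FC-3")  # no valid credentials available
    return violations
```

An empty list means bridge establishment may proceed.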
---
# 7. Traceability
Each functional requirement defined in this document traces back to the product intent defined in the OpsBridge PRD.
Primary traceability relationships include:
| PRD Concept | FRS Requirement Group |
| -------------------------- | --------------------- |
| Operations Access Bridges | FR-1 to FR-14 |
| Actor Attribution | FR-18 to FR-20 |
| Infrastructure Orientation | FR-21 to FR-23 |
| Operations Observability | FR-10 to FR-17 |
| Operational Audit Logging  | FR-24 to FR-26        |
| Identity Integration       | FR-27 to FR-29        |
This traceability enables downstream artifacts such as:
* design specifications
* system tests
* acceptance criteria
* validation procedures
to map back to the originating product requirements.
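A traceability table of this kind can be checked mechanically for requirements that no group covers. The sketch below is a generic helper with hypothetical names; the caller supplies the requirement ranges, so it makes no claim about the mapping itself.

```python
def untraced(total: int, groups: dict[str, list[tuple[int, int]]]) -> list[str]:
    """List requirement IDs FR-1..FR-total that no group range covers.

    Each group maps a concept name to inclusive (first, last) ranges.
    Overlapping ranges are allowed, since one requirement may trace to
    more than one concept.
    """
    covered: set[int] = set()
    for ranges in groups.values():
        for first, last in ranges:
            covered.update(range(first, last + 1))
    return [f"FR-{n}" for n in range(1, total + 1) if n not in covered]
```

Running such a check in review would flag any requirement added to the FRS without a corresponding traceability entry.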
---
# 8. Related Concepts
The OpsBridge Functional Requirements Specification relates to several adjacent artifacts.
* **Product Requirements Document (PRD)**: defines product intent and scope.
* **Non-Functional Requirements (NFR)**: define performance, reliability, and security expectations.
* **System Design Specification (SDS)**: describes the architecture used to implement the defined functions.
* **Use Case Specifications**: provide scenario-level interaction descriptions for system behavior.
Together these artifacts form a layered documentation structure supporting the full system lifecycle.
xxx

322
wiki/OpsBridgePrd.md Normal file

@@ -0,0 +1,322 @@
OpsBridgePrd
*Product requirements document for OpsBridge*
# OpsBridge Product Requirements Document
*Operations Access Bridges for Humans and Automation Agents*
Version: **0.1**
Status: **Draft**
Date: **2026-03-11**
---
# 1. Definition
**OpsBridge** is a lightweight IT-operations infrastructure tool that establishes **controlled access bridges between systems** in order to support human operators and automation agents performing diagnostics, maintenance, and remediation on live infrastructure.
An access bridge typically manifests as a **temporary reverse SSH connectivity path** that allows a remote system to reach a local service or control plane component.
OpsBridge provides a **structured and observable orchestration layer for such bridges**, enabling operators and automated agents to create, inspect, and terminate operational access paths while maintaining clear auditability and integration with external identity systems.
The product addresses the gap between:
* ad-hoc SSH usage
* developer tunneling utilities
* heavy enterprise infrastructure access platforms
by providing a **minimal operations coordination layer** specifically designed for infrastructure maintenance workflows.
---
# 2. Context
Modern infrastructure environments increasingly combine **human operations with automated maintenance systems**, including AI-assisted diagnostics and remediation agents.
These environments require **temporary and well-scoped access paths** between systems for activities such as:
* troubleshooting live services
* inspecting runtime environments
* retrieving diagnostic data
* applying remediation commands
Existing approaches typically rely on:
* manual SSH commands
* ad-hoc scripts
* VPN access
* full network overlays
* enterprise access gateways
Each of these approaches introduces trade-offs such as excessive operations scope, poor observability, or high infrastructure overhead.
OpsBridge operates as a **boundary artifact between operational intent and infrastructure connectivity**, providing a structured layer that mediates controlled access paths while remaining compatible with existing identity systems, infrastructure platforms, and operational tooling.
Architecturally, OpsBridge sits between:
* operations automation environments
* identity and credential management systems
* infrastructure access mechanisms such as SSH
The tool is intended for environments where **controlled and observable infrastructure access is required without introducing large additional platforms**.
---
# 3. Core Concepts
## Operations Access Bridge
An **Operations Access Bridge** is a temporary and controlled connectivity path that allows one infrastructure component to access a service or control endpoint hosted by another component.
The bridge concept focuses on **operational intent rather than networking technology**.
In most cases the bridge is realized through a reverse SSH tunnel.
---
## Actors
An **Actor** represents an entity initiating operational access.
Actors may include:
* human operators
* automation agents
* AI-driven remediation systems
* scheduled maintenance processes
Actors exist primarily for **auditability and identity integration**.
---
## Targets
A **Target** represents an infrastructure component that can be reached through a bridge.
Targets may include:
* physical hosts
* virtual machines
* containers
* Kubernetes pods
* service endpoints
* operations control planes
Targets serve as an **orientation mechanism** that helps operators and automation systems understand available infrastructure access paths.
---
## Bridge Lifecycle
A bridge passes through lifecycle phases including:
* creation
* connection establishment
* operational availability
* disconnection
* termination
Lifecycle management is central to maintaining **reliable and observable operations access paths**.
---
## Identity Integration
OpsBridge integrates with external identity systems that govern authentication, authorization, and credential issuance.
Identity integration ensures that operations access events can be attributed to specific actors without requiring OpsBridge to act as an identity management system.
---
# 4. Scope and Non-Scope
## In Scope
OpsBridge provides the following capabilities:
* Creation and management of operations access bridges
* Visibility into active bridges and their operational status
* Identification of actors initiating access bridges
* Basic infrastructure orientation through reachable targets
* Structured operational audit logging
* Integration with external identity systems for authentication and credential management
* Support for progressive operational maturity from ad-hoc usage to centrally governed environments
OpsBridge aims to enable **reliable, observable, and automation-friendly infrastructure access orchestration**.
---
## Out of Scope
OpsBridge intentionally avoids responsibilities belonging to adjacent system categories.
OpsBridge does not:
* implement identity management or user provisioning
* provide VPN or overlay network functionality
* replace enterprise infrastructure access gateways
* act as a bastion host platform
* manage infrastructure configuration or orchestration
* implement policy engines or access governance systems
These capabilities remain the responsibility of external systems such as identity providers, infrastructure platforms, and security tooling.
---
# 5. Practical Implications
Adopting OpsBridge introduces a structured operations layer that replaces ad-hoc SSH workflows with a **consistent and observable access mechanism**.
This has several implications.
### Improved operational clarity
Operators gain a clear overview of active infrastructure access paths and the actors responsible for initiating them.
### Support for automation-driven operations
Automation systems and AI diagnostic agents can interact with infrastructure using reproducible access bridges rather than custom scripts.
### Incremental security adoption
OpsBridge supports environments ranging from minimal ad-hoc infrastructure setups to centrally governed production systems.
Organizations can adopt the tool without requiring immediate deployment of complex identity infrastructure.
### Improved auditability
Operations access events become traceable and attributable, improving incident analysis and compliance capabilities.
However, the introduction of an additional operations tool also requires:
* operational discipline in maintaining configuration
* integration with existing infrastructure management practices
* awareness of bridge lifecycle management in automated workflows
---
# 6. External Dependencies and Assumptions
OpsBridge assumes the existence of several external components.
### Secure infrastructure access mechanism
OpsBridge relies on a secure underlying access mechanism such as **SSH** to establish operations bridges.
### Identity providers
Identity and credential management may be provided by external systems such as:
* privacyIDEA
* OpenSSH certificate authorities
* enterprise identity platforms
OpsBridge interacts with these systems but does not replicate their functionality.
### Operations environments
OpsBridge assumes execution within infrastructure environments that support command-line tools and secure remote connectivity.
Typical environments include:
* Linux systems
* macOS workstations
* development environments using WSL2
---
# 7. Success Criteria
The success of OpsBridge can be evaluated using several outcome-oriented criteria.
## Operations effectiveness
Operators and automation agents can establish operational access bridges quickly and reliably without requiring manual SSH command construction.
## Observability
Active access bridges and their actors are visible through consistent operations inspection commands and audit logs.
## Integration capability
OpsBridge integrates smoothly with identity systems, infrastructure platforms, and operations automation environments.
## Adoption flexibility
The tool can be used effectively in both:
* small infrastructure setups with minimal governance
* larger environments with centralized identity management and auditing requirements
## Reduced operational friction
Teams using OpsBridge experience reduced complexity compared to ad-hoc SSH tunneling or deploying large access platforms for operational tasks.
---
# 8. Related Concepts
OpsBridge relates to several adjacent concepts and tool categories.
### SSH Tunnel Management
Tools such as *autossh* maintain persistent SSH tunnels but lack operational inventory and identity integration.
### Developer Tunneling Tools
Tools such as *ngrok* focus on exposing local services for development workflows rather than infrastructure maintenance.
### Infrastructure Access Platforms
Enterprise tools such as *Teleport* provide identity-centric infrastructure access but operate at a significantly larger architectural scope.
### Overlay Networks
Systems such as *Tailscale* create persistent private networks rather than temporary operational bridges.
OpsBridge occupies a distinct position focused on **temporary operations access paths for infrastructure maintenance**.
---
# 9. Product Variants and Evolution
OpsBridge supports progressive adoption through increasing operations maturity levels.
### Level 0 — Ad-hoc infrastructure environments
Minimal configuration with unmanaged SSH keys.
### Level 1 — Structured operations usage
Actors and bridges are clearly identified and logged.
### Level 2 — Identity-integrated environments
Authentication and credential management are handled by external identity providers.
### Level 3 — Governed production environments
Short-lived credentials, centralized auditing, and policy oversight are integrated through external systems.
This progression allows organizations to adopt OpsBridge without requiring immediate infrastructure changes.
---
# 10. Relationship to Downstream Artifacts
The OpsBridge PRD acts as the **product intent anchor** for subsequent documentation.
Derived artifacts may include:
* Functional Requirements Specification (FRS)
* Technical Architecture Specification
* Security Integration Specifications
* Implementation design documents
* Architecture Decision Records
These artifacts translate the product intent defined here into concrete system behavior and implementation strategies.
xxx


@@ -0,0 +1,538 @@
OpsCatalogSpecification
*IT Operations Knowledge Repository*
This document is a **structured OpsCatalog specification**, designed as an **extension to OpsBridge**.
It includes:
1. **Why / How / What introduction**
2. **PRD for OpsCatalog**
3. **FRS for OpsCatalog**
4. **Schemas**
5. **Repository structure**
6. **Appendices with operational notes**
---
# OpsCatalog Specification
*Operations Knowledge Repository for Infrastructure Operations*
Version: **0.1**
Status: **Draft**
Date: **2026-03-11**
---
# Introduction
## Why
Modern infrastructure teams operate with two complementary models of reality.
**DevOps Model — Declared Infrastructure**
Infrastructure-as-code systems describe the desired state of systems:
* Terraform
* Kubernetes manifests
* Helm charts
* GitOps pipelines
These systems encode **how infrastructure should behave**.
However, real systems rarely match the declared state perfectly.
Operations teams must deal with:
* incidents
* degraded services
* bottlenecks
* debugging environments
* manual recovery actions
* temporary workarounds
* unexpected interactions
This produces a second model.
**Operations Model — Experienced Infrastructure**
This model captures:
* how operators actually access systems
* which debugging paths exist
* where bottlenecks occur
* which entry points are used for remediation
* which bridges exist between infrastructure components
Most organizations lack a formal system for capturing this operational knowledge.
OpsCatalog exists to address this gap.
---
## How
OpsCatalog introduces a **structured repository for operations infrastructure knowledge**.
The repository is typically maintained in **Git** and contains structured definitions of:
* operations domains
* infrastructure targets
* operations access bridges
* actor classes
* operations annotations
OpsBridge consumes this catalog to:
* resolve bridges
* orient operators
* guide automation agents
* provide operations context
Git provides several properties that make it suitable for this purpose:
* version history
* collaborative editing
* review workflows
* diffability for humans and agents
* narrative context through commit messages
OpsCatalog stores **experienced operations knowledge**, not runtime state.
---
## What
OpsCatalog defines a **shared operations map of infrastructure**.
It captures:
*Operations Domains*
Logical spaces representing operations infrastructure areas.
Examples:
* production clusters
* staging environments
* development infrastructure
* incident analysis sandboxes
*Targets*
Infrastructure components relevant to operations.
Examples:
* hosts
* services
* containers
* Kubernetes resources
* debugging entry points
*Bridges*
Operations access paths between systems.
Examples:
* SSH reverse bridges
* debugging entry tunnels
* maintenance access paths
*Operations Notes*
Structured annotations describing:
* debugging procedures
* common incidents
* bottlenecks
* known workarounds
* operations entry points
Together these elements provide a **living operations topology**.
---
# Part 1 — Product Requirements Document (PRD)
## 1. Definition
OpsCatalog is a structured repository that defines **operations knowledge about infrastructure environments**, including domains, targets, bridges, and operations annotations.
It provides a shared operations map used by human operators and automation agents to understand how infrastructure is accessed and maintained in practice.
OpsCatalog complements infrastructure-as-code systems by capturing the **experienced operations topology** rather than the declared infrastructure state.
---
## 2. Context
OpsCatalog operates within environments that already use:
* infrastructure-as-code tools
* automated deployment systems
* identity management systems
* operations monitoring platforms
These systems define and monitor infrastructure but often fail to capture how operators interact with systems during incidents or maintenance.
OpsCatalog fills this gap by recording that operations knowledge in a **structured, shared layer**.
OpsBridge integrates with OpsCatalog to translate catalog definitions into actionable access bridges.
---
## 3. Core Concepts
### Operations Domain
A logical operational boundary representing a group of related infrastructure systems.
Domains help operators navigate complex environments.
---
### Target
An operationally relevant infrastructure component that may be inspected or accessed.
Targets represent entry points for diagnostics and maintenance.
---
### Bridge
A defined operations access path enabling connectivity between infrastructure contexts.
Bridges describe **how targets are accessed**.
---
### Actor Class
A category of operators or automation systems that may interact with infrastructure.
Examples:
* human operators
* remediation agents
* incident responders
---
### Operations Annotation
Structured knowledge describing operations behaviors, known issues, or debugging strategies.
---
## 4. Scope and Non-Scope
### In Scope
OpsCatalog defines:
* operations domains
* infrastructure targets
* operations bridges
* actor classifications
* operations annotations
* repository structure for catalog storage
---
### Out of Scope
OpsCatalog does not:
* manage infrastructure resources
* maintain runtime infrastructure state
* replace monitoring systems
* replace configuration management systems
* enforce security policies
* store credentials or secrets
These responsibilities remain with external systems.
---
## 5. Practical Implications
OpsCatalog provides several operations advantages.
### Shared operations knowledge
Teams maintain a common understanding of infrastructure access paths.
### Improved incident response
Operators can quickly locate operations entry points.
### Automation enablement
AI agents and automation systems gain structured knowledge about infrastructure navigation.
### Organizational resilience
Operations knowledge becomes versioned and reviewable rather than implicit.
However, maintaining the catalog requires:
* operations discipline
* periodic review
* integration with infrastructure evolution
---
## 6. External Dependencies
OpsCatalog assumes integration with several external systems.
Examples include:
* infrastructure-as-code platforms
* operations access tools such as OpsBridge
* identity systems such as privacyIDEA
* version control systems such as Git
---
## 7. Success Criteria
OpsCatalog is successful if it enables operators and automation agents to:
* locate relevant infrastructure targets quickly
* identify operations access paths
* understand operations context during incidents
* maintain shared operations knowledge across teams
---
# Part 2 — Functional Requirements Specification (FRS)
## 1. Domain Management
### FR-1 Domain Definition
The system shall allow definition of operations domains.
### FR-2 Domain Listing
The system shall allow retrieval of all defined domains.
### FR-3 Domain Inspection
The system shall allow inspection of a specific domain and its associated elements.
---
## 2. Target Management
### FR-4 Target Definition
The system shall allow definition of infrastructure targets within domains.
### FR-5 Target Query
The system shall allow retrieval of targets belonging to a domain.
### FR-6 Target Inspection
The system shall allow inspection of metadata associated with a target.
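The domain and target requirements above can be sketched as a small in-memory catalog. This is a minimal illustration, not the specified implementation: the `Catalog` class and its method names are hypothetical, and entries are assumed to be plain dictionaries already parsed from YAML files.

```python
from dataclasses import dataclass, field


@dataclass
class Catalog:
    """In-memory catalog keyed by entry id (hypothetical helper, not part of the spec)."""

    domains: dict = field(default_factory=dict)
    targets: dict = field(default_factory=dict)

    def define_domain(self, entry: dict) -> None:
        # FR-1: register a domain under its id.
        self.domains[entry["id"]] = entry

    def define_target(self, entry: dict) -> None:
        # FR-4: a target must belong to an already defined domain.
        if entry["domain"] not in self.domains:
            raise KeyError(f"unknown domain: {entry['domain']}")
        self.targets[entry["id"]] = entry

    def list_domains(self) -> list:
        # FR-2: all defined domain ids, sorted for stable output.
        return sorted(self.domains)

    def targets_in(self, domain_id: str) -> list:
        # FR-5: ids of the targets that belong to one domain.
        return sorted(t["id"] for t in self.targets.values() if t["domain"] == domain_id)

    def inspect_domain(self, domain_id: str) -> dict:
        # FR-3: the domain entry together with its associated target ids.
        return {**self.domains[domain_id], "targets": self.targets_in(domain_id)}


catalog = Catalog()
catalog.define_domain({"type": "domain", "id": "coulombcore", "environment": "production"})
catalog.define_target({"type": "target", "id": "state-hub", "domain": "coulombcore", "kind": "service"})
print(catalog.inspect_domain("coulombcore"))
```

Rejecting targets whose domain is unknown is a design choice of this sketch; the requirements themselves do not say whether definition order is enforced.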
---
## 3. Bridge Definition
### FR-7 Bridge Definition
The system shall allow definition of operations bridges connecting infrastructure contexts.
### FR-8 Bridge Query
The system shall allow retrieval of bridges associated with a target or domain.
### FR-9 Bridge Inspection
The system shall allow inspection of bridge metadata.
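As an illustration of FR-8, a bridge query can be a simple filter over bridge entries. The function name `bridges_for` is hypothetical, and the second bridge in the sample data is invented for the example.

```python
def bridges_for(bridges, *, target=None, domain=None):
    """Return bridges matching the given target and/or domain (FR-8 sketch)."""
    return [
        b for b in bridges
        if (target is None or b.get("target") == target)
        and (domain is None or b.get("domain") == domain)
    ]


bridges = [
    {"id": "state-hub-coulombcore", "domain": "coulombcore",
     "target": "state-hub", "access_method": "ssh-reverse"},
    # Hypothetical second bridge, added only to make the filter visible.
    {"id": "api-server-coulombcore", "domain": "coulombcore",
     "target": "api-server", "access_method": "ssh-reverse"},
]

by_target = bridges_for(bridges, target="state-hub")
by_domain = bridges_for(bridges, domain="coulombcore")
print([b["id"] for b in by_target])
print([b["id"] for b in by_domain])
```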
---
## 4. Actor Classification
### FR-10 Actor Class Definition
The system shall allow definition of actor classes.
### FR-11 Actor Attribution
The system shall allow bridges to reference actor classes.
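One way to read FR-11 is that a bridge entry lists the actor classes allowed to use it. The sketch below assumes a hypothetical `actor_classes` field on bridge entries (the example bridge schema does not define one) and checks that every referenced class is defined.

```python
def undefined_actor_classes(bridge: dict, defined: set) -> list:
    """Return actor classes a bridge references that the catalog does not define."""
    return sorted(set(bridge.get("actor_classes", [])) - defined)


# Hypothetical class ids; "automation" matches the class used in the actor example.
defined_classes = {"human", "automation"}

bridge = {
    "id": "state-hub-coulombcore",
    "actor_classes": ["automation", "contractor"],  # assumed field, not in the spec
}

unknown = undefined_actor_classes(bridge, defined_classes)
print(unknown)
```

A bridge without the field yields an empty list, so attribution stays optional in this sketch.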
---
## 5. Operations Annotations
### FR-12 Operations Notes
The system shall allow structured annotations to be associated with domains, targets, and bridges.
### FR-13 Annotation Retrieval
The system shall allow retrieval of annotations associated with infrastructure elements.
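Annotation retrieval can be sketched as a lookup keyed by the annotated element. The annotation record shape used here (`applies_to`, `kind`, `note`) is an assumption made for illustration, and the note text is placeholder content.

```python
def annotations_for(annotations, element_type, element_id):
    """Return annotations attached to one domain, target, or bridge."""
    wanted = {"type": element_type, "id": element_id}
    return [a for a in annotations if a["applies_to"] == wanted]


# Hypothetical annotation records; the field names are not defined by the spec.
annotations = [
    {"applies_to": {"type": "target", "id": "state-hub"},
     "kind": "known-issue", "note": "placeholder note about a known issue"},
    {"applies_to": {"type": "bridge", "id": "state-hub-coulombcore"},
     "kind": "purpose", "note": "placeholder note about the bridge purpose"},
]

target_notes = annotations_for(annotations, "target", "state-hub")
print([a["kind"] for a in target_notes])
```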
---
## 6. Repository Interaction
### FR-14 Catalog Retrieval
The system shall load catalog data from a repository structure.
### FR-15 Catalog Validation
The system shall validate the structure of catalog definitions.
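A minimal sketch of structural validation for FR-15 follows. The required field sets are an assumption derived from the example entries in the Schemas section; the specification itself does not state which fields are mandatory.

```python
# Assumed required fields per entry type, taken from the example entries.
REQUIRED_FIELDS = {
    "domain": {"id", "name", "description", "environment"},
    "target": {"id", "domain", "kind", "description", "reachable_via"},
    "bridge": {"id", "domain", "target", "description", "access_method"},
    "actor": {"id", "class", "description"},
}


def validate_entry(entry: dict) -> list:
    """Return a list of structural problems; an empty list means the entry passes."""
    entry_type = entry.get("type")
    if entry_type not in REQUIRED_FIELDS:
        return [f"unknown type: {entry_type!r}"]
    missing = REQUIRED_FIELDS[entry_type] - set(entry)
    return [f"missing field: {name}" for name in sorted(missing)]


domain = {
    "type": "domain",
    "id": "coulombcore",
    "name": "CoulombCore Infrastructure",
    "description": "Core infrastructure domain for operational services",
    "environment": "production",
}
print(validate_entry(domain))
print(validate_entry({"type": "bridge", "id": "state-hub-coulombcore"}))
```

Returning a list of problems rather than raising on the first one lets a review tool report every issue in a pull request at once.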
---
# Schemas
Example schemas are expressed in YAML.
---
## Domain Schema
```yaml
type: domain
id: coulombcore
name: CoulombCore Infrastructure
description: Core infrastructure domain for operational services
environment: production
```
---
## Target Schema
```yaml
type: target
id: state-hub
domain: coulombcore
kind: service
description: Infrastructure state coordination service
reachable_via:
- state-hub-coulombcore
```
---
## Bridge Schema
```yaml
type: bridge
id: state-hub-coulombcore
domain: coulombcore
target: state-hub
description: Operations bridge for state hub diagnostics
access_method: ssh-reverse
```
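The target and bridge examples reference each other: the target lists the bridge under `reachable_via`, and the bridge names the target. A consistency check for those cross-references might look like the following sketch, where `dangling_references` is a hypothetical helper operating on already parsed entries.

```python
def dangling_references(targets, bridges):
    """Report target-to-bridge and bridge-to-target references that do not resolve."""
    bridge_ids = {b["id"] for b in bridges}
    target_ids = {t["id"] for t in targets}
    problems = []
    for t in targets:
        for bridge_id in t.get("reachable_via", []):
            if bridge_id not in bridge_ids:
                problems.append(f"target {t['id']} references unknown bridge {bridge_id}")
    for b in bridges:
        if b["target"] not in target_ids:
            problems.append(f"bridge {b['id']} references unknown target {b['target']}")
    return problems


targets = [{"id": "state-hub", "reachable_via": ["state-hub-coulombcore"]}]
bridges = [{"id": "state-hub-coulombcore", "target": "state-hub"}]
print(dangling_references(targets, bridges))

# Hypothetical broken entry: a target pointing at a bridge that was never defined.
broken = targets + [{"id": "api-server", "reachable_via": ["api-server-coulombcore"]}]
print(dangling_references(broken, bridges))
```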
---
## Actor Schema
```yaml
type: actor
id: agent.claude-remediator
class: automation
description: Automated remediation agent
```
---
# Repository Structure
Recommended repository layout:
```
opscatalog/
  domains/
    coulombcore/
      domain.yaml
      targets/
        state-hub.yaml
        api-server.yaml
      bridges/
        state-hub-coulombcore.yaml
      docs/
        overview.md
        operations.md
  actors/
    human-operators.yaml
    automation-agents.yaml
  schemas/
    domain.schema.yaml
    target.schema.yaml
    bridge.schema.yaml
```
This layout supports both human readability and machine parsing.
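To illustrate the machine-parsing side, the sketch below classifies catalog files purely by their position in the repository. It assumes that `targets/`, `bridges/`, and `docs/` sit inside each domain directory and that `actors/` and `schemas/` sit at the top level; the `classify` and `scan` helpers are hypothetical.

```python
import tempfile
from pathlib import Path

DOMAIN_SUBDIRS = {"targets": "target", "bridges": "bridge", "docs": "doc"}
TOP_LEVEL = {"actors": "actor", "schemas": "schema"}


def classify(rel: Path):
    """Map a path relative to the catalog root to an entry kind, or None if unrecognized."""
    parts = rel.parts
    if len(parts) == 3 and parts[0] == "domains" and parts[2] == "domain.yaml":
        return "domain"
    if len(parts) == 4 and parts[0] == "domains" and parts[2] in DOMAIN_SUBDIRS:
        return DOMAIN_SUBDIRS[parts[2]]
    if len(parts) == 2 and parts[0] in TOP_LEVEL:
        return TOP_LEVEL[parts[0]]
    return None


def scan(root: Path) -> dict:
    """Return {relative path: kind} for every file below the catalog root."""
    return {
        p.relative_to(root).as_posix(): classify(p.relative_to(root))
        for p in sorted(root.rglob("*")) if p.is_file()
    }


# Build a throwaway copy of part of the layout and scan it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "opscatalog"
    for rel in [
        "domains/coulombcore/domain.yaml",
        "domains/coulombcore/targets/state-hub.yaml",
        "domains/coulombcore/bridges/state-hub-coulombcore.yaml",
        "domains/coulombcore/docs/overview.md",
        "actors/human-operators.yaml",
        "schemas/domain.schema.yaml",
        "notes.txt",  # stray file, not part of the layout
    ]:
        f = root / rel
        f.parent.mkdir(parents=True, exist_ok=True)
        f.touch()
    result = scan(root)

print(result)
```

Files that do not match the layout map to `None`, which a validation step could report instead of silently ignoring.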
---
# Appendices
## Appendix A — Operations Notes
Operations notes provide context about real-world infrastructure behavior.
Examples include:
* known debugging entry points
* typical failure modes
* operational shortcuts
* historical incidents
* recommended inspection procedures
Operations notes may be written in structured markdown files stored alongside catalog entries.
---
## Appendix B — Catalog Maintenance Guidelines
Maintaining an effective OpsCatalog requires operational discipline.
Recommended practices include:
* review changes through pull requests
* annotate bridges with operational purpose
* update catalog entries after major infrastructure changes
* document common debugging procedures
* avoid storing secrets in catalog files
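The last guideline lends itself to an automated check during review. The following sketch flags keys whose names look secret-related; the deny-list is a hypothetical starting point, and a name-based check cannot detect secrets stored under innocuous keys.

```python
# Hypothetical deny-list of substrings that suggest a secret-bearing key.
SUSPICIOUS_WORDS = ("password", "secret", "token", "private_key")


def secret_like_keys(node, path=""):
    """Recursively collect dotted paths of keys whose names look secret-related."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else str(key)
            if any(word in str(key).lower() for word in SUSPICIOUS_WORDS):
                found.append(here)
            found.extend(secret_like_keys(value, here))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(secret_like_keys(item, f"{path}[{index}]"))
    return found


# Hypothetical bridge entry that violates the guideline.
entry = {
    "id": "state-hub-coulombcore",
    "auth": {"ssh_private_key": "placeholder"},
    "hops": [{"api_token": "placeholder"}],
}
flagged = secret_like_keys(entry)
print(flagged)
```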
---
## Appendix C — Relationship to OpsBridge
OpsCatalog serves as a **knowledge source for OpsBridge**.
OpsBridge may consume catalog data to:
* resolve bridge identifiers
* display infrastructure orientation
* assist operators in establishing bridges
* provide contextual operational information
The catalog does not control runtime behavior but provides **structured operations intent**.
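As a rough sketch of the first two uses, a consumer could resolve a bridge identifier into orientation data with a read-only lookup. The `resolve_bridge` function and the shape of its result are assumptions; how OpsBridge actually consumes the catalog is not specified here.

```python
def resolve_bridge(bridge_id: str, bridges: dict, targets: dict) -> dict:
    """Look up a bridge and its target; read-only, no runtime action is taken."""
    bridge = bridges[bridge_id]
    target = targets[bridge["target"]]
    return {
        "bridge": bridge["id"],
        "access_method": bridge["access_method"],
        "domain": bridge["domain"],
        "target": target["id"],
        "target_kind": target["kind"],
    }


# Entries keyed by id, mirroring the example target and bridge.
targets = {"state-hub": {"id": "state-hub", "domain": "coulombcore", "kind": "service"}}
bridges = {
    "state-hub-coulombcore": {
        "id": "state-hub-coulombcore",
        "domain": "coulombcore",
        "target": "state-hub",
        "access_method": "ssh-reverse",
    }
}

orientation = resolve_bridge("state-hub-coulombcore", bridges, targets)
print(orientation)
```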