ops-bridge/wiki/OpsCatalogSpecification.md
2026-03-11 21:29:59 +01:00

OpsCatalogSpecification

IT Operations Knowledge Repository

Below is a structured OpsCatalog specification designed as an extension to OpsBridge.

It includes:

  1. Why / How / What introduction
  2. PRD for OpsCatalog
  3. FRS for OpsCatalog
  4. Schemas
  5. Repository structure
  6. Appendices with operational notes

OpsCatalog Specification

Operations Knowledge Repository for Infrastructure Operations

Version: 0.1
Status: Draft
Date: 2026-03-11


Introduction

Why

Modern infrastructure teams operate with two complementary models of reality.

DevOps Model — Declared Infrastructure

Infrastructure-as-code systems describe the desired state of systems:

  • Terraform
  • Kubernetes manifests
  • Helm charts
  • GitOps pipelines

These systems encode how infrastructure should behave.

However, real systems rarely match the declared state perfectly.

Operations teams must deal with:

  • incidents
  • degraded services
  • bottlenecks
  • debugging environments
  • manual recovery actions
  • temporary workarounds
  • unexpected interactions

This produces a second model.

Operations Model — Experienced Infrastructure

This model captures:

  • how operators actually access systems
  • which debugging paths exist
  • where bottlenecks occur
  • which entry points are used for remediation
  • which bridges exist between infrastructure components

Most organizations lack a formal system for capturing this operational knowledge.

OpsCatalog exists to address this gap.


How

OpsCatalog introduces a structured repository for operations knowledge about infrastructure.

The repository is typically maintained in Git and contains structured definitions of:

  • operations domains
  • infrastructure targets
  • operations access bridges
  • actor classes
  • operations annotations

OpsBridge consumes this catalog to:

  • resolve bridges
  • orient operators
  • guide automation agents
  • provide operations context

Git provides several properties that make it suitable for this purpose:

  • version history
  • collaborative editing
  • review workflows
  • diffability for humans and agents
  • narrative context through commit messages

OpsCatalog stores experienced operations knowledge, not runtime state.


What

OpsCatalog defines a shared operations map of infrastructure.

It captures:

Operations Domains

Logical spaces that group related areas of infrastructure for operations purposes.

Examples:

  • production clusters
  • staging environments
  • development infrastructure
  • incident analysis sandboxes

Targets

Infrastructure components relevant to operations.

Examples:

  • hosts
  • services
  • containers
  • Kubernetes resources
  • debugging entry points

Bridges

Operations access paths between systems.

Examples:

  • SSH reverse bridges
  • debugging entry tunnels
  • maintenance access paths

Operations Notes

Structured annotations describing:

  • debugging procedures
  • common incidents
  • bottlenecks
  • known workarounds
  • operations entry points

Together these elements provide a living operations topology.


Part 1 — Product Requirements Document (PRD)

1. Definition

OpsCatalog is a structured repository that defines operations knowledge about infrastructure environments, including domains, targets, bridges, and operations annotations.

It provides a shared operations map used by human operators and automation agents to understand how infrastructure is accessed and maintained in practice.

OpsCatalog complements infrastructure-as-code systems by capturing the experienced operations topology rather than the declared infrastructure state.


2. Context

OpsCatalog operates within environments that already use:

  • infrastructure-as-code tools
  • automated deployment systems
  • identity management systems
  • operations monitoring platforms

These systems define and monitor infrastructure but often fail to capture how operators interact with systems during incidents or maintenance.

OpsCatalog fills this gap by providing a structured layer of operations knowledge.

OpsBridge integrates with OpsCatalog to translate catalog definitions into actionable access bridges.


3. Core Concepts

Operations Domain

A logical operational boundary representing a group of related infrastructure systems.

Domains help operators navigate complex environments.


Target

An operationally relevant infrastructure component that may be inspected or accessed.

Targets represent entry points for diagnostics and maintenance.


Bridge

A defined operations access path enabling connectivity between infrastructure contexts.

Bridges describe how targets are accessed.


Actor Class

A category of operators or automation systems that may interact with infrastructure.

Examples:

  • human operators
  • remediation agents
  • incident responders

Operations Annotation

Structured knowledge describing operations behaviors, known issues, or debugging strategies.
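
The five concepts above can be sketched as plain data types. This is a minimal illustration, assuming Python 3.9+ and field names borrowed from the YAML examples later in this specification; the `Annotation` fields (`subject`, `text`) and the rename of `class` to `actor_class` are assumptions made for the sketch, not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    id: str
    name: str
    description: str = ""
    environment: str = ""


@dataclass(frozen=True)
class Target:
    id: str
    domain: str                          # id of the owning Domain
    kind: str                            # e.g. "service" or "host"
    description: str = ""
    reachable_via: tuple[str, ...] = ()  # ids of Bridges that lead to this target


@dataclass(frozen=True)
class Bridge:
    id: str
    domain: str                          # id of the Domain the bridge belongs to
    target: str                          # id of the Target the bridge reaches
    access_method: str                   # e.g. "ssh-reverse"
    description: str = ""


@dataclass(frozen=True)
class Actor:
    id: str
    actor_class: str                     # "class" is a Python keyword, hence the rename
    description: str = ""


@dataclass(frozen=True)
class Annotation:
    subject: str                         # assumed: id of the annotated domain, target, or bridge
    text: str                            # assumed: free-form operations note
```

Relationships are held as ids rather than object references, which mirrors how entries would point at each other across separate catalog files.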


4. Scope and Non-Scope

In Scope

OpsCatalog defines:

  • operations domains
  • infrastructure targets
  • operations bridges
  • actor classifications
  • operations annotations
  • repository structure for catalog storage

Out of Scope

OpsCatalog does not:

  • manage infrastructure resources
  • maintain runtime infrastructure state
  • replace monitoring systems
  • replace configuration management systems
  • enforce security policies
  • store credentials or secrets

These responsibilities remain with external systems.


5. Practical Implications

OpsCatalog provides several operational advantages.

Shared operations knowledge

Teams maintain a common understanding of infrastructure access paths.

Improved incident response

Operators can quickly locate operations entry points.

Automation enablement

AI agents and automation systems gain structured knowledge about infrastructure navigation.

Organizational resilience

Operations knowledge becomes versioned and reviewable rather than implicit.

However, maintaining the catalog requires:

  • operations discipline
  • periodic review
  • updates that keep pace with infrastructure changes

6. External Dependencies

OpsCatalog assumes integration with several external systems.

Examples include:

  • infrastructure-as-code platforms
  • operations access tools such as OpsBridge
  • identity systems such as privacyIDEA
  • version control systems such as Git

7. Success Criteria

OpsCatalog is successful if it enables operators and automation agents to:

  • locate relevant infrastructure targets quickly
  • identify operations access paths
  • understand operations context during incidents
  • maintain shared operations knowledge across teams

Part 2 — Functional Requirements Specification (FRS)

1. Domain Management

FR-1 Domain Definition

The system shall allow definition of operations domains.

FR-2 Domain Listing

The system shall allow retrieval of all defined domains.

FR-3 Domain Inspection

The system shall allow inspection of a specific domain and its associated elements.
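
One way to read FR-1 through FR-3 is as three operations on an in-memory registry. The sketch below is illustrative only: the class name `DomainRegistry`, its method names, and the `elements` key returned by `inspect` are hypothetical, and entries are assumed to be already-parsed dictionaries.

```python
class DomainRegistry:
    """In-memory registry of operations domains, keyed by domain id."""

    def __init__(self) -> None:
        self._domains: dict[str, dict] = {}
        self._members: dict[str, list[str]] = {}

    def define(self, domain: dict) -> None:
        """FR-1: register a domain; ids must be unique."""
        domain_id = domain["id"]
        if domain_id in self._domains:
            raise ValueError(f"domain already defined: {domain_id}")
        self._domains[domain_id] = dict(domain)
        self._members[domain_id] = []

    def attach(self, domain_id: str, element_id: str) -> None:
        """Record that a target or bridge belongs to a domain."""
        self._members[domain_id].append(element_id)

    def list_ids(self) -> list[str]:
        """FR-2: return all defined domain ids in sorted order."""
        return sorted(self._domains)

    def inspect(self, domain_id: str) -> dict:
        """FR-3: return a copy of the domain entry plus the ids of its elements."""
        entry = dict(self._domains[domain_id])
        entry["elements"] = list(self._members[domain_id])
        return entry
```

Returning copies from `inspect` keeps callers from mutating the registry by accident, which fits a catalog whose source of truth is the repository rather than the running process.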


2. Target Management

FR-4 Target Definition

The system shall allow definition of infrastructure targets within domains.

FR-5 Target Query

The system shall allow retrieval of targets belonging to a domain.

FR-6 Target Inspection

The system shall allow inspection of metadata associated with a target.


3. Bridge Management

FR-7 Bridge Definition

The system shall allow definition of operations bridges connecting infrastructure contexts.

FR-8 Bridge Query

The system shall allow retrieval of bridges associated with a target or domain.

FR-9 Bridge Inspection

The system shall allow inspection of bridge metadata.
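
FR-8 and FR-9 amount to a filter and a lookup over bridge entries. The following sketch assumes bridges are available as parsed dictionaries with the fields used in the Bridge Schema example; the function names and the second bridge entry (`api-server-coulombcore`) are hypothetical and exist only to make the filter visible.

```python
from typing import Optional

BRIDGES = [
    {"type": "bridge", "id": "state-hub-coulombcore", "domain": "coulombcore",
     "target": "state-hub", "access_method": "ssh-reverse"},
    # Hypothetical second entry, not taken from the specification.
    {"type": "bridge", "id": "api-server-coulombcore", "domain": "coulombcore",
     "target": "api-server", "access_method": "ssh-reverse"},
]


def bridges_for(bridges, target: Optional[str] = None, domain: Optional[str] = None):
    """FR-8: return bridges matching the given target and/or domain."""
    if target is None and domain is None:
        raise ValueError("give a target, a domain, or both")
    return [
        b for b in bridges
        if (target is None or b["target"] == target)
        and (domain is None or b["domain"] == domain)
    ]


def inspect_bridge(bridges, bridge_id: str) -> dict:
    """FR-9: return a copy of one bridge's metadata; KeyError if the id is unknown."""
    for b in bridges:
        if b["id"] == bridge_id:
            return dict(b)
    raise KeyError(bridge_id)
```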


4. Actor Classification

FR-10 Actor Class Definition

The system shall allow definition of actor classes.

FR-11 Actor Attribution

The system shall allow bridges to reference actor classes.
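
The specification does not say how a bridge references actor classes. As one possible reading, a bridge entry could carry a list of class names, and a check could flag names that no actor class defines. The field name `actor_classes` and the class name `human` are assumptions of this sketch; only `automation` appears in the Actor Schema example.

```python
# "automation" comes from the Actor Schema example; "human" is an assumed class name.
ACTOR_CLASSES = {"human", "automation"}


def unknown_actor_classes(bridge: dict, known: set[str]) -> list[str]:
    """Return actor classes referenced by the bridge that are not defined."""
    referenced = bridge.get("actor_classes", [])  # hypothetical field name
    return sorted(set(referenced) - known)
```

A bridge that lists no actor classes produces an empty result, so the check stays optional for catalogs that do not use attribution.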


5. Operational Annotations

FR-12 Annotation Definition

The system shall allow structured annotations to be attached to domains, targets, and bridges.

FR-13 Annotation Retrieval

The system shall allow retrieval of annotations associated with infrastructure elements.


6. Repository Interaction

FR-14 Catalog Retrieval

The system shall load catalog data from a repository structure.

FR-15 Catalog Validation

The system shall validate the structure of catalog definitions.
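
A structural check for FR-15 could be as small as a table of required fields per entry type. The sketch below works on already-parsed entries (YAML parsing is left out to keep it stdlib-only), and the choice of which fields count as required is an assumption based on the YAML examples in this specification, not a normative list.

```python
# Assumed required fields, inferred from the example entries in this document.
REQUIRED_FIELDS = {
    "domain": {"id", "name"},
    "target": {"id", "domain", "kind"},
    "bridge": {"id", "domain", "target", "access_method"},
    "actor": {"id", "class"},
}


def validate_entry(entry: dict) -> list[str]:
    """Return a list of problems; an empty list means the entry is structurally valid."""
    entry_type = entry.get("type")
    if entry_type not in REQUIRED_FIELDS:
        return [f"unknown or missing type: {entry_type!r}"]
    missing = sorted(REQUIRED_FIELDS[entry_type] - set(entry))
    return [f"{entry_type} is missing field: {name}" for name in missing]
```

Returning a list of messages rather than raising on the first problem lets a review workflow report every issue in a pull request at once.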


Schemas

The examples below are catalog entries expressed in YAML. Each one illustrates the fields of one entry type.


Domain Schema

type: domain
id: coulombcore
name: CoulombCore Infrastructure
description: Core infrastructure domain for operational services
environment: production

Target Schema

type: target
id: state-hub
domain: coulombcore
kind: service
description: Infrastructure state coordination service
reachable_via:
  - state-hub-coulombcore

Bridge Schema

type: bridge
id: state-hub-coulombcore
domain: coulombcore
target: state-hub
description: Operations bridge for state hub diagnostics
access_method: ssh-reverse

Actor Schema

type: actor
id: agent.claude-remediator
class: automation
description: Automated remediation agent
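
The example entries above refer to each other by id: the target lists its bridge under `reachable_via`, and the bridge names its `target` and `domain`. A cross-reference check along the following lines could catch entries that drift apart. It is a sketch under the assumption that all entries have been parsed into dictionaries; the function name and message wording are hypothetical.

```python
ENTRIES = [
    {"type": "domain", "id": "coulombcore", "name": "CoulombCore Infrastructure",
     "environment": "production"},
    {"type": "target", "id": "state-hub", "domain": "coulombcore", "kind": "service",
     "reachable_via": ["state-hub-coulombcore"]},
    {"type": "bridge", "id": "state-hub-coulombcore", "domain": "coulombcore",
     "target": "state-hub", "access_method": "ssh-reverse"},
    {"type": "actor", "id": "agent.claude-remediator", "class": "automation"},
]


def check_references(entries: list[dict]) -> list[str]:
    """Return messages for ids that point at missing or mismatched entries."""
    by_id = {(e["type"], e["id"]): e for e in entries}
    problems = []
    for e in entries:
        kind, ident = e["type"], e["id"]
        if kind in ("target", "bridge") and ("domain", e["domain"]) not in by_id:
            problems.append(f"{kind} {ident}: unknown domain {e['domain']}")
        if kind == "target":
            for bridge_id in e.get("reachable_via", []):
                bridge = by_id.get(("bridge", bridge_id))
                if bridge is None:
                    problems.append(f"target {ident}: unknown bridge {bridge_id}")
                elif bridge["target"] != ident:
                    problems.append(
                        f"target {ident}: bridge {bridge_id} points at {bridge['target']}"
                    )
        if kind == "bridge" and ("target", e["target"]) not in by_id:
            problems.append(f"bridge {ident}: unknown target {e['target']}")
    return problems
```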

Repository Structure

Recommended repository layout:

opscatalog/
  domains/
    coulombcore/
      domain.yaml

      targets/
        state-hub.yaml
        api-server.yaml

      bridges/
        state-hub-coulombcore.yaml

      docs/
        overview.md
        operations.md

  actors/
    human-operators.yaml
    automation-agents.yaml

  schemas/
    domain.schema.yaml
    target.schema.yaml
    bridge.schema.yaml

This layout supports both human readability and machine parsing.
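
To illustrate the machine-parsing side, the sketch below indexes a checkout of the recommended layout by path alone, without reading file contents. It assumes the directory names shown above; the function name `index_catalog` and the `domain/name` key format are hypothetical.

```python
from pathlib import Path


def index_catalog(root: Path) -> dict[str, list[str]]:
    """Group catalog YAML files by kind, based on their position in the layout."""
    index = {"domains": [], "targets": [], "bridges": [], "actors": []}
    for path in sorted(root.rglob("*.yaml")):
        parts = path.relative_to(root).parts
        if parts[0] == "actors":
            index["actors"].append(path.stem)
        elif parts[0] == "domains" and len(parts) == 3 and parts[2] == "domain.yaml":
            index["domains"].append(parts[1])
        elif parts[0] == "domains" and len(parts) == 4 and parts[2] in ("targets", "bridges"):
            # Key entries as "<domain>/<name>" so equal names in different domains stay distinct.
            index[parts[2]].append(f"{parts[1]}/{path.stem}")
        # Files under schemas/ and any other location are ignored here.
    return index
```

Because the kind of each entry is implied by its directory, a tool can build an overview of the catalog before parsing or validating any YAML.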


Appendices

Appendix A — Operations Notes

Operations notes provide context about real-world infrastructure behavior.

Examples include:

  • known debugging entry points
  • typical failure modes
  • operational shortcuts
  • historical incidents
  • recommended inspection procedures

Operations notes may be written in structured markdown files stored alongside catalog entries.


Appendix B — Catalog Maintenance Guidelines

Maintaining an effective OpsCatalog requires operational discipline.

Recommended practices include:

  • review changes through pull requests
  • annotate bridges with operational purpose
  • update catalog entries after major infrastructure changes
  • document common debugging procedures
  • avoid storing secrets in catalog files
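
The last guideline lends itself to an automated pre-review check. The sketch below flags YAML keys whose names suggest credentials. The key-name pattern is an assumed heuristic, not an exhaustive rule, and it inspects key names only, so it would not catch a secret stored under an innocuous key.

```python
import re

# Assumed heuristic: key names that commonly hold credentials. Not exhaustive.
SUSPICIOUS_KEY = re.compile(
    r"^\s*-?\s*([\w.-]*(password|passwd|secret|token|private_key)[\w.-]*)\s*:",
    re.IGNORECASE,
)


def find_suspicious_keys(text: str) -> list[tuple[int, str]]:
    """Return (line number, key) pairs for keys that look like credentials."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = SUSPICIOUS_KEY.match(line)
        if match:
            hits.append((number, match.group(1)))
    return hits
```

Matching only on the key, not the value, keeps ordinary descriptions that mention words like "token" from triggering the check.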

Appendix C — Relationship to OpsBridge

OpsCatalog serves as a knowledge source for OpsBridge.

OpsBridge may consume catalog data to:

  • resolve bridge identifiers
  • display infrastructure orientation
  • assist operators in establishing bridges
  • provide contextual operational information

The catalog does not control runtime behavior; it supplies the structured operations knowledge that OpsBridge acts on.
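
As an illustration of bridge resolution and orientation, a consumer could turn a bridge identifier into a one-line summary for the operator. This sketch does not describe OpsBridge's actual behavior; the function name `orient`, the output format, and the assumption that ids are unique across entry types are all hypothetical.

```python
# Entries keyed by id; assumes ids are unique across domains, targets, and bridges.
CATALOG = {
    "coulombcore": {"type": "domain", "id": "coulombcore",
                    "name": "CoulombCore Infrastructure", "environment": "production"},
    "state-hub": {"type": "target", "id": "state-hub", "domain": "coulombcore",
                  "kind": "service"},
    "state-hub-coulombcore": {"type": "bridge", "id": "state-hub-coulombcore",
                              "domain": "coulombcore", "target": "state-hub",
                              "access_method": "ssh-reverse"},
}


def orient(catalog: dict[str, dict], bridge_id: str) -> str:
    """Resolve a bridge id and describe where it leads; KeyError if any id is unknown."""
    bridge = catalog[bridge_id]
    target = catalog[bridge["target"]]
    domain = catalog[bridge["domain"]]
    return (
        f"{bridge['id']}: {bridge['access_method']} bridge to "
        f"{target['kind']} '{target['id']}' in {domain['name']} "
        f"({domain['environment']})"
    )
```

The function only reads catalog data and returns text, which matches the division of labor described above: establishing the bridge itself would remain the consumer's job.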
