tegwick 1ed7198ce3 Initial seeding of models, standards

2026-05-23 00:55:01 +02:00

InfoTechCanon Information Space Model

Short Name: ITC-INFOSPACE
Document Status: Seed Standard Release Candidate 1
Version: RC1-seed
Date: 2026-05-23
Repository Context: info-tech-canon
Document Type: InfoTechCanon Domain Standard
Intended Audience: Knowledge-system builders, markdown-infospace maintainers, standards authors, AI-agent tool builders, documentation architects, information architects, ontology/taxonomy maintainers, software architects, platform builders, and retrieval-system designers.

1. Purpose

The InfoTechCanon Information Space Model defines a canonical seed model for representing markdown-first, human-readable, machine-retrievable, provenance-aware, interconnected information spaces.

It exists to provide the structural and semantic foundation for using InfoTechCanon as an evolving reference body for humans, agents, tools, and services.

This standard owns the concepts required to make a body of knowledge:

navigable,
retrievable,
reusable,
linkable,
citable,
chunkable,
versionable,
mappable,
provenance-aware,
profile-aware,
and suitable for both human reading and agentic use.

It provides a canonical vocabulary for:

information spaces,
knowledge artifacts,
markdown documents,
concept pages,
standard documents,
chunks,
sections,
anchors,
identifiers,
links,
backlinks,
citations,
references,
indexes,
summaries,
agent briefs,
retrieval units,
embeddings,
metadata,
provenance,
mappings,
assimilation records,
views,
navigation structures,
and knowledge-quality signals.

2. Position in InfoTechCanon

The Information Space Model is a domain standard within InfoTechCanon.

It should serve as the structural substrate for how all other standards are stored, retrieved, navigated, linked, mapped, and reused.

InfoTechCanon
├── InfoTechCanonCore
├── InfoTechCanonInformationSpaceModel  <-- this standard
├── InfoTechCanonLandscapeModel
├── InfoTechCanonOrganizationModel
├── InfoTechCanonGovernanceModel
├── InfoTechCanonTaskModel
├── InfoTechCanonTaggingStandard
├── InfoTechCanonAccessControlModel
├── InfoTechCanonSecurityModel
├── InfoTechCanonDataModel
├── InfoTechCanonDevSecOpsModel
├── InfoTechCanonNetworkModel
├── InfoTechCanonObservabilityModel
├── InfoTechCanonPatternLanguage
└── Application Profiles

The dependency role is:

Domain standards define meaning.
Information Space defines how meaning is packaged, linked, indexed, retrieved, cited, and reused.

3. Boundary with Adjacent Standards

3.1 Boundary with Core

InfoTechCanonCore should own generic canon mechanisms:

Concept
Standard
Pattern
Profile
Mapping
Assimilation
Versioning
Conformance
Canonical Owner

The Information Space Model owns the storage, retrieval, navigation, chunking, and documentation structures used to operationalize those mechanisms.

3.2 Boundary with Tagging

The Tagging Standard owns tag identity, schemes, namespaces, assignments, and validation.

The Information Space Model uses tags for navigation and retrieval but does not define tag semantics.

3.3 Boundary with Data

The Data Model owns datasets, schemas, data products, lineage, data contracts, and data quality.

The Information Space Model owns knowledge artifacts, documents, chunks, indexes, and retrieval units.

A corpus of Markdown files may be treated as data by the Data Model, but the information-space semantics are owned here.

3.4 Boundary with Governance

Governance owns policies, controls, decisions, exceptions, evidence, and assurance.

The Information Space Model owns how governance documents, evidence references, citations, and versioned documentation artifacts are structured and retrieved.

3.5 Boundary with DevSecOps

DevSecOps owns source repositories, commits, pipelines, releases, deployments, SBOMs, and attestations.

The Information Space Model may be implemented in Git and linked to DevSecOps records, but it owns the knowledge-space structure.

3.6 Boundary with Observability

Observability owns telemetry, signals, metrics, logs, traces, alerts, and operational evidence.

The Information Space Model owns knowledge artifacts and retrieval structures, not runtime telemetry.

4. Research Basis and External Alignment

This seed standard draws on several knowledge organization and metadata traditions.

4.1 SKOS

SKOS defines a common data model for sharing and linking knowledge organization systems such as thesauri, taxonomies, classification schemes, and subject-heading systems. It provides useful concepts such as concept schemes, preferred labels, alternative labels, broader/narrower relations, related relations, and mapping relations.

4.2 FAIR Principles

The FAIR principles emphasize that digital assets should be Findable, Accessible, Interoperable, and Reusable. They are especially relevant because InfoTechCanon must be useful to both humans and machines.

4.3 PROV-O

PROV-O models provenance through entities, activities, and agents. This is central for tracking how knowledge artifacts, mappings, assimilations, and standards evolved.

4.4 Dublin Core and Application Profiles

Dublin Core and the Singapore Framework for application profiles distinguish reusable metadata vocabularies from application-specific profiles. This directly supports InfoTechCanon’s distinction between canonical concepts and concrete profiles.

4.5 Zettelkasten, Wikis, and Hypertext

Zettelkasten and wiki traditions emphasize durable notes, links, backlinks, local context, reuse, and emergent structure. They are useful for the human side of markdown-first information spaces.

4.6 Documentation Systems and Static Site Generators

Modern documentation systems emphasize stable headings, cross-links, front matter, sidebars, indexes, search, versioned docs, and generated navigation. These are practical implementation targets.

4.7 Retrieval-Augmented Generation

RAG systems require chunking, metadata, embeddings, summaries, stable identifiers, source references, and retrieval-quality evaluation. The Information Space Model should make retrieval a first-class design concern rather than an afterthought.

5. Seed Standard Design Stance

This standard is a seed standard, not a complete CMS, ontology, or retrieval-engine specification.

It shall:

define canonical information-space semantics,
remain markdown-first,
support human navigation and agent retrieval,
support stable identifiers and anchors,
support chunking and retrieval units,
support citations, references, provenance, and versioning,
support indexes, views, summaries, and agent briefs,
support mappings and assimilation records,
map to external standards without becoming subordinate to them,
support future integration with markdown infobase tools and services.

6. Scope

6.1 In Scope

This standard covers canonical representation of:

information spaces,
knowledge bases,
infospaces,
repositories as knowledge spaces,
markdown documents,
concept pages,
standard documents,
pattern documents,
profile documents,
mapping documents,
assimilation reports,
decision records,
sections,
headings,
anchors,
stable identifiers,
chunks,
retrieval units,
summaries,
agent briefs,
front matter,
metadata records,
links,
backlinks,
citations,
references,
external references,
indexes,
navigation views,
document collections,
shard references,
source references,
provenance records,
version records,
quality signals,
embedding records,
retrieval queries,
retrieval results,
and reuse contexts.

6.2 Out of Scope

This standard does not fully define:

all semantic-web ontology modeling,
all RDF/OWL reasoning,
full CMS implementation,
full search-engine ranking,
all embedding algorithms,
full vector-database implementation,
all document authoring standards,
all bibliography formats,
all legal citation systems,
all software repository structures,
or every markdown dialect.

Those may be mapped, assimilated, profiled, or handled by adjacent standards.

7. Normative Language

The following terms are used normatively:

SHALL indicates a mandatory rule for conformance.
SHOULD indicates a recommended practice.
MAY indicates an optional capability.
MUST NOT indicates a prohibited practice.
SEED marks a concept defined provisionally here but open to later refinement.
EXTRACT marks a concept that may later move to a more specialized standard.

8. Core Principles

8.1 Markdown-First, Not Markdown-Only

The canonical working format SHOULD be Markdown with structured metadata, but the model SHOULD allow export to JSON, YAML, RDF, JSON-LD, graph databases, search indexes, and vector stores.

8.2 Human-Readable and Machine-Retrievable

Every important artifact SHOULD be readable by humans and retrievable by machines.

8.3 Stable Identity Is Mandatory for Reuse

Artifacts, sections, concepts, mappings, profiles, and retrieval units SHOULD have stable identifiers.

8.4 Chunking Is a Design Concern

Documents SHOULD be structured so that retrieval chunks preserve meaning, context, and source traceability.

8.5 Links Are First-Class

Links, backlinks, references, citations, and mappings SHOULD be explicit and queryable where possible.

8.6 Provenance Is First-Class

Knowledge artifacts SHOULD preserve source, author, generator, change activity, review state, and influence where meaningful.

8.7 Views Are Not the Model

Indexes, navigation pages, diagrams, generated views, and dashboards are views over the information space, not the underlying knowledge itself.

8.8 Retrieval Must Be Evaluated

Useful information spaces SHOULD support retrieval-quality checks, stale-content detection, broken-link detection, and duplicate/conflict detection.

8.9 External Standards Are Mapped, Not Obeyed

The Information Space Model MAY map to SKOS, FAIR, PROV-O, Dublin Core, DCAT, Markdown, DITA, RDF, JSON-LD, static-site generators, and RAG tooling patterns.

It MUST NOT subordinate its internal semantics to any single external model.

9. Canonical Seed Metadata

Every information-space artifact SHOULD support structured metadata.

Recommended front matter:

---
id: itc-infospace:KnowledgeArtifact
type: concept
standard: InfoTechCanonInformationSpaceModel
standard_version: RC1-seed
status: candidate
canonical_owner: InfoTechCanonInformationSpaceModel
preferred_label: Knowledge Artifact
related:
  - itc-infospace:InformationSpace
  - itc-infospace:Document
  - itc-infospace:RetrievalUnit
  - itc-infospace:ProvenanceRecord
mappings:
  - itc-map:knowledge-artifact-to-prov-entity
---

Recommended artifact statuses:

idea
draft
candidate
release-candidate
adopted
stable
deprecated
retired
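
As a minimal sketch of how a tool might read the recommended front matter, the following Python splits a document at its `---` delimiters and checks the `status` value against the statuses listed above. The function names, the set of required keys, and the restricted `key: value` / `- item` subset are assumptions made for illustration; a real tool would more likely rely on a full YAML parser.

```python
ARTIFACT_STATUSES = {
    "idea", "draft", "candidate", "release-candidate",
    "adopted", "stable", "deprecated", "retired",
}

def split_front_matter(text):
    """Return (front_matter_lines, body) for a document that starts with '---'."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return [], text
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return [], text  # no closing delimiter: treat as having no front matter

def parse_flat_front_matter(fm_lines):
    """Parse 'key: value' pairs and '- item' lists (a small YAML subset only)."""
    data, current_key = {}, None
    for line in fm_lines:
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_key is not None:
                data[current_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")  # split on the first colon only
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if value:
            data[key] = value
            current_key = None
        else:
            data[key] = []  # an empty value opens a list
            current_key = key
    return data

def check_front_matter(data, required=("id", "type", "status")):
    """Return a list of problems; the required keys are a hypothetical choice."""
    problems = [f"missing key: {key}" for key in required if key not in data]
    status = data.get("status")
    if status is not None and status not in ARTIFACT_STATUSES:
        problems.append(f"unknown status: {status}")
    return problems

doc = """---
id: itc-infospace:KnowledgeArtifact
type: concept
status: candidate
related:
  - itc-infospace:InformationSpace
  - itc-infospace:Document
---
A KnowledgeArtifact is any identifiable artifact that carries reusable knowledge.
"""
fm_lines, body = split_front_matter(doc)
meta = parse_flat_front_matter(fm_lines)
problems = check_front_matter(meta)
```

Splitting on the first colon only keeps namespaced values such as `itc-infospace:KnowledgeArtifact` intact.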

10. Root Information Space Taxonomy

InformationSpaceEntity
├── SpaceEntity
│   ├── InformationSpace
│   ├── KnowledgeBase
│   ├── Infospace
│   ├── RepositorySpace
│   ├── Shard
│   ├── Collection
│   └── Corpus
├── ArtifactEntity
│   ├── KnowledgeArtifact
│   ├── Document
│   ├── MarkdownDocument
│   ├── ConceptPage
│   ├── StandardDocument
│   ├── PatternDocument
│   ├── ProfileDocument
│   ├── MappingDocument
│   ├── AssimilationReport
│   ├── DecisionRecord
│   └── AgentBrief
├── StructureEntity
│   ├── Section
│   ├── Heading
│   ├── Anchor
│   ├── Block
│   ├── Chunk
│   ├── RetrievalUnit
│   ├── Summary
│   └── Excerpt
├── LinkEntity
│   ├── Link
│   ├── Backlink
│   ├── CrossReference
│   ├── Citation
│   ├── SourceReference
│   ├── ExternalReference
│   ├── MappingReference
│   └── DependencyReference
├── MetadataEntity
│   ├── FrontMatter
│   ├── MetadataRecord
│   ├── Identifier
│   ├── Namespace
│   ├── Label
│   ├── Alias
│   ├── Status
│   └── VersionRecord
├── RetrievalEntity
│   ├── Index
│   ├── SearchIndex
│   ├── VectorIndex
│   ├── EmbeddingRecord
│   ├── RetrievalQuery
│   ├── RetrievalResult
│   ├── RetrievalContext
│   └── RetrievalEvaluation
├── ProvenanceEntity
│   ├── ProvenanceRecord
│   ├── Source
│   ├── Activity
│   ├── AgentReference
│   ├── Generation
│   ├── Revision
│   ├── Influence
│   └── ReviewRecord
├── ViewEntity
│   ├── NavigationView
│   ├── TopicIndex
│   ├── ConceptIndex
│   ├── RelationshipIndex
│   ├── MapView
│   ├── GraphView
│   └── UsePath
└── QualityEntity
    ├── BrokenLink
    ├── DuplicateContent
    ├── StaleContent
    ├── ConflictingDefinition
    ├── MissingMetadata
    ├── LowRetrievalQuality
    └── OrphanArtifact

11. Core Concepts

11.1 InformationSpace

An InformationSpace is a bounded, navigable, retrievable, and evolving body of knowledge.

Examples:

InfoTechCanon repository
project wiki
standards library
markdown knowledge base
research corpus
documentation space
agent-readable context repository

11.2 KnowledgeBase

A KnowledgeBase is an information space organized to preserve, retrieve, and reuse knowledge.

11.3 Infospace

An Infospace is a structured knowledge environment optimized for navigation, recombination, retrieval, and reuse.

In InfoTechCanon, an infospace is expected to be markdown-first and machine-indexable.

11.4 RepositorySpace

A RepositorySpace is an information space backed by a source repository.

11.5 Shard

A Shard is an independently maintained portion of an information space that can be attached, federated, cached, or overlaid with other shards.

This concept supports shard-wiki-like federation.

11.6 Collection

A Collection is a curated group of knowledge artifacts.

11.7 Corpus

A Corpus is a body of documents or artifacts used for search, retrieval, analysis, or training-like reference.

11.8 KnowledgeArtifact

A KnowledgeArtifact is any identifiable artifact that carries reusable knowledge.

Examples:

standard document
concept page
pattern document
profile
mapping file
assimilation report
decision record
agent brief
example
schema
diagram

11.9 Document

A Document is a knowledge artifact primarily represented as ordered textual or structured content.

11.10 MarkdownDocument

A MarkdownDocument is a document represented in Markdown or a Markdown-compatible dialect.

11.11 ConceptPage

A ConceptPage is a document or section that defines and explains one canonical concept.

11.12 StandardDocument

A StandardDocument is a document defining a standard, its scope, concepts, relationships, validation rules, mappings, and profiles.

11.13 PatternDocument

A PatternDocument is a document describing a recurring problem, forces, solution, resulting context, variants, and related patterns.

11.14 ProfileDocument

A ProfileDocument is a document defining constraints and implementation guidance for a specific context.

11.15 MappingDocument

A MappingDocument defines mappings between InfoTechCanon concepts and external standards, regulations, tools, vocabularies, or product schemas.

11.16 AssimilationReport

An AssimilationReport documents the analysis of an external body of knowledge, including extracted concepts, gaps, conflicts, mappings, and proposed canon changes.

11.17 DecisionRecord

A DecisionRecord records a decision, context, options, rationale, consequences, and review trigger.

11.18 AgentBrief

An AgentBrief is a compact, retrieval-optimized document summarizing a standard, profile, pattern, or subsystem for AI-agent use.

11.19 Section

A Section is a named portion of a document.

11.20 Heading

A Heading is a section title used for human navigation and machine chunking.

11.21 Anchor

An Anchor is a stable target within an artifact.

Anchors SHOULD remain stable across non-breaking edits.

11.22 Block

A Block is a structurally meaningful piece of content such as a paragraph, list, table, code block, callout, or diagram block.

11.23 Chunk

A Chunk is a segment of content prepared for retrieval, indexing, embedding, citation, or context assembly.

Canonical rule:

A Chunk SHOULD preserve enough context to be meaningful when retrieved independently.

11.24 RetrievalUnit

A RetrievalUnit is a retrievable unit of knowledge.

A retrieval unit may be:

document
section
chunk
concept page
pattern
profile
mapping
example
agent brief

11.25 Summary

A Summary is a compressed representation of an artifact or retrieval unit.

11.26 Excerpt

An Excerpt is a quoted or extracted part of a source.

Excerpts SHOULD preserve source reference and usage constraints.

11.27 Link

A Link is a directed reference from one artifact or section to another.

11.28 Backlink

A Backlink is an inverse view of a link.
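
Because a backlink is only the inverse of a link, it can be derived rather than stored. The sketch below assumes links are available as `(source, target)` pairs; the file paths are hypothetical.

```python
from collections import defaultdict

def build_backlinks(links):
    """Invert (source, target) link pairs into a target -> sorted sources map."""
    backlinks = defaultdict(set)
    for source, target in links:
        backlinks[target].add(source)
    return {target: sorted(sources) for target, sources in backlinks.items()}

# Hypothetical links between artifacts in an information space.
links = [
    ("standards/infospace.md", "concepts/chunk.md"),
    ("patterns/chunk-with-parent-context.md", "concepts/chunk.md"),
    ("concepts/chunk.md", "concepts/retrieval-unit.md"),
]
backlinks = build_backlinks(links)
```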

11.29 CrossReference

A CrossReference is a link between related artifacts, concepts, patterns, profiles, or sections.

11.30 Citation

A Citation is a reference to a source used to support a claim, definition, mapping, or statement.

11.31 SourceReference

A SourceReference identifies the source from which information was derived, quoted, summarized, mapped, or assimilated.

11.32 ExternalReference

An ExternalReference points to a source outside the information space.

11.33 MappingReference

A MappingReference points from an artifact to a mapping record or external concept mapping.

11.34 DependencyReference

A DependencyReference indicates that one artifact depends on another for meaning, validity, or interpretation.

11.35 FrontMatter

FrontMatter is structured metadata embedded at the beginning of a Markdown document.

11.36 MetadataRecord

A MetadataRecord is structured data describing an artifact, section, chunk, index entry, or retrieval unit.

11.37 Identifier

An Identifier is a stable reference string for an artifact or entity.

Recommended properties:

stable
unique within namespace
human-readable when practical
machine-parseable
version-aware where needed

11.38 Namespace

A Namespace is a naming scope used to prevent identifier collisions.

Examples:

itc-core
itc-land
itc-org
itc-gov
itc-task
itc-tag
itc-access
itc-sec
itc-data
itc-devsecops
itc-net
itc-obs
itc-infospace
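
The identifier properties and namespaces above could be checked mechanically. The following sketch assumes identifiers take the form `<namespace>:<local-name>`, as in `itc-infospace:KnowledgeArtifact`; the regular expression and the helper names are illustrative assumptions, not part of this standard.

```python
import re

# Namespaces listed as examples above; other documents may define more.
KNOWN_NAMESPACES = {
    "itc-core", "itc-land", "itc-org", "itc-gov", "itc-task", "itc-tag",
    "itc-access", "itc-sec", "itc-data", "itc-devsecops", "itc-net",
    "itc-obs", "itc-infospace",
}

# Assumed identifier shape: lowercase namespace, one colon, then a local name.
IDENTIFIER_RE = re.compile(
    r"^(?P<namespace>[a-z][a-z0-9-]*):(?P<local>[A-Za-z][A-Za-z0-9_-]*)$"
)

def parse_identifier(identifier):
    """Split an identifier into (namespace, local_name) or raise ValueError."""
    match = IDENTIFIER_RE.match(identifier)
    if match is None:
        raise ValueError(f"not a namespaced identifier: {identifier!r}")
    return match.group("namespace"), match.group("local")

def find_collisions(identifiers):
    """Return identifiers that occur more than once, sorted."""
    seen, repeated = set(), set()
    for identifier in identifiers:
        parse_identifier(identifier)  # reject malformed input early
        if identifier in seen:
            repeated.add(identifier)
        seen.add(identifier)
    return sorted(repeated)

namespace, local = parse_identifier("itc-infospace:KnowledgeArtifact")
in_known_namespace = namespace in KNOWN_NAMESPACES
```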

11.39 Label

A Label is a human-readable name for an artifact or concept.

11.40 Alias

An Alias is an alternative label or name.

11.41 Status

A Status indicates lifecycle or review state.

11.42 VersionRecord

A VersionRecord records artifact version, change, compatibility, and supersession information.

11.43 Index

An Index is a structured access path into an information space.

Examples:

concept index
standard index
pattern index
mapping index
profile index
source index
status index
external standard index

11.44 SearchIndex

A SearchIndex supports lexical or semantic search.

11.45 VectorIndex

A VectorIndex supports embedding-based retrieval.

11.46 EmbeddingRecord

An EmbeddingRecord stores or references an embedding for a retrieval unit.

Recommended attributes:

retrieval_unit:
embedding_model:
embedding_version:
created_at:
source_hash:
chunking_strategy:

11.47 RetrievalQuery

A RetrievalQuery is a query used to find relevant artifacts or retrieval units.

11.48 RetrievalResult

A RetrievalResult is the result of a retrieval query.

It SHOULD preserve rank, source, retrieval unit, score, and snippet/context where possible.

11.49 RetrievalContext

A RetrievalContext is the assembled set of retrieval results, summaries, and metadata used for human or agent work.

11.50 RetrievalEvaluation

A RetrievalEvaluation assesses retrieval quality.

Examples:

relevance
coverage
freshness
precision
recall
source diversity
citation correctness
staleness
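
Two of the measures above, precision and recall, have standard set-based definitions for a single query. The sketch below assumes retrieval units are identified by plain strings and that a hand-labelled set of relevant units is available; the identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Compute set-based precision and recall for one query."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall

# Hypothetical retrieval unit ids for one query.
retrieved = ["chunk-a", "chunk-b", "chunk-c", "chunk-d"]
relevant = ["chunk-a", "chunk-c", "chunk-e"]
precision, recall = precision_recall(retrieved, relevant)
# Two of four retrieved units are relevant; two of three relevant units were found.
```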

11.51 ProvenanceRecord

A ProvenanceRecord documents how an artifact, concept, mapping, or retrieval unit came to exist or change.

11.52 Source

A Source is an origin of information.

Examples:

external standard
uploaded document
web page
internal decision
agent-generated draft
manual authoring
assimilation report

11.53 Activity

An Activity is an action that generated, modified, reviewed, mapped, assimilated, or published an artifact.

11.54 AgentReference

An AgentReference points to a human, software agent, organization, or tool responsible for or involved in an activity.

11.55 Generation

A Generation records creation of an artifact or retrieval unit.

11.56 Revision

A Revision records modification of an artifact.

11.57 Influence

An Influence records that one source, artifact, activity, or agent influenced another.

11.58 ReviewRecord

A ReviewRecord records review activity, reviewer, outcome, and comments.

11.59 NavigationView

A NavigationView is a human-oriented view for browsing an information space.

11.60 TopicIndex

A TopicIndex organizes artifacts by topic.

11.61 ConceptIndex

A ConceptIndex organizes concept pages or concept definitions.

11.62 RelationshipIndex

A RelationshipIndex organizes relationships among artifacts or concepts.

11.63 MapView

A MapView visualizes or lists relationships, dependencies, domains, or concept mappings.

11.64 GraphView

A GraphView represents the information space as nodes and edges.

11.65 UsePath

A UsePath is a guided path through the information space for a common user intent.

Examples:

I want to model a new subsystem.
I want to map an external standard.
I want to create a task-tag profile.
I want to onboard an agent.
I want to check conformance.

11.66 BrokenLink

A BrokenLink is a link whose target cannot be resolved.

11.67 DuplicateContent

DuplicateContent is overlapping or repeated content that may create drift.

11.68 StaleContent

StaleContent is content whose age, supersession state, or source drift reduces trust.

11.69 ConflictingDefinition

A ConflictingDefinition is a contradiction between definitions or concept usage.

11.70 MissingMetadata

MissingMetadata is required or expected metadata that is absent.

11.71 LowRetrievalQuality

LowRetrievalQuality indicates poor retrieval performance for a query, use path, or artifact set.

11.72 OrphanArtifact

An OrphanArtifact is an artifact with no incoming links, no index membership, no owner, or no retrieval path.
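
Several of these quality signals can be computed from a link list alone. The sketch below detects broken links and a narrow form of orphan: an artifact with neither an incoming link nor an index entry. Owner and retrieval-path checks are omitted, and the artifact paths are hypothetical.

```python
def find_broken_links(artifact_ids, links):
    """Return (source, target) pairs whose target is not a known artifact."""
    known = set(artifact_ids)
    return [(source, target) for source, target in links if target not in known]

def find_orphans(artifact_ids, links, indexed_ids=()):
    """Return artifacts with no incoming link and no index membership."""
    linked = {target for source, target in links if source != target}
    reachable = linked | set(indexed_ids)
    return sorted(a for a in artifact_ids if a not in reachable)

artifacts = ["README.md", "concepts/chunk.md", "concepts/anchor.md", "notes/scratch.md"]
links = [
    ("README.md", "concepts/chunk.md"),
    ("concepts/chunk.md", "concepts/missing.md"),  # target does not exist
]
indexed = ["README.md", "concepts/anchor.md"]

broken = find_broken_links(artifacts, links)
orphans = find_orphans(artifacts, links, indexed)
```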

12. Core Relationship Vocabulary

Recommended root relationship types:

contains
part_of
defines
describes
summarizes
references
cites
links_to
backlinks_to
maps_to
depends_on
derived_from
generated_by
revised_by
reviewed_by
influenced_by
supersedes
deprecated_by
chunks_into
indexed_by
retrieved_by
embedded_as
has_view
belongs_to_space
belongs_to_collection
evidenced_by

Relationship records SHOULD support:

id:
relationship_type:
source_entity:
target_entity:
scope:
valid_from:
valid_to:
source_system:
confidence:
evidence:
rationale:
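
A relationship record of this shape could be validated against the vocabulary above at construction time. The sketch below covers a subset of the fields (scope, source_system, evidence, and rationale are left out for brevity) and treats `valid_to` as inclusive, which is an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

RELATIONSHIP_TYPES = {
    "contains", "part_of", "defines", "describes", "summarizes", "references",
    "cites", "links_to", "backlinks_to", "maps_to", "depends_on", "derived_from",
    "generated_by", "revised_by", "reviewed_by", "influenced_by", "supersedes",
    "deprecated_by", "chunks_into", "indexed_by", "retrieved_by", "embedded_as",
    "has_view", "belongs_to_space", "belongs_to_collection", "evidenced_by",
}

@dataclass
class RelationshipRecord:
    id: str
    relationship_type: str
    source_entity: str
    target_entity: str
    valid_from: Optional[date] = None
    valid_to: Optional[date] = None
    confidence: Optional[str] = None

    def __post_init__(self):
        if self.relationship_type not in RELATIONSHIP_TYPES:
            raise ValueError(f"unknown relationship type: {self.relationship_type}")
        if (self.valid_from is not None and self.valid_to is not None
                and self.valid_to < self.valid_from):
            raise ValueError("valid_to precedes valid_from")

    def is_valid_on(self, day):
        """True when `day` falls inside the validity window (open ends allowed)."""
        after_start = self.valid_from is None or self.valid_from <= day
        before_end = self.valid_to is None or day <= self.valid_to
        return after_start and before_end

rel = RelationshipRecord(
    id="rel-001",  # hypothetical id
    relationship_type="chunks_into",
    source_entity="itc-infospace:MarkdownDocument",
    target_entity="itc-infospace:RetrievalUnit",
    valid_from=date(2026, 5, 23),
)
```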

13. Information Space State Models

13.1 Artifact States

raw
captured
draft
reviewed
candidate
canonical
deprecated
superseded
archived
deleted

13.2 Retrieval Unit States

active
stale
invalidated
deprecated
superseded
excluded
needs_rechunking

13.3 Link States

active
broken
redirected
deprecated
ambiguous
external_unverified

13.4 Review States

not_reviewed
under_review
reviewed
changes_requested
approved
rejected
needs_revalidation

13.5 Index States

fresh
stale
partial
rebuilding
failed
deprecated

14. Information Space Patterns

14.1 Pattern: Markdown with Structured Front Matter

Context: Humans need readable documents, while tools need structured metadata.

Problem: Pure prose is hard to index and validate; pure data is hard to author.

Solution: Use Markdown for content and front matter for structured metadata.

14.2 Pattern: Concept Page per Canonical Concept

Context: Concepts need stable definitions.

Problem: Definitions drift when scattered across documents.

Solution: Create one canonical concept page per important concept and link other documents to it.

14.3 Pattern: Chunk with Parent Context

Context: Agents retrieve chunks of documents.

Problem: Retrieved chunks lose meaning if separated from document context.

Solution: Each chunk should preserve parent artifact, section path, heading, concept identifiers, source, and version.
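
This pattern can be sketched as a heading-based chunker that attaches the parent artifact, version, and section path to every chunk. The code assumes ATX-style Markdown headings (`#` to `######`) and does not skip headings inside fenced code blocks; concept identifiers and source references are omitted, and the names `Chunk` and `chunk_by_heading` are illustrative.

```python
import re
from dataclasses import dataclass

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")

@dataclass
class Chunk:
    artifact_id: str
    version: str
    section_path: tuple
    text: str

def chunk_by_heading(markdown, artifact_id, version):
    """Split Markdown at headings, keeping the full heading path per chunk."""
    chunks, path, buffer = [], [], []

    def flush():
        text = "\n".join(buffer).strip()
        if text:
            chunks.append(Chunk(artifact_id, version, tuple(path), text))
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match:
            flush()  # close the previous section under its own path
            level = len(match.group(1))
            del path[level - 1:]  # drop headings at the same or deeper level
            path.append(match.group(2))
        else:
            buffer.append(line)
    flush()
    return chunks

doc = """# Information Space Model

Intro text.

## Core Concepts

### Chunk

A Chunk is a segment of content prepared for retrieval.

### Anchor

An Anchor is a stable target within an artifact.
"""
chunks = chunk_by_heading(doc, "itc-infospace:StandardDocument", "RC1-seed")
```

Each resulting chunk can be read on its own because its `section_path` records where it sat in the parent document.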

14.4 Pattern: Agent Brief

Context: Agents need compact guidance.

Problem: Full standards are too large for routine retrieval.

Solution: Provide agent briefs summarizing scope, owned concepts, imports, do/do-not rules, patterns, and examples.

14.5 Pattern: Use Path

Context: Humans and agents approach the canon with tasks, not just topics.

Problem: A concept index alone does not explain where to start.

Solution: Provide UsePath documents that guide common activities through relevant standards, patterns, profiles, and examples.

14.6 Pattern: Source-Carrying Summary

Context: Summaries are useful but may detach from evidence.

Problem: Unsourced summaries become untrustworthy.

Solution: Summaries SHOULD retain source references, provenance, generation activity, and review state.

14.7 Pattern: Mapping as Linkable Artifact

Context: External standards and internal concepts must stay aligned.

Problem: Mapping notes buried in prose cannot be maintained.

Solution: Represent mappings as first-class artifacts with source concept, target concept, mapping type, scope, confidence, rationale, and version.

14.8 Pattern: Assimilation Folder

Context: New external knowledge bodies must be digested.

Problem: Research notes disappear after standards are updated.

Solution: Each assimilation should produce a folder with source summary, extracted concepts, comparison matrix, mappings, proposed changes, and open questions.

14.9 Pattern: View Not Source

Context: Generated indexes and diagrams are useful.

Problem: Teams edit generated views as if they were canonical source.

Solution: Mark generated views clearly and regenerate them from canonical artifacts.

14.10 Pattern: Retrieval Quality Loop

Context: Agents depend on retrieval.

Problem: Retrieval failures cause hallucination, contradiction, or stale answers.

Solution: Track retrieval queries, expected results, misses, stale hits, duplicate hits, and quality fixes.
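
One iteration of such a loop might compare a query's expected retrieval units with what was actually returned. The sketch below assumes results arrive as `(unit_id, state)` pairs in rank order, with states drawn from the retrieval unit states in section 13.2; the function name and ids are hypothetical.

```python
def evaluate_case(expected_ids, results):
    """Classify one query's results into misses, stale hits, and duplicate hits."""
    seen, duplicates = set(), []
    for unit_id, _state in results:
        if unit_id in seen and unit_id not in duplicates:
            duplicates.append(unit_id)
        seen.add(unit_id)
    return {
        "misses": sorted(set(expected_ids) - seen),
        "stale_hits": sorted({uid for uid, state in results if state == "stale"}),
        "duplicate_hits": duplicates,
    }

expected = ["chunk-a", "chunk-b"]
results = [("chunk-a", "active"), ("chunk-c", "stale"), ("chunk-a", "active")]
report = evaluate_case(expected, results)
```

Each non-empty entry in the report points at a different fix: misses suggest indexing or chunking gaps, stale hits suggest re-embedding, and duplicate hits suggest deduplication.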

15. Information Space Profiles

15.1 Profile Format

An Information Space Profile SHALL declare:

id:
profile_name:
status:
implements:
  - InfoTechCanonInformationSpaceModel
target_context:
included_concepts:
required_metadata:
required_indexes:
chunking_rules:
source_of_truth_rules:
mapping_files:
validation_rules:
examples:
known_deviations:
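
A validator for this declaration could be as simple as the following sketch, which assumes a profile has already been loaded into a Python dictionary (for example from YAML). The function name and the extra check on `implements` are assumptions.

```python
REQUIRED_PROFILE_KEYS = (
    "id", "profile_name", "status", "implements", "target_context",
    "included_concepts", "required_metadata", "required_indexes",
    "chunking_rules", "source_of_truth_rules", "mapping_files",
    "validation_rules", "examples", "known_deviations",
)
STANDARD = "InfoTechCanonInformationSpaceModel"

def check_profile(profile):
    """Return a list of problems found in a profile declaration."""
    problems = [f"missing key: {key}" for key in REQUIRED_PROFILE_KEYS if key not in profile]
    if STANDARD not in (profile.get("implements") or []):
        problems.append(f"implements does not list {STANDARD}")
    return problems

# A hypothetical profile with every key present but mostly empty.
profile = {key: [] for key in REQUIRED_PROFILE_KEYS}
profile["id"] = "itc-infospace:ExampleProfile"
profile["implements"] = [STANDARD]
problems = check_profile(profile)
```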

15.2 Seed Profile: InfoTechCanon Repository Profile

Purpose:

Define the expected structure for the info-tech-canon repository.

Required top-level files:

README.md
INTENT.md
SCOPE.md
canon.yaml

Recommended directories:

standards/
patterns/
profiles/
mappings/
assimilation/
schemas/
views/
agent/
examples/
validation/

Required indexes:

by-standard
by-concept
by-pattern
by-profile
by-mapping-target
by-status
use-paths
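
The file and directory expectations of this profile lend themselves to an automated check. The sketch below inspects a repository root on disk; index checks are left out because this profile does not say where indexes are stored. The demonstration builds a throwaway directory rather than assuming any real checkout.

```python
import tempfile
from pathlib import Path

REQUIRED_FILES = ("README.md", "INTENT.md", "SCOPE.md", "canon.yaml")
RECOMMENDED_DIRS = (
    "standards", "patterns", "profiles", "mappings", "assimilation",
    "schemas", "views", "agent", "examples", "validation",
)

def check_repository_layout(root):
    """Report required files and recommended directories missing under root."""
    root = Path(root)
    return {
        "missing_files": [name for name in REQUIRED_FILES if not (root / name).is_file()],
        "missing_dirs": [name for name in RECOMMENDED_DIRS if not (root / name).is_dir()],
    }

# Demonstration against a temporary, mostly empty repository.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "README.md").write_text("# info-tech-canon\n", encoding="utf-8")
    (Path(tmp) / "standards").mkdir()
    report = check_repository_layout(tmp)
```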

15.3 Seed Profile: Markdown Infospace Profile

Purpose:

Define a general profile for markdown-first knowledge spaces.

Required concepts:

MarkdownDocument
FrontMatter
Section
Anchor
Link
Backlink
Index
RetrievalUnit
ProvenanceRecord

Recommended front matter:

id:
title:
type:
status:
owner:
created_at:
updated_at:
tags:
related:
sources:

15.4 Seed Profile: Agent-Retrievable Standards Profile

Purpose:

Make standards retrievable and usable by AI agents.

Required artifacts:

standard.md
agent-brief.md
concept index
relationship index
profile index
mapping index
examples
validation rules

Chunking rules:

chunk by major section
preserve heading path
preserve artifact id
preserve concept ids
include summary chunks
exclude generated noise

15.5 Seed Profile: Assimilation Workspace Profile

Purpose:

Define how external bodies of knowledge are analyzed and assimilated.

Required files:

ASSIMILATION.md
source-summary.md
extracted-concepts.yaml
comparison-matrix.md
mappings.yaml
proposed-changes.md
open-questions.md

15.6 Seed Profile: Sharded Wiki Profile

Purpose:

Support federated markdown knowledge spaces where multiple shards attach around shared root entities.

Included concepts:

Shard
ShardRoot
Overlay
RemoteReference
PatchProposal
MergeRequestReference
CachedArtifact
ShardBoundary

Known deviations:

Shard synchronization and merge mechanics are implementation-specific.

15.7 Seed Profile: RAG Corpus Profile

Purpose:

Prepare an information space for retrieval-augmented generation.

Included concepts:

Corpus
RetrievalUnit
Chunk
EmbeddingRecord
SearchIndex
VectorIndex
RetrievalQuery
RetrievalResult
RetrievalEvaluation

Required metadata:

source id
artifact id
section path
chunk id
version
source hash
embedding model
created_at

16. Mapping Model for the Information Space Standard

Mappings relate InfoTechCanon information-space concepts to external standards, frameworks, and tools.

16.1 Mapping Types

Recommended mapping types:

exactMatch
closeMatch
broadMatch
narrowMatch
relatedMatch
conflictMatch
gapMatch
derivedFrom
regulatoryReference
toolEquivalent

16.2 Mapping Record

Example:

id: itc-map:concept-page-to-skos-concept
source_concept: itc-infospace:ConceptPage
target_body: SKOS
target_version: "2009"
target_concept: skos:Concept
mapping_type: relatedMatch
scope:
  - knowledge organization and concept documentation
not_valid_for:
  - all SKOS semantic constraints
rationale: >
  A ConceptPage documents an InfoTechCanon concept, while skos:Concept
  represents a conceptual resource in a concept scheme. They are related but not identical:
  the page is a documentation artifact, the concept is the meaning being documented.
confidence: medium
status: candidate
owner: InfoTechCanonInformationSpaceModel

16.3 Seed Mapping Targets

The Information Space Model SHOULD maintain mappings to:

SKOS
FAIR principles
PROV-O
Dublin Core
Singapore Framework for Dublin Core Application Profiles
DCAT
RDF / JSON-LD
Markdown / CommonMark
YAML front matter conventions
Git repository concepts
static site generator concepts
Obsidian / wiki-link conventions
Zettelkasten note patterns
DITA topic concepts
schema.org CreativeWork / Dataset
RAG / vector index tool schemas

17. Assimilation Hooks

The Information Space Model SHALL be able to receive new knowledge-organization, metadata, documentation, retrieval, and wiki systems through the InfoTechCanon assimilation process.

17.1 Assimilation Triggers

Assimilation may be triggered by:

new metadata standard
new knowledge organization model
new wiki engine
new markdown convention
new documentation generator
new RAG architecture
new retrieval evaluation method
new citation model
new provenance standard
new agent context-management pattern

17.2 Information Space Assimilation Output

An information-space assimilation SHOULD produce:

source summary
extracted information-space concepts
concept comparison matrix
gap list
conflict list
mapping file
candidate new concepts
candidate relationship changes
candidate pattern changes
candidate profile changes
open questions

17.3 Recommended First Assimilation Candidates

SKOS
FAIR principles
PROV-O
Dublin Core / Singapore Framework
CommonMark / Markdown conventions
Obsidian / wiki-link practice
Zettelkasten note practice
DITA topic architecture
RAG corpus and chunking patterns
static site generator metadata conventions

18. Integration with Other InfoTechCanon Standards

18.1 Core

Information Space uses Core concepts for:

Concept
Standard
Pattern
Profile
Mapping
Assimilation
Version
Conformance
CanonicalOwner

18.2 Tagging

Information Space uses tags for:

topic
status
artifact type
domain
mapping target
retrieval group

18.3 Data

Data treats corpora, indexes, embeddings, and retrieval results as data assets where needed.

18.4 Governance

Governance applies to:

review state
approval
evidence
publication status
deprecation
retention
access policy

18.5 DevSecOps

DevSecOps tracks:

repository changes
build generation
publication pipelines
index generation
release of standards

18.6 Observability

Observability tracks:

retrieval quality
index freshness
broken links
agent usage
search failures

18.7 Security and Access Control

Security and Access Control apply to:

sensitive documents
restricted knowledge
credentials in documentation
agent access to knowledge
index access
retrieval audit

19. Canon Interface Card Usage

Subsystems that implement or produce information-space knowledge SHOULD publish a Canon Interface Card.

Example:

subsystem: markitect-tool
implements:
  - InfoTechCanonInformationSpaceModel
  - MarkdownInfospaceProfile
produces:
  - MarkdownDocument
  - FrontMatter
  - Index
  - RetrievalUnit
  - Link
  - ValidationResult
consumes:
  - StandardDocument
  - ConceptPage
  - MappingDocument
relations:
  - MarkdownDocument chunks_into RetrievalUnit
  - RetrievalUnit indexed_by SearchIndex
  - Link references KnowledgeArtifact
source_of_truth:
  markdown_artifacts: git_repository
known_deviations:
  - embedding storage may be external
  - generated indexes may be rebuilt from source
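
A minimal sketch of how a tool might sanity-check a parsed Canon Interface Card. It assumes the card has already been loaded into a dictionary (for example by a YAML parser). The field list is simply the set of keys used in the example above, not a normative schema, and the helper names are hypothetical.

```python
# Hypothetical field list, taken from the keys used in the example card.
EXPECTED_CARD_FIELDS = (
    "subsystem", "implements", "produces", "consumes",
    "relations", "source_of_truth", "known_deviations",
)


def missing_card_fields(card: dict) -> list[str]:
    """Return the expected fields that are absent or empty in a parsed card."""
    return [name for name in EXPECTED_CARD_FIELDS if not card.get(name)]


def parse_relation(text: str) -> tuple[str, str, str]:
    """Split a 'Subject predicate Object' relation line into its three parts."""
    subject, predicate, obj = text.split()
    return subject, predicate, obj


# Abbreviated copy of the example card, already parsed into a dict.
card = {
    "subsystem": "markitect-tool",
    "implements": ["InfoTechCanonInformationSpaceModel", "MarkdownInfospaceProfile"],
    "produces": ["MarkdownDocument", "RetrievalUnit"],
    "consumes": ["StandardDocument"],
    "relations": ["MarkdownDocument chunks_into RetrievalUnit"],
    "source_of_truth": {"markdown_artifacts": "git_repository"},
}

print(missing_card_fields(card))            # known_deviations was left out on purpose
print(parse_relation(card["relations"][0]))
```

Treating an empty field as missing is a design choice of this sketch; a real profile might allow an explicitly empty known_deviations list.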

20. Retrieval Requirements

The Information Space Model is itself designed for retrieval.

20.1 Required Retrieval Properties

Every major artifact SHOULD provide:

stable identifier,
stable title,
artifact type,
status,
owner or steward,
source references,
related artifacts,
headings,
anchors,
summary,
front matter,
and retrievable sections.
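
The metadata-level properties in this list can be checked mechanically. The sketch below assumes front matter has already been parsed into a dictionary. The key names are hypothetical, since this section names the properties but not their front matter keys, and structural properties such as headings and anchors are left out because they live in the document body.

```python
# Hypothetical front matter keys for the metadata-level properties above.
RECOMMENDED_KEYS = (
    "id", "title", "type", "status", "owner", "sources", "related", "summary",
)


def missing_retrieval_properties(front_matter: dict) -> list[str]:
    """Return recommended keys that are absent, empty strings, or empty lists."""
    return [
        key for key in RECOMMENDED_KEYS
        if front_matter.get(key) in (None, "", [])
    ]


# Illustrative front matter for a concept page; all values are made up.
front_matter = {
    "id": "itc-infospace.concept.retrieval-unit",
    "title": "RetrievalUnit",
    "type": "ConceptPage",
    "status": "seed",
    "owner": "",
    "sources": [],
    "related": ["itc-infospace.concept.chunk"],
    "summary": "A unit of content prepared for retrieval.",
}

print(missing_retrieval_properties(front_matter))
```

Because the properties are SHOULD-level, a tool would probably report the result as warnings rather than fail a build.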

20.2 Agent Brief

A mature Information Space Model SHOULD include an agent-brief.md file with:

purpose
scope
owned concepts
imported concepts
artifact types
front matter rules
chunking rules
retrieval rules
do / do not rules
common mistakes
profile list
mapping list

20.3 Indexes

The information space SHOULD provide indexes by:

artifact
concept
standard
pattern
profile
mapping
source
status
owner
tag
external reference
retrieval unit
use path
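
Several of these indexes can be derived directly from artifact metadata. The following is a minimal sketch, assuming each artifact is a dictionary with an "id" plus hypothetical "status" and "tags" keys. It illustrates the idea of a generated index, not a prescribed index format.

```python
from collections import defaultdict


def build_index(artifacts: list[dict], key: str) -> dict[str, list[str]]:
    """Group artifact ids by the value (or values) found under `key`."""
    index = defaultdict(list)
    for artifact in artifacts:
        values = artifact.get(key, [])
        if isinstance(values, str):
            values = [values]  # treat a scalar as a one-element list
        for value in values:
            index[value].append(artifact["id"])
    # Sort keys and ids so the generated index is stable between runs.
    return {value: sorted(ids) for value, ids in sorted(index.items())}


# Illustrative artifact metadata; ids and tags are made up.
artifacts = [
    {"id": "concept.chunk", "status": "seed", "tags": ["retrieval", "chunking"]},
    {"id": "concept.index", "status": "seed", "tags": ["retrieval"]},
    {"id": "pattern.agent-brief", "status": "draft", "tags": []},
]

print(build_index(artifacts, "status"))
print(build_index(artifacts, "tags"))
```

Deterministic ordering keeps regenerated indexes diff-friendly, which matters if generated views are committed alongside their sources.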

21. Conformance Levels

21.1 Reference-Conformant

A repository or document set is reference-conformant if it uses Information Space terminology consistently, even if it does not yet implement structured metadata or validation rules.

21.2 Metadata-Conformant

A repository or document set is metadata-conformant if major artifacts have structured metadata and stable identifiers.

21.3 Link-Conformant

A repository or document set is link-conformant if internal links, backlinks, citations, and references are represented and checkable.

21.4 Retrieval-Conformant

A repository or document set is retrieval-conformant if artifacts are chunked, indexed, and retrievable with stable source context.

21.5 Provenance-Conformant

A repository or document set is provenance-conformant if artifacts and important changes preserve source, activity, agent, and review records.

21.6 Profile-Conformant

A repository or document set is profile-conformant if it implements a declared Information Space Profile and passes its validation rules.

21.7 Assimilation-Conformant

A repository or document set is assimilation-conformant if it can represent assimilation workspaces and produce mappings, gaps, conflicts, and proposed changes.

22. Validation Rules

Initial validation rules:

VAL-INFOSPACE-001: Every major KnowledgeArtifact SHOULD have a stable id.

VAL-INFOSPACE-002: Every StandardDocument SHOULD declare status, version, owner, and scope.

VAL-INFOSPACE-003: Every ConceptPage SHOULD define exactly one primary concept.

VAL-INFOSPACE-004: Generated views SHOULD be marked as generated or derived.

VAL-INFOSPACE-005: Internal links SHOULD resolve to existing artifacts or anchors.

VAL-INFOSPACE-006: External references SHOULD include the source and, where relevant, the access date or source version.

VAL-INFOSPACE-007: RetrievalUnit SHOULD preserve artifact id, section path, version, and source context.

VAL-INFOSPACE-008: EmbeddingRecord SHOULD reference source hash, embedding model, and chunking strategy.

VAL-INFOSPACE-009: Summary SHOULD reference the artifact or retrieval unit it summarizes.

VAL-INFOSPACE-010: AgentBrief SHOULD be derived from or reviewed against the full artifact.

VAL-INFOSPACE-011: AssimilationReport SHOULD include source summary, extracted concepts, comparison matrix, mappings, proposed changes, and open questions.

VAL-INFOSPACE-012: MappingDocument SHOULD declare source concept, target body, target concept, mapping type, scope, confidence, and rationale.

VAL-INFOSPACE-013: Deprecated artifacts SHOULD reference replacements where available.

VAL-INFOSPACE-014: Orphan artifacts SHOULD be reviewed for indexing, linking, archiving, or deletion.

VAL-INFOSPACE-015: Conflicting definitions SHOULD create review work or mapping notes.

VAL-INFOSPACE-016: Sensitive knowledge artifacts SHOULD reference Access Control, Security, Data, or Governance constraints where relevant.

VAL-INFOSPACE-017: Tags MUST NOT replace stable identifiers, links, mappings, or metadata.

VAL-INFOSPACE-018: Profiles MUST NOT redefine canonical concepts. They MAY constrain them.
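
Rules like these lend themselves to small, individually identifiable checks. The sketch below implements two of them (VAL-INFOSPACE-002 and VAL-INFOSPACE-013) against a metadata dictionary. The key names, including "replaced_by", and the finding format are assumptions made for illustration.

```python
def check_standard_document(meta: dict) -> list[str]:
    """VAL-INFOSPACE-002: status, version, owner, and scope should be declared."""
    return [
        f"VAL-INFOSPACE-002: missing {field}"
        for field in ("status", "version", "owner", "scope")
        if not meta.get(field)
    ]


def check_deprecation(meta: dict) -> list[str]:
    """VAL-INFOSPACE-013: deprecated artifacts should reference a replacement."""
    # "replaced_by" is a hypothetical key name for the replacement reference.
    if meta.get("status") == "deprecated" and not meta.get("replaced_by"):
        return ["VAL-INFOSPACE-013: deprecated artifact has no replacement reference"]
    return []


def validate(meta: dict) -> list[str]:
    """Run all checks and return their findings as one list."""
    return check_standard_document(meta) + check_deprecation(meta)


# Illustrative metadata with two problems: no scope, and no replacement.
meta = {
    "id": "itc-example-standard",
    "status": "deprecated",
    "version": "RC1-seed",
    "owner": "canon-maintainers",
}

for finding in validate(meta):
    print(finding)
```

Prefixing each finding with its rule id lets reports link back to the rule text. Since both rules are SHOULD-level, the findings are best read as warnings.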

23. Anti-Patterns

23.1 Markdown Pile

A folder full of Markdown files without stable IDs, indexes, links, or metadata.

23.2 Chunk Soup

Chunks created for retrieval without preserving document context, heading path, source, or version.
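
One way to avoid this is to carry the heading path and artifact identity on every chunk. The following is a minimal sketch, assuming ATX-style Markdown headings (lines starting with #) and one chunk per section. The function and field names are hypothetical, and the sketch does not handle # lines inside fenced code blocks or skipped heading levels.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_heading(markdown: str, artifact_id: str, version: str) -> list[dict]:
    """Split a document into per-section chunks that keep their heading path."""
    chunks: list[dict] = []
    path: list[str] = []   # headings from the document root to the current section
    body: list[str] = []   # lines collected for the current section

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append({
                "artifact_id": artifact_id,
                "version": version,
                "section_path": list(path),
                "text": text,
            })
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            del path[level - 1:]             # drop headings at this level or deeper
            path.append(match.group(2).strip())
        else:
            body.append(line)
    flush()
    return chunks


doc = "# Model\nIntro.\n## Chunk\nA chunk is a unit.\n## Index\nAn index lists artifacts.\n"
for chunk in chunk_by_heading(doc, "itc-infospace", "RC1-seed"):
    print(chunk["section_path"], "->", chunk["text"])
```

Each chunk can then be cited back to its artifact, version, and section without consulting the original file.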

23.3 Summary Without Source

Summaries detached from the source artifacts they summarize.

23.4 Link Rot Inside the Repo

Internal links break because anchors and file paths are not validated.
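
A basic check for this can run over an in-memory map of documents. The sketch below assumes inline Markdown links, treats link targets as repository-relative paths (no resolution of ../ segments), and takes the set of known anchors per file as an input. All names are hypothetical.

```python
import re

# Matches inline Markdown links of the form [text](target).
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_internal_links(
    documents: dict[str, str],
    anchors: dict[str, set[str]],
) -> list[tuple[str, str]]:
    """Return (source path, target) pairs whose file or anchor cannot be found."""
    broken = []
    for path, text in documents.items():
        for target in LINK.findall(text):
            if "://" in target or target.startswith("mailto:"):
                continue  # external reference, out of scope for this check
            file_part, _, anchor = target.partition("#")
            target_path = file_part or path  # "#anchor" points into the same file
            if target_path not in documents:
                broken.append((path, target))
            elif anchor and anchor not in anchors.get(target_path, set()):
                broken.append((path, target))
    return broken


documents = {
    "concepts/chunk.md": "See [RetrievalUnit](concepts/retrieval-unit.md#definition) "
                         "and [Index](concepts/index.md).",
    "concepts/retrieval-unit.md": "Back to [Chunk](concepts/chunk.md#missing) "
                                  "and [spec](https://example.org/spec).",
}
anchors = {
    "concepts/chunk.md": {"definition"},
    "concepts/retrieval-unit.md": {"definition"},
}

print(broken_internal_links(documents, anchors))
```

Running a check of this kind in the publication pipeline would surface broken anchors at commit time rather than at read time.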

23.5 View as Source

Generated indexes or diagrams are edited manually and diverge from canonical artifacts.

23.6 Embedding Without Provenance

Embeddings are stored without model, source hash, chunking strategy, or creation time.
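
A record that carries these fields might look like the sketch below. It assumes SHA-256 over the UTF-8 source text as the source hash. The class, field, and function names are hypothetical, and the vector values and model name are placeholders rather than output from a real model.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def text_hash(text: str) -> str:
    """SHA-256 hex digest of the UTF-8 encoded source text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class EmbeddingRecord:
    retrieval_unit_id: str
    source_hash: str
    embedding_model: str
    chunking_strategy: str
    created_at: str
    vector: tuple[float, ...]


def make_record(unit_id: str, text: str, vector: list[float],
                model: str, strategy: str) -> EmbeddingRecord:
    """Bundle a vector with the provenance needed to audit or rebuild it."""
    return EmbeddingRecord(
        retrieval_unit_id=unit_id,
        source_hash=text_hash(text),
        embedding_model=model,
        chunking_strategy=strategy,
        created_at=datetime.now(timezone.utc).isoformat(),
        vector=tuple(vector),
    )


def is_stale(record: EmbeddingRecord, current_text: str) -> bool:
    """True if the source text changed since the embedding was created."""
    return record.source_hash != text_hash(current_text)


text = "A chunk is a unit."
record = make_record("itc-infospace#chunk", text, [0.1, 0.2, 0.3],
                     model="example-embedding-model", strategy="per-section")
print(is_stale(record, text), is_stale(record, text + " Edited."))
```

With the source hash stored, stale embeddings can be detected and regenerated without comparing vectors.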

23.7 Concept Drift by Duplication

The same concept is defined in multiple places without canonical ownership.

23.8 Agent Brief as Replacement

Agents rely on compact briefs in place of the full standards, even when those briefs are stale or inconsistent with them.

23.9 Retrieval Without Evaluation

Search and RAG are used without tests for relevance, freshness, and citation correctness.
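
A relevance test can start very small. The sketch below computes recall at k for one query against a hand-labelled set of relevant retrieval-unit ids. The ids are made up, and freshness and citation correctness would need separate checks that are not shown here.

```python
def recall_at_k(retrieved: list[str], relevant: set[str], k: int) -> float:
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        raise ValueError("relevant set must not be empty")
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


# One labelled query: the ranked ids a search returned, and the ids judged relevant.
retrieved = ["concept.chunk", "pattern.agent-brief", "concept.retrieval-unit"]
relevant = {"concept.chunk", "concept.retrieval-unit"}

print(recall_at_k(retrieved, relevant, k=2))  # 0.5: one of two relevant ids in the top 2
print(recall_at_k(retrieved, relevant, k=3))  # 1.0: both relevant ids in the top 3
```

Tracking a metric like this over a fixed query set makes regressions visible when chunking rules or indexes change.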

23.10 External Standard Copy-Paste

External standards are copied into the information space without mapping, assimilation, or source boundaries.

24. Initial Repository Placement

Recommended repository layout:

info-tech-canon/
  standards/
    information-space/
      InfoTechCanonInformationSpaceModel.md
      agent-brief.md
      concepts/
      relationships/
      patterns/
      profiles/
      mappings/
      assimilation/
      examples/
      validation/

Seed files:

standards/information-space/InfoTechCanonInformationSpaceModel.md
standards/information-space/agent-brief.md
standards/information-space/concepts/information-space.md
standards/information-space/concepts/knowledge-artifact.md
standards/information-space/concepts/retrieval-unit.md
standards/information-space/concepts/chunk.md
standards/information-space/concepts/index.md
standards/information-space/concepts/agent-brief.md
standards/information-space/concepts/provenance-record.md
standards/information-space/patterns/markdown-with-structured-front-matter.md
standards/information-space/patterns/concept-page-per-canonical-concept.md
standards/information-space/patterns/chunk-with-parent-context.md
standards/information-space/patterns/agent-brief.md
standards/information-space/profiles/infotechcanon-repository-profile.md
standards/information-space/profiles/markdown-infospace-profile.md
standards/information-space/profiles/agent-retrievable-standards-profile.md
standards/information-space/profiles/assimilation-workspace-profile.md
standards/information-space/profiles/rag-corpus-profile.md
standards/information-space/mappings/skos.yaml
standards/information-space/mappings/fair.yaml
standards/information-space/mappings/prov-o.yaml
standards/information-space/mappings/dublin-core.yaml

25. Roadmap

Phase 1: Seed Stabilization

Establish this standard as InfoTechCanonInformationSpaceModel.
Add seed concepts, relationship vocabulary, patterns, and profiles.
Define validation rules.
Align with Core, Tagging, Data, Governance, DevSecOps, Observability, Security, and Access Control.

Phase 2: First Assimilations

Recommended first assimilations:

SKOS
FAIR principles
PROV-O
Dublin Core / Singapore Framework
CommonMark / Markdown conventions
Obsidian / wiki-link practice
Zettelkasten note practice
DITA topic architecture
RAG corpus and chunking patterns

Phase 3: Profile Maturation

Mature InfoTechCanon Repository Profile.
Mature Markdown Infospace Profile.
Mature Agent-Retrievable Standards Profile.
Mature Assimilation Workspace Profile.
Mature Sharded Wiki Profile.
Mature RAG Corpus Profile.

Phase 4: Tooling Integration

Generate concept indexes.
Generate agent briefs.
Generate chunk manifests.
Generate machine-readable YAML/JSON exports.
Add validation scripts.
Add broken-link checks.
Add stale-content checks.
Add retrieval-quality tests.
Integrate with markitect-tool, kontextual-engine, shard-wiki, llm-connect, and phase-memory.

Phase 5: Knowledge Intelligence Loop

Track retrieval failures.
Track stale concepts.
Track conflicting definitions.
Track missing mappings.
Track assimilation backlog.
Generate improvement tasks.
Use agent feedback to refine chunks, briefs, indexes, and profiles.

26. Summary

The InfoTechCanon Information Space Model is the seed standard for representing markdown-first, human-readable, machine-retrievable, provenance-aware, interconnected knowledge spaces.

Its most important commitments are:

Separate domain meaning from knowledge-space packaging.

Treat documents, sections, chunks, retrieval units, links, citations, indexes,
summaries, agent briefs, provenance, and mappings as first-class artifacts.

Make markdown useful for both humans and agents through structured metadata,
stable identifiers, chunking rules, source references, and validation.

Map to SKOS, FAIR, PROV-O, Dublin Core, Markdown, and RAG practices
without surrendering internal semantic autonomy.

Use profiles to make the model practical for the InfoTechCanon repository,
markdown infospaces, sharded wikis, assimilation workspaces, and agent retrieval.

This makes the Information Space Model the structural substrate for turning InfoTechCanon from a collection of documents into a living, reusable, agent-operable knowledge system.


