InfoTechCanon Information Space Model
Short Name: ITC-INFOSPACE
Document Status: Seed Standard Release Candidate 1
Version: RC1-seed
Date: 2026-05-23
Repository Context: info-tech-canon
Document Type: InfoTechCanon Domain Standard
Intended Audience: Knowledge-system builders, markdown-infospace maintainers, standards authors, AI-agent tool builders, documentation architects, information architects, ontology/taxonomy maintainers, software architects, platform builders, and retrieval-system designers.
1. Purpose
The InfoTechCanon Information Space Model defines a canonical seed model for representing markdown-first, human-readable, machine-retrievable, provenance-aware, interconnected information spaces.
It exists to provide the structural and semantic foundation for using InfoTechCanon as an evolving reference body for humans, agents, tools, and services.
This standard owns the concepts required to make a body of knowledge:
- navigable,
- retrievable,
- reusable,
- linkable,
- citable,
- chunkable,
- versionable,
- mappable,
- provenance-aware,
- profile-aware,
- and suitable for both human reading and agentic use.
It provides a canonical vocabulary for:
- information spaces,
- knowledge artifacts,
- markdown documents,
- concept pages,
- standard documents,
- chunks,
- sections,
- anchors,
- identifiers,
- links,
- backlinks,
- citations,
- references,
- indexes,
- summaries,
- agent briefs,
- retrieval units,
- embeddings,
- metadata,
- provenance,
- mappings,
- assimilation records,
- views,
- navigation structures,
- and knowledge-quality signals.
2. Position in InfoTechCanon
The Information Space Model is a domain standard within InfoTechCanon.
It should serve as the structural substrate for how all other standards are stored, retrieved, navigated, linked, mapped, and reused.
InfoTechCanon
├── InfoTechCanonCore
├── InfoTechCanonInformationSpaceModel <-- this standard
├── InfoTechCanonLandscapeModel
├── InfoTechCanonOrganizationModel
├── InfoTechCanonGovernanceModel
├── InfoTechCanonTaskModel
├── InfoTechCanonTaggingStandard
├── InfoTechCanonAccessControlModel
├── InfoTechCanonSecurityModel
├── InfoTechCanonDataModel
├── InfoTechCanonDevSecOpsModel
├── InfoTechCanonNetworkModel
├── InfoTechCanonObservabilityModel
├── InfoTechCanonPatternLanguage
└── Application Profiles
The dependency role is:
Domain standards define meaning.
Information Space defines how meaning is packaged, linked, indexed, retrieved, cited, and reused.
3. Boundary with Adjacent Standards
3.1 Boundary with Core
InfoTechCanonCore should own generic canon mechanisms:
Concept
Standard
Pattern
Profile
Mapping
Assimilation
Versioning
Conformance
Canonical Owner
The Information Space Model owns the storage, retrieval, navigation, chunking, and documentation structures used to operationalize those mechanisms.
3.2 Boundary with Tagging
The Tagging Standard owns tag identity, schemes, namespaces, assignments, and validation.
The Information Space Model uses tags for navigation and retrieval but does not define tag semantics.
3.3 Boundary with Data
The Data Model owns datasets, schemas, data products, lineage, data contracts, and data quality.
The Information Space Model owns knowledge artifacts, documents, chunks, indexes, and retrieval units.
A corpus of Markdown files may be treated as data by the Data Model, but the information-space semantics are owned here.
3.4 Boundary with Governance
Governance owns policies, controls, decisions, exceptions, evidence, and assurance.
The Information Space Model owns how governance documents, evidence references, citations, and versioned documentation artifacts are structured and retrieved.
3.5 Boundary with DevSecOps
DevSecOps owns source repositories, commits, pipelines, releases, deployments, SBOMs, and attestations.
The Information Space Model may be implemented in Git and linked to DevSecOps records, but it owns the knowledge-space structure.
3.6 Boundary with Observability
Observability owns telemetry, signals, metrics, logs, traces, alerts, and operational evidence.
The Information Space Model owns knowledge artifacts and retrieval structures, not runtime telemetry.
4. Research Basis and External Alignment
This seed standard draws on several knowledge organization and metadata traditions.
4.1 SKOS
SKOS defines a common data model for sharing and linking knowledge organization systems such as thesauri, taxonomies, classification schemes, and subject-heading systems. It provides useful concepts such as concept schemes, preferred labels, alternative labels, broader/narrower relations, related relations, and mapping relations.
4.2 FAIR Principles
The FAIR principles emphasize that digital assets should be Findable, Accessible, Interoperable, and Reusable. They are especially relevant because InfoTechCanon must be useful to both humans and machines.
4.3 PROV-O
PROV-O models provenance through entities, activities, and agents. This is central for tracking how knowledge artifacts, mappings, assimilations, and standards evolved.
4.4 Dublin Core and Application Profiles
Dublin Core and the Singapore Framework for application profiles distinguish reusable metadata vocabularies from application-specific profiles. This directly supports InfoTechCanon’s distinction between canonical concepts and concrete profiles.
4.5 Zettelkasten, Wikis, and Hypertext
Zettelkasten and wiki traditions emphasize durable notes, links, backlinks, local context, reuse, and emergent structure. They are useful for the human side of markdown-first information spaces.
4.6 Documentation Systems and Static Site Generators
Modern documentation systems emphasize stable headings, cross-links, front matter, sidebars, indexes, search, versioned docs, and generated navigation. These are practical implementation targets.
4.7 Retrieval-Augmented Generation
RAG systems require chunking, metadata, embeddings, summaries, stable identifiers, source references, and retrieval-quality evaluation. The Information Space Model should make retrieval a first-class design concern rather than an afterthought.
5. Seed Standard Design Stance
This standard is a seed standard, not a complete CMS, ontology, or retrieval-engine specification.
It shall:
- define canonical information-space semantics,
- remain markdown-first,
- support human navigation and agent retrieval,
- support stable identifiers and anchors,
- support chunking and retrieval units,
- support citations, references, provenance, and versioning,
- support indexes, views, summaries, and agent briefs,
- support mappings and assimilation records,
- map to external standards without becoming subordinate to them,
- support future integration with markdown infobase tools and services.
6. Scope
6.1 In Scope
This standard covers canonical representation of:
- information spaces,
- knowledge bases,
- infospaces,
- repositories as knowledge spaces,
- markdown documents,
- concept pages,
- standard documents,
- pattern documents,
- profile documents,
- mapping documents,
- assimilation reports,
- decision records,
- sections,
- headings,
- anchors,
- stable identifiers,
- chunks,
- retrieval units,
- summaries,
- agent briefs,
- front matter,
- metadata records,
- links,
- backlinks,
- citations,
- references,
- external references,
- indexes,
- navigation views,
- document collections,
- shard references,
- source references,
- provenance records,
- version records,
- quality signals,
- embedding records,
- retrieval queries,
- retrieval results,
- and reuse contexts.
6.2 Out of Scope
This standard does not fully define:
- all semantic-web ontology modeling,
- all RDF/OWL reasoning,
- full CMS implementation,
- full search-engine ranking,
- all embedding algorithms,
- full vector-database implementation,
- all document authoring standards,
- all bibliography formats,
- all legal citation systems,
- all software repository structures,
- or every markdown dialect.
Those may be mapped, assimilated, profiled, or handled by adjacent standards.
7. Normative Language
The following terms are used normatively:
- SHALL indicates a mandatory rule for conformance.
- SHOULD indicates a recommended practice.
- MAY indicates an optional capability.
- MUST NOT indicates a prohibited practice.
- SEED marks a concept defined provisionally here but open to later refinement.
- EXTRACT marks a concept that may later move to a more specialized standard.
8. Core Principles
8.1 Markdown-First, Not Markdown-Only
The canonical working format SHOULD be Markdown with structured metadata, but the model SHOULD allow export to JSON, YAML, RDF, JSON-LD, graph databases, search indexes, and vector stores.
8.2 Human-Readable and Machine-Retrievable
Every important artifact SHOULD be readable by humans and retrievable by machines.
8.3 Stable Identity Is Mandatory for Reuse
Artifacts, sections, concepts, mappings, profiles, and retrieval units SHOULD have stable identifiers.
8.4 Chunking Is a Design Concern
Documents SHOULD be structured so that retrieval chunks preserve meaning, context, and source traceability.
8.5 Links Are First-Class
Links, backlinks, references, citations, and mappings SHOULD be explicit and queryable where possible.
8.6 Provenance Is First-Class
Knowledge artifacts SHOULD preserve source, author, generator, change activity, review state, and influence where meaningful.
8.7 Views Are Not the Model
Indexes, navigation pages, diagrams, generated views, and dashboards are views over the information space, not the underlying knowledge itself.
8.8 Retrieval Must Be Evaluated
Useful information spaces SHOULD support retrieval-quality checks, stale-content detection, broken-link detection, and duplicate/conflict detection.
8.9 External Standards Are Mapped, Not Obeyed
The Information Space Model MAY map to SKOS, FAIR, PROV-O, Dublin Core, DCAT, Markdown, DITA, RDF, JSON-LD, static-site generators, and RAG tooling patterns.
It MUST NOT subordinate its internal semantics to any single external model.
9. Canonical Seed Metadata
Every information-space artifact SHOULD support structured metadata.
Recommended front matter:
---
id: itc-infospace:KnowledgeArtifact
type: concept
standard: InfoTechCanonInformationSpaceModel
standard_version: RC1-seed
status: candidate
canonical_owner: InfoTechCanonInformationSpaceModel
preferred_label: Knowledge Artifact
related:
- itc-infospace:InformationSpace
- itc-infospace:Document
- itc-infospace:RetrievalUnit
- itc-infospace:ProvenanceRecord
mappings:
- itc-map:knowledge-artifact-to-prov-entity
---
Recommended artifact statuses:
idea
draft
candidate
release-candidate
adopted
stable
deprecated
retired
Recommended content statuses:
raw
captured
draft
reviewed
candidate
canonical
deprecated
superseded
archived
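The status vocabulary above lends itself to a mechanical check. The following is a minimal sketch, not a prescribed validator: the function name `validate_front_matter` and the required-key set are assumptions made for illustration, and only the list of artifact statuses is taken from this section.

```python
# Hypothetical validator. The required-key set is an assumption for this
# sketch; only the artifact status values come from this section.
ARTIFACT_STATUSES = {
    "idea", "draft", "candidate", "release-candidate",
    "adopted", "stable", "deprecated", "retired",
}
REQUIRED_KEYS = ("id", "type", "status")  # assumed minimum, not normative


def validate_front_matter(meta: dict) -> list[str]:
    """Return human-readable problems; an empty list means no findings."""
    problems = []
    for key in REQUIRED_KEYS:
        if not meta.get(key):
            problems.append(f"missing required key: {key}")
    status = meta.get("status")
    if status and status not in ARTIFACT_STATUSES:
        problems.append(f"unknown artifact status: {status}")
    return problems


example = {
    "id": "itc-infospace:KnowledgeArtifact",
    "type": "concept",
    "status": "candidate",
}
print(validate_front_matter(example))                             # → []
print(validate_front_matter({"id": "x", "status": "published"}))
```

Returning a list of findings rather than raising on the first problem lets a tool report every metadata issue for an artifact in one pass.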
10. Root Information Space Taxonomy
InformationSpaceEntity
├── SpaceEntity
│ ├── InformationSpace
│ ├── KnowledgeBase
│ ├── Infospace
│ ├── RepositorySpace
│ ├── Shard
│ ├── Collection
│ └── Corpus
├── ArtifactEntity
│ ├── KnowledgeArtifact
│ ├── Document
│ ├── MarkdownDocument
│ ├── ConceptPage
│ ├── StandardDocument
│ ├── PatternDocument
│ ├── ProfileDocument
│ ├── MappingDocument
│ ├── AssimilationReport
│ ├── DecisionRecord
│ └── AgentBrief
├── StructureEntity
│ ├── Section
│ ├── Heading
│ ├── Anchor
│ ├── Block
│ ├── Chunk
│ ├── RetrievalUnit
│ ├── Summary
│ └── Excerpt
├── LinkEntity
│ ├── Link
│ ├── Backlink
│ ├── CrossReference
│ ├── Citation
│ ├── SourceReference
│ ├── ExternalReference
│ ├── MappingReference
│ └── DependencyReference
├── MetadataEntity
│ ├── FrontMatter
│ ├── MetadataRecord
│ ├── Identifier
│ ├── Namespace
│ ├── Label
│ ├── Alias
│ ├── Status
│ └── VersionRecord
├── RetrievalEntity
│ ├── Index
│ ├── SearchIndex
│ ├── VectorIndex
│ ├── EmbeddingRecord
│ ├── RetrievalQuery
│ ├── RetrievalResult
│ ├── RetrievalContext
│ └── RetrievalEvaluation
├── ProvenanceEntity
│ ├── ProvenanceRecord
│ ├── Source
│ ├── Activity
│ ├── AgentReference
│ ├── Generation
│ ├── Revision
│ ├── Influence
│ └── ReviewRecord
├── ViewEntity
│ ├── NavigationView
│ ├── TopicIndex
│ ├── ConceptIndex
│ ├── RelationshipIndex
│ ├── MapView
│ ├── GraphView
│ └── UsePath
└── QualityEntity
    ├── BrokenLink
    ├── DuplicateContent
    ├── StaleContent
    ├── ConflictingDefinition
    ├── MissingMetadata
    ├── LowRetrievalQuality
    └── OrphanArtifact
11. Core Concepts
11.1 InformationSpace
An InformationSpace is a bounded, navigable, retrievable, and evolving body of knowledge.
Examples:
InfoTechCanon repository
project wiki
standards library
markdown knowledge base
research corpus
documentation space
agent-readable context repository
11.2 KnowledgeBase
A KnowledgeBase is an information space organized to preserve, retrieve, and reuse knowledge.
11.3 Infospace
An Infospace is a structured knowledge environment optimized for navigation, recombination, retrieval, and reuse.
In InfoTechCanon, an infospace is expected to be markdown-first and machine-indexable.
11.4 RepositorySpace
A RepositorySpace is an information space backed by a source repository.
11.5 Shard
A Shard is an independently maintained portion of an information space that can be attached, federated, cached, or overlaid with other shards.
This concept supports shard-wiki-like federation.
11.6 Collection
A Collection is a curated group of knowledge artifacts.
11.7 Corpus
A Corpus is a body of documents or artifacts used for search, retrieval, analysis, or as reference material in training-like workflows.
11.8 KnowledgeArtifact
A KnowledgeArtifact is any identifiable artifact that carries reusable knowledge.
Examples:
standard document
concept page
pattern document
profile
mapping file
assimilation report
decision record
agent brief
example
schema
diagram
11.9 Document
A Document is a knowledge artifact primarily represented as ordered textual or structured content.
11.10 MarkdownDocument
A MarkdownDocument is a document represented in Markdown or a Markdown-compatible dialect.
11.11 ConceptPage
A ConceptPage is a document or section that defines and explains one canonical concept.
11.12 StandardDocument
A StandardDocument is a document defining a standard, its scope, concepts, relationships, validation rules, mappings, and profiles.
11.13 PatternDocument
A PatternDocument is a document describing a recurring problem, forces, solution, resulting context, variants, and related patterns.
11.14 ProfileDocument
A ProfileDocument is a document defining constraints and implementation guidance for a specific context.
11.15 MappingDocument
A MappingDocument defines mappings between InfoTechCanon concepts and external standards, regulations, tools, vocabularies, or product schemas.
11.16 AssimilationReport
An AssimilationReport documents the analysis of an external body of knowledge, including extracted concepts, gaps, conflicts, mappings, and proposed canon changes.
11.17 DecisionRecord
A DecisionRecord records a decision, context, options, rationale, consequences, and review trigger.
11.18 AgentBrief
An AgentBrief is a compact, retrieval-optimized document summarizing a standard, profile, pattern, or subsystem for AI-agent use.
Recommended content:
purpose
scope
owned concepts
imported concepts
do / do not rules
common mistakes
minimal examples
mapping targets
validation hints
11.19 Section
A Section is a named portion of a document.
11.20 Heading
A Heading is a section title used for human navigation and machine chunking.
11.21 Anchor
An Anchor is a stable target within an artifact.
Anchors SHOULD remain stable across non-breaking edits.
11.22 Block
A Block is a structurally meaningful piece of content such as a paragraph, list, table, code block, callout, or diagram block.
11.23 Chunk
A Chunk is a segment of content prepared for retrieval, indexing, embedding, citation, or context assembly.
Canonical rule:
A Chunk SHOULD preserve enough context to be meaningful when retrieved independently.
11.24 RetrievalUnit
A RetrievalUnit is a retrievable unit of knowledge.
A retrieval unit may be:
document
section
chunk
concept page
pattern
profile
mapping
example
agent brief
11.25 Summary
A Summary is a compressed representation of an artifact or retrieval unit.
11.26 Excerpt
An Excerpt is a quoted or extracted part of a source.
Excerpts SHOULD preserve source reference and usage constraints.
11.27 Link
A Link is a directed reference from one artifact or section to another.
11.28 Backlink
A Backlink is an inverse view of a link.
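Because a backlink is an inverse view, it can be derived from link records rather than stored separately. A minimal sketch, assuming links are available as (source, target) pairs; the function name `build_backlinks` and the file paths are hypothetical.

```python
from collections import defaultdict


def build_backlinks(links):
    """Invert (source, target) link pairs into a target -> sorted sources map."""
    backlinks = defaultdict(set)
    for source, target in links:
        backlinks[target].add(source)
    return {target: sorted(sources) for target, sources in backlinks.items()}


# Hypothetical paths, used only to illustrate the inversion.
links = [
    ("standards/infospace.md", "concepts/chunk.md"),
    ("patterns/agent-brief.md", "concepts/chunk.md"),
    ("concepts/chunk.md", "concepts/retrieval-unit.md"),
]
backlinks = build_backlinks(links)
print(backlinks["concepts/chunk.md"])
# → ['patterns/agent-brief.md', 'standards/infospace.md']
```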
11.29 CrossReference
A CrossReference is a link between related artifacts, concepts, patterns, profiles, or sections.
11.30 Citation
A Citation is a reference to a source used to support a claim, definition, mapping, or statement.
11.31 SourceReference
A SourceReference identifies the source from which information was derived, quoted, summarized, mapped, or assimilated.
11.32 ExternalReference
An ExternalReference points to a source outside the information space.
11.33 MappingReference
A MappingReference points from an artifact to a mapping record or external concept mapping.
11.34 DependencyReference
A DependencyReference indicates that one artifact depends on another for meaning, validity, or interpretation.
11.35 FrontMatter
FrontMatter is structured metadata embedded at the beginning of a Markdown document.
11.36 MetadataRecord
A MetadataRecord is structured data describing an artifact, section, chunk, index entry, or retrieval unit.
11.37 Identifier
An Identifier is a stable reference string for an artifact or entity.
Recommended properties:
stable
unique within namespace
human-readable when practical
machine-parseable
version-aware where needed
11.38 Namespace
A Namespace is a naming scope used to prevent identifier collisions.
Examples:
itc-core
itc-land
itc-org
itc-gov
itc-task
itc-tag
itc-access
itc-sec
itc-data
itc-devsecops
itc-net
itc-obs
itc-infospace
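Identifiers in this document follow a `namespace:LocalName` shape, for example `itc-infospace:KnowledgeArtifact`. A minimal sketch of parsing that shape follows. The regular expression is an assumed grammar, since this standard does not fix one, and the namespace set contains only the examples listed above; other prefixes used elsewhere in this document, such as `itc-map`, would need to be added.

```python
import re

# Namespaces copied from the examples above; the set is illustrative, not exhaustive.
KNOWN_NAMESPACES = {
    "itc-core", "itc-land", "itc-org", "itc-gov", "itc-task", "itc-tag",
    "itc-access", "itc-sec", "itc-data", "itc-devsecops", "itc-net",
    "itc-obs", "itc-infospace",
}
# Assumed shape: <namespace>:<local-name>. The standard does not define a grammar.
IDENTIFIER_RE = re.compile(
    r"^(?P<namespace>[a-z][a-z0-9-]*):(?P<local>[A-Za-z][A-Za-z0-9_-]*)$"
)


def parse_identifier(identifier: str) -> tuple[str, str]:
    """Split an identifier into (namespace, local name) or raise ValueError."""
    match = IDENTIFIER_RE.match(identifier)
    if match is None:
        raise ValueError(f"not a namespaced identifier: {identifier!r}")
    namespace, local = match["namespace"], match["local"]
    if namespace not in KNOWN_NAMESPACES:
        raise ValueError(f"unknown namespace: {namespace!r}")
    return namespace, local


print(parse_identifier("itc-infospace:KnowledgeArtifact"))
# → ('itc-infospace', 'KnowledgeArtifact')
```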
11.39 Label
A Label is a human-readable name for an artifact or concept.
11.40 Alias
An Alias is an alternative label or name.
11.41 Status
A Status indicates lifecycle or review state.
11.42 VersionRecord
A VersionRecord records artifact version, change, compatibility, and supersession information.
11.43 Index
An Index is a structured access path into an information space.
Examples:
concept index
standard index
pattern index
mapping index
profile index
source index
status index
external standard index
11.44 SearchIndex
A SearchIndex supports lexical or semantic search.
11.45 VectorIndex
A VectorIndex supports embedding-based retrieval.
11.46 EmbeddingRecord
An EmbeddingRecord stores or references an embedding for a retrieval unit.
Recommended attributes:
retrieval_unit:
embedding_model:
embedding_version:
created_at:
source_hash:
chunking_strategy:
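One use of `source_hash` is detecting that an embedding no longer matches its source text. The sketch below models the recommended attributes as a dataclass. The choice of SHA-256, the `is_stale` method, and all example values are assumptions for illustration, not requirements of this standard.

```python
import hashlib
from dataclasses import dataclass


def content_hash(text: str) -> str:
    """SHA-256 of the UTF-8 text (the hash function is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class EmbeddingRecord:
    retrieval_unit: str
    embedding_model: str
    embedding_version: str
    created_at: str
    source_hash: str
    chunking_strategy: str

    def is_stale(self, current_text: str) -> bool:
        """True when the source text no longer matches the recorded hash."""
        return self.source_hash != content_hash(current_text)


text = "A Chunk is a segment of content prepared for retrieval."
record = EmbeddingRecord(
    retrieval_unit="itc-infospace:Chunk#definition",  # hypothetical identifier
    embedding_model="example-model",                  # placeholder name
    embedding_version="1",
    created_at="2026-05-23T00:00:00Z",
    source_hash=content_hash(text),
    chunking_strategy="by-major-section",
)
print(record.is_stale(text))               # → False
print(record.is_stale(text + " Edited."))  # → True
```

A record that reports stale would typically move its retrieval unit toward a state such as `stale` or `needs_rechunking`.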
11.47 RetrievalQuery
A RetrievalQuery is a query used to find relevant artifacts or retrieval units.
11.48 RetrievalResult
A RetrievalResult is the result of a retrieval query.
It SHOULD preserve rank, source, retrieval unit, score, and snippet/context where possible.
11.49 RetrievalContext
A RetrievalContext is the assembled set of retrieval results, summaries, and metadata used for human or agent work.
11.50 RetrievalEvaluation
A RetrievalEvaluation assesses retrieval quality.
Examples:
relevance
coverage
freshness
precision
recall
source diversity
citation correctness
staleness
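Two of the measures listed above, precision and recall, can be computed for a single query once an expected result set is known. A minimal sketch, assuming retrieval units are compared by identifier; the function name and the chunk identifiers are hypothetical.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query, comparing by identifier."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


retrieved = ["chunk-1", "chunk-2", "chunk-3", "chunk-4"]  # what the index returned
relevant = ["chunk-2", "chunk-4", "chunk-9"]              # what a reviewer expected
precision, recall = precision_recall(retrieved, relevant)
print(precision)  # → 0.5
print(round(recall, 3))
```

Here two of the four retrieved units are relevant, and two of the three expected units were found.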
11.51 ProvenanceRecord
A ProvenanceRecord documents how an artifact, concept, mapping, or retrieval unit came to exist or change.
11.52 Source
A Source is an origin of information.
Examples:
external standard
uploaded document
web page
internal decision
agent-generated draft
manual authoring
assimilation report
11.53 Activity
An Activity is an action that generated, modified, reviewed, mapped, assimilated, or published an artifact.
11.54 AgentReference
An AgentReference points to a human, software agent, organization, or tool responsible for or involved in an activity.
11.55 Generation
A Generation records creation of an artifact or retrieval unit.
11.56 Revision
A Revision records modification of an artifact.
11.57 Influence
An Influence records that one source, artifact, activity, or agent influenced another.
11.58 ReviewRecord
A ReviewRecord records review activity, reviewer, outcome, and comments.
11.59 NavigationView
A NavigationView is a human-oriented view for browsing an information space.
11.60 TopicIndex
A TopicIndex organizes artifacts by topic.
11.61 ConceptIndex
A ConceptIndex organizes concept pages or concept definitions.
11.62 RelationshipIndex
A RelationshipIndex organizes relationships among artifacts or concepts.
11.63 MapView
A MapView visualizes or lists relationships, dependencies, domains, or concept mappings.
11.64 GraphView
A GraphView represents the information space as nodes and edges.
11.65 UsePath
A UsePath is a guided path through the information space for a common user intent.
Examples:
I want to model a new subsystem.
I want to map an external standard.
I want to create a task-tag profile.
I want to onboard an agent.
I want to check conformance.
11.66 BrokenLink
A BrokenLink is a link whose target cannot be resolved.
11.67 DuplicateContent
DuplicateContent is overlapping or repeated content that may create drift.
11.68 StaleContent
StaleContent is content whose age, supersession state, or source drift reduces trust.
11.69 ConflictingDefinition
A ConflictingDefinition is a contradiction between definitions or concept usage.
11.70 MissingMetadata
MissingMetadata is required or expected metadata that is absent.
11.71 LowRetrievalQuality
LowRetrievalQuality indicates poor retrieval performance for a query, use path, or artifact set.
11.72 OrphanArtifact
An OrphanArtifact is an artifact with no incoming links, no index membership, no owner, or no retrieval path.
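Two of these quality signals, BrokenLink and OrphanArtifact, can be sketched as checks over a set of artifacts and their links. This is an illustration under stated assumptions: the function names are hypothetical, and the orphan check uses a narrower reading than the definition above, flagging an artifact only when it has neither incoming links nor index membership (owner and retrieval path are not modeled).

```python
def find_broken_links(artifacts, links):
    """Return (source, target) pairs whose target is not a known artifact."""
    return [(source, target) for source, target in links if target not in artifacts]


def find_orphans(artifacts, links, indexed):
    """Return artifacts with no incoming link and no index membership.

    Narrower than the OrphanArtifact definition: owner and retrieval path
    are not considered in this sketch.
    """
    targets = {target for _, target in links}
    return sorted(a for a in artifacts if a not in targets and a not in indexed)


# Hypothetical repository contents.
artifacts = {"README.md", "concepts/chunk.md", "notes/scratch.md"}
links = [
    ("README.md", "concepts/chunk.md"),
    ("concepts/chunk.md", "concepts/missing.md"),
]
indexed = {"README.md"}

print(find_broken_links(artifacts, links))      # → [('concepts/chunk.md', 'concepts/missing.md')]
print(find_orphans(artifacts, links, indexed))  # → ['notes/scratch.md']
```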
12. Core Relationship Vocabulary
Recommended root relationship types:
contains
part_of
defines
describes
summarizes
references
cites
links_to
backlinks_to
maps_to
depends_on
derived_from
generated_by
revised_by
reviewed_by
influenced_by
supersedes
deprecated_by
chunks_into
indexed_by
retrieved_by
embedded_as
has_view
belongs_to_space
belongs_to_collection
evidenced_by
Relationship records SHOULD support:
id:
relationship_type:
source_entity:
target_entity:
scope:
valid_from:
valid_to:
source_system:
confidence:
evidence:
rationale:
13. Information Space State Models
13.1 Artifact States
raw
captured
draft
reviewed
candidate
canonical
deprecated
superseded
archived
deleted
13.2 Retrieval Unit States
active
stale
invalidated
deprecated
superseded
excluded
needs_rechunking
13.3 Link States
active
broken
redirected
deprecated
ambiguous
external_unverified
13.4 Review States
not_reviewed
under_review
reviewed
changes_requested
approved
rejected
needs_revalidation
13.5 Index States
fresh
stale
partial
rebuilding
failed
deprecated
14. Information Space Patterns
14.1 Pattern: Markdown with Structured Front Matter
Context: Humans need readable documents, while tools need structured metadata.
Problem: Pure prose is hard to index and validate; pure data is hard to author.
Solution: Use Markdown for content and front matter for structured metadata.
14.2 Pattern: Concept Page per Canonical Concept
Context: Concepts need stable definitions.
Problem: Definitions drift when scattered across documents.
Solution: Create one canonical concept page per important concept and link other documents to it.
14.3 Pattern: Chunk with Parent Context
Context: Agents retrieve chunks of documents.
Problem: Retrieved chunks lose meaning if separated from document context.
Solution: Each chunk should preserve parent artifact, section path, heading, concept identifiers, source, and version.
14.4 Pattern: Agent Brief
Context: Agents need compact guidance.
Problem: Full standards are too large for routine retrieval.
Solution: Provide agent briefs summarizing scope, owned concepts, imports, do/do-not rules, patterns, and examples.
14.5 Pattern: Use Path Navigation
Context: Humans and agents approach the canon with tasks, not just topics.
Problem: A concept index alone does not explain where to start.
Solution: Provide UsePath documents that guide common activities through relevant standards, patterns, profiles, and examples.
14.6 Pattern: Source-Carrying Summary
Context: Summaries are useful but may detach from evidence.
Problem: Unsourced summaries become untrustworthy.
Solution: Summaries SHOULD retain source references, provenance, generation activity, and review state.
14.7 Pattern: Mapping as Linkable Artifact
Context: External standards and internal concepts must stay aligned.
Problem: Mapping notes buried in prose cannot be maintained.
Solution: Represent mappings as first-class artifacts with source concept, target concept, mapping type, scope, confidence, rationale, and version.
14.8 Pattern: Assimilation Folder
Context: New external knowledge bodies must be digested.
Problem: Research notes disappear after standards are updated.
Solution: Each assimilation should produce a folder with source summary, extracted concepts, comparison matrix, mappings, proposed changes, and open questions.
14.9 Pattern: View Not Source
Context: Generated indexes and diagrams are useful.
Problem: Teams edit generated views as if they were canonical source.
Solution: Mark generated views clearly and regenerate them from canonical artifacts.
14.10 Pattern: Retrieval Quality Loop
Context: Agents depend on retrieval.
Problem: Retrieval failures cause hallucination, contradiction, or stale answers.
Solution: Track retrieval queries, expected results, misses, stale hits, duplicate hits, and quality fixes.
15. Information Space Profiles
15.1 Profile Format
An Information Space Profile SHALL declare:
id:
profile_name:
status:
implements:
- InfoTechCanonInformationSpaceModel
target_context:
included_concepts:
required_metadata:
required_indexes:
chunking_rules:
source_of_truth_rules:
mapping_files:
validation_rules:
examples:
known_deviations:
15.2 Seed Profile: InfoTechCanon Repository Profile
Purpose:
Define the expected structure for the info-tech-canon repository.
Required top-level files:
README.md
INTENT.md
SCOPE.md
canon.yaml
Recommended directories:
standards/
patterns/
profiles/
mappings/
assimilation/
schemas/
views/
agent/
examples/
validation/
Required indexes:
by-standard
by-concept
by-pattern
by-profile
by-mapping-target
by-status
use-paths
15.3 Seed Profile: Markdown Infospace Profile
Purpose:
Define a general profile for markdown-first knowledge spaces.
Required concepts:
MarkdownDocument
FrontMatter
Section
Anchor
Link
Backlink
Index
RetrievalUnit
ProvenanceRecord
Recommended front matter:
id:
title:
type:
status:
owner:
created_at:
updated_at:
tags:
related:
sources:
15.4 Seed Profile: Agent-Retrievable Standards Profile
Purpose:
Make standards retrievable and usable by AI agents.
Required artifacts:
standard.md
agent-brief.md
concept index
relationship index
profile index
mapping index
examples
validation rules
Chunking rules:
chunk by major section
preserve heading path
preserve artifact id
preserve concept ids
include summary chunks
exclude generated noise
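The first three chunking rules can be sketched as a small splitter. It assumes ATX-style (`#`) Markdown headings, treats headings up to level 2 as "major" sections, and does not handle `#` lines inside fenced code blocks. The names `Chunk`, `chunk_by_heading`, and `max_level`, and the artifact id in the example, are hypothetical.

```python
import re
from dataclasses import dataclass

HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    artifact_id: str
    heading_path: tuple
    text: str


def chunk_by_heading(artifact_id, markdown, max_level=2):
    """Split at headings of depth <= max_level, keeping the heading path."""
    chunks, path, buffer = [], [], []

    def flush():
        # Emit the buffered body under the heading path that was active for it.
        body = "\n".join(buffer).strip()
        if body:
            chunks.append(Chunk(artifact_id, tuple(path), body))
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING_RE.match(line)
        if match and len(match.group(1)) <= max_level:
            flush()
            level = len(match.group(1))
            del path[level - 1:]          # drop headings at this depth or deeper
            path.append(match.group(2))
        else:
            buffer.append(line)           # deeper headings stay inside the chunk
    flush()
    return chunks


doc = (
    "# Information Space\nIntro text.\n"
    "## Chunk\nA chunk is a segment.\n### Rule\nKeep context.\n"
    "## Anchor\nA stable target.\n"
)
chunks = chunk_by_heading("itc-infospace:example-doc", doc)
for chunk in chunks:
    print(chunk.heading_path, "|", chunk.text.splitlines()[0])
```

Each chunk carries the artifact id and the full heading path, so a retrieved chunk can still be traced to its place in the source document.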
15.5 Seed Profile: Assimilation Workspace Profile
Purpose:
Define how external bodies of knowledge are analyzed and assimilated.
Required files:
ASSIMILATION.md
source-summary.md
extracted-concepts.yaml
comparison-matrix.md
mappings.yaml
proposed-changes.md
open-questions.md
15.6 Seed Profile: Sharded Wiki Profile
Purpose:
Support federated markdown knowledge spaces where multiple shards attach around shared root entities.
Included concepts:
Shard
ShardRoot
Overlay
RemoteReference
PatchProposal
MergeRequestReference
CachedArtifact
ShardBoundary
Known deviations:
Shard synchronization and merge mechanics are implementation-specific.
15.7 Seed Profile: RAG Corpus Profile
Purpose:
Prepare an information space for retrieval-augmented generation.
Included concepts:
Corpus
RetrievalUnit
Chunk
EmbeddingRecord
SearchIndex
VectorIndex
RetrievalQuery
RetrievalResult
RetrievalEvaluation
Required metadata:
source id
artifact id
section path
chunk id
version
source hash
embedding model
created_at
16. Mapping Model for the Information Space Standard
Mappings relate InfoTechCanon information-space concepts to external standards, frameworks, and tools.
16.1 Mapping Types
Recommended mapping types:
exactMatch
closeMatch
broadMatch
narrowMatch
relatedMatch
conflictMatch
gapMatch
derivedFrom
regulatoryReference
toolEquivalent
16.2 Mapping Record
Example:
id: itc-map:concept-page-to-skos-concept
source_concept: itc-infospace:ConceptPage
target_body: SKOS
target_version: "2009"
target_concept: skos:Concept
mapping_type: relatedMatch
scope:
- knowledge organization and concept documentation
not_valid_for:
- all SKOS semantic constraints
rationale: >
  A ConceptPage documents an InfoTechCanon concept, while skos:Concept
  represents a conceptual resource in a concept scheme. They are related but not identical:
  the page is a documentation artifact, the concept is the meaning being documented.
confidence: medium
status: candidate
owner: InfoTechCanonInformationSpaceModel
16.3 Seed Mapping Targets
The Information Space Model SHOULD maintain mappings to:
SKOS
FAIR principles
PROV-O
Dublin Core
Singapore Framework for Dublin Core Application Profiles
DCAT
RDF / JSON-LD
Markdown / CommonMark
YAML front matter conventions
Git repository concepts
static site generator concepts
Obsidian / wiki-link conventions
Zettelkasten note patterns
DITA topic concepts
schema.org CreativeWork / Dataset
RAG / vector index tool schemas
17. Assimilation Hooks
The Information Space Model SHALL be able to receive new knowledge-organization, metadata, documentation, retrieval, and wiki systems through the InfoTechCanon assimilation process.
17.1 Assimilation Triggers
Assimilation may be triggered by:
new metadata standard
new knowledge organization model
new wiki engine
new markdown convention
new documentation generator
new RAG architecture
new retrieval evaluation method
new citation model
new provenance standard
new agent context-management pattern
17.2 Information Space Assimilation Output
An information-space assimilation SHOULD produce:
source summary
extracted information-space concepts
concept comparison matrix
gap list
conflict list
mapping file
candidate new concepts
candidate relationship changes
candidate pattern changes
candidate profile changes
open questions
17.3 Recommended First Assimilation Candidates
SKOS
FAIR principles
PROV-O
Dublin Core / Singapore Framework
CommonMark / Markdown conventions
Obsidian / wiki-link practice
Zettelkasten note practice
DITA topic architecture
RAG corpus and chunking patterns
static site generator metadata conventions
18. Integration with Other InfoTechCanon Standards
18.1 Core
Information Space uses Core concepts for:
Concept
Standard
Pattern
Profile
Mapping
Assimilation
Version
Conformance
CanonicalOwner
18.2 Tagging
Information Space uses tags for:
topic
status
artifact type
domain
mapping target
retrieval group
18.3 Data
Data treats corpora, indexes, embeddings, and retrieval results as data assets where needed.
18.4 Governance
Governance applies to:
review state
approval
evidence
publication status
deprecation
retention
access policy
18.5 DevSecOps
DevSecOps tracks:
repository changes
build generation
publication pipelines
index generation
release of standards
18.6 Observability
Observability tracks:
retrieval quality
index freshness
broken links
agent usage
search failures
18.7 Security and Access Control
Security and Access Control apply to:
sensitive documents
restricted knowledge
credentials in documentation
agent access to knowledge
index access
retrieval audit
19. Canon Interface Card Usage
Subsystems that implement or produce information-space knowledge SHOULD publish a Canon Interface Card.
Example:
subsystem: markitect-tool
implements:
  - InfoTechCanonInformationSpaceModel
  - MarkdownInfospaceProfile
produces:
  - MarkdownDocument
  - FrontMatter
  - Index
  - RetrievalUnit
  - Link
  - ValidationResult
consumes:
  - StandardDocument
  - ConceptPage
  - MappingDocument
relations:
  - MarkdownDocument chunks_into RetrievalUnit
  - RetrievalUnit indexed_by SearchIndex
  - Link references KnowledgeArtifact
source_of_truth:
  markdown_artifacts: git_repository
known_deviations:
  - embedding storage may be external
  - generated indexes may be rebuilt from source
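As an illustration only, the card's relations can be checked against its declared artifacts. The sketch below assumes the card has already been loaded into a Python dict and that each relation is a three-token string of the form "Subject predicate Object". The names parse_relation and undeclared_subjects are hypothetical, and the rule that every relation subject should appear under produces or consumes is an assumption, not a requirement of this standard.

```python
from typing import NamedTuple


class Relation(NamedTuple):
    subject: str
    predicate: str
    target: str


def parse_relation(line: str) -> Relation:
    """Split a 'Subject predicate Object' string into its three parts."""
    parts = line.split()
    if len(parts) != 3:
        raise ValueError(f"expected 'Subject predicate Object', got {line!r}")
    return Relation(*parts)


def undeclared_subjects(card: dict) -> list[str]:
    """Return relation subjects the card neither produces nor consumes.

    Assumption: a card should only state relations about artifacts it declares.
    """
    declared = set(card.get("produces", [])) | set(card.get("consumes", []))
    relations = (parse_relation(r) for r in card.get("relations", []))
    return [r.subject for r in relations if r.subject not in declared]


# The example card above, reduced to the keys this check reads.
card = {
    "produces": ["MarkdownDocument", "FrontMatter", "Index",
                 "RetrievalUnit", "Link", "ValidationResult"],
    "consumes": ["StandardDocument", "ConceptPage", "MappingDocument"],
    "relations": [
        "MarkdownDocument chunks_into RetrievalUnit",
        "RetrievalUnit indexed_by SearchIndex",
        "Link references KnowledgeArtifact",
    ],
}
print(undeclared_subjects(card))
```

Relation targets such as SearchIndex are deliberately not checked here, because a card may relate its artifacts to concepts owned by other subsystems.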
20. Retrieval Requirements
The Information Space Model is itself designed for retrieval.
20.1 Required Retrieval Properties
Every major artifact SHOULD provide:
- stable identifier,
- stable title,
- artifact type,
- status,
- owner or steward,
- source references,
- related artifacts,
- headings,
- anchors,
- summary,
- front matter,
- and retrievable sections.
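A minimal sketch of how a tool might report which of these properties an artifact lacks. It assumes front matter has already been parsed into a dict, and the key names (id, title, type, status, owner, steward, sources, related, summary) are illustrative assumptions; this standard does not fix them here. Headings, anchors, and retrievable sections live in the document body and are not covered by this check.

```python
# Hypothetical front matter keys standing in for the properties listed above.
REQUIRED_FIELDS = ("id", "title", "type", "status", "owner",
                   "sources", "related", "summary")


def missing_retrieval_fields(front_matter: dict) -> list[str]:
    """Return the required keys that are absent or empty, in declaration order."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = front_matter.get(name)
        if name == "owner":
            # "owner or steward": either key satisfies the property.
            value = value or front_matter.get("steward")
        if not value:
            missing.append(name)
    return missing


print(missing_retrieval_fields({"id": "itc-infospace", "title": "Information Space Model",
                                "steward": "standards-team"}))
```

Because the properties are stated with SHOULD, a result like this is better treated as a warning list than as a hard failure.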
20.2 Agent Brief
A mature Information Space Model SHOULD include an agent-brief.md file with:
purpose
scope
owned concepts
imported concepts
artifact types
front matter rules
chunking rules
retrieval rules
do / do not rules
common mistakes
profile list
mapping list
20.3 Indexes
The information space SHOULD provide indexes by:
artifact
concept
standard
pattern
profile
mapping
source
status
owner
tag
external reference
retrieval unit
use path
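One way to derive such indexes is to invert artifact metadata on a chosen facet. The following is a sketch under the assumption that each artifact is represented as a dict with an id key and facet keys such as tag, status, or owner; those key names and the build_index helper are hypothetical.

```python
from collections import defaultdict


def build_index(artifacts: list[dict], facet: str) -> dict[str, list[str]]:
    """Map each facet value to the sorted ids of the artifacts that carry it."""
    index = defaultdict(list)
    for artifact in artifacts:
        values = artifact.get(facet, [])
        if isinstance(values, str):
            values = [values]  # allow single-valued facets such as status
        for value in values:
            index[value].append(artifact["id"])
    return {value: sorted(ids) for value, ids in sorted(index.items())}


artifacts = [
    {"id": "concept-chunk", "tag": ["retrieval", "chunking"], "status": "draft"},
    {"id": "concept-index", "tag": "retrieval", "status": "draft"},
]
print(build_index(artifacts, "tag"))
print(build_index(artifacts, "status"))
```

An index built this way is a derived view: it can be regenerated from the source artifacts at any time and should not be edited by hand.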
21. Conformance Levels
21.1 Reference-Conformant
A repository or document set is reference-conformant if it uses Information Space terminology consistently, even if it does not implement structured metadata or validation rules.
21.2 Metadata-Conformant
A repository or document set is metadata-conformant if major artifacts have structured metadata and stable identifiers.
21.3 Link-Conformant
A repository or document set is link-conformant if internal links, backlinks, citations, and references are represented and checkable.
21.4 Retrieval-Conformant
A repository or document set is retrieval-conformant if artifacts are chunked, indexed, and retrievable with stable source context.
21.5 Provenance-Conformant
A repository or document set is provenance-conformant if artifacts and important changes preserve source, activity, agent, and review records.
21.6 Profile-Conformant
A repository or document set is profile-conformant if it implements a declared Information Space Profile and passes its validation rules.
21.7 Assimilation-Conformant
A repository or document set is assimilation-conformant if it can represent assimilation workspaces and produce mappings, gaps, conflicts, and proposed changes.
22. Validation Rules
Initial validation rules:
VAL-INFOSPACE-001: Every major KnowledgeArtifact SHOULD have a stable id.
VAL-INFOSPACE-002: Every StandardDocument SHOULD declare status, version, owner, and scope.
VAL-INFOSPACE-003: Every ConceptPage SHOULD define exactly one primary concept.
VAL-INFOSPACE-004: Generated views SHOULD be marked as generated or derived.
VAL-INFOSPACE-005: Internal links SHOULD resolve to existing artifacts or anchors.
VAL-INFOSPACE-006: External references SHOULD include the source and, where relevant, an access date or source version.
VAL-INFOSPACE-007: RetrievalUnit SHOULD preserve artifact id, section path, version, and source context.
VAL-INFOSPACE-008: EmbeddingRecord SHOULD reference source hash, embedding model, and chunking strategy.
VAL-INFOSPACE-009: Summary SHOULD reference the artifact or retrieval unit it summarizes.
VAL-INFOSPACE-010: AgentBrief SHOULD be derived from or reviewed against the full artifact.
VAL-INFOSPACE-011: AssimilationReport SHOULD include source summary, extracted concepts, comparison matrix, mappings, proposed changes, and open questions.
VAL-INFOSPACE-012: MappingDocument SHOULD declare source concept, target body, target concept, mapping type, scope, confidence, and rationale.
VAL-INFOSPACE-013: Deprecated artifacts SHOULD reference replacements where available.
VAL-INFOSPACE-014: Orphan artifacts SHOULD be reviewed for indexing, linking, archiving, or deletion.
VAL-INFOSPACE-015: Conflicting definitions SHOULD create review work or mapping notes.
VAL-INFOSPACE-016: Sensitive knowledge artifacts SHOULD reference Access Control, Security, Data, or Governance constraints where relevant.
VAL-INFOSPACE-017: Tags MUST NOT replace stable identifiers, links, mappings, or metadata.
VAL-INFOSPACE-018: Profiles MUST NOT redefine canonical concepts. They MAY constrain them.
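A few of these rules can be sketched as metadata checks. The example below covers only VAL-INFOSPACE-001, -002, and -013, and it assumes artifact metadata is available as a dict with hypothetical keys (id, type, status, version, owner, scope, replaced_by). It is not a reference validator.

```python
def check_artifact(meta: dict) -> list[str]:
    """Return the ids of the rules (from the three sketched here) that the metadata does not satisfy."""
    findings = []
    # VAL-INFOSPACE-001: a stable id should be present.
    if not meta.get("id"):
        findings.append("VAL-INFOSPACE-001")
    # VAL-INFOSPACE-002: standard documents declare status, version, owner, scope.
    if meta.get("type") == "StandardDocument":
        if any(not meta.get(key) for key in ("status", "version", "owner", "scope")):
            findings.append("VAL-INFOSPACE-002")
    # VAL-INFOSPACE-013: deprecated artifacts point at a replacement where one exists.
    if meta.get("status") == "deprecated" and not meta.get("replaced_by"):
        findings.append("VAL-INFOSPACE-013")
    return findings


print(check_artifact({"id": "itc-old", "type": "StandardDocument", "status": "deprecated",
                      "version": "0.9", "owner": "standards-team", "scope": "legacy"}))
```

Since these three rules are SHOULD-level, the findings are advisory. The VAL-INFOSPACE-013 finding in particular only signals that a reviewer should confirm whether a replacement is available.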
23. Anti-Patterns
23.1 Markdown Pile
A folder full of Markdown files without stable IDs, indexes, links, or metadata.
23.2 Chunk Soup
Chunks created for retrieval without preserving document context, heading path, source, or version.
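To make the contrast concrete, here is a sketch of heading-based chunking that keeps the parent context with every chunk. It assumes ATX-style Markdown headings (lines starting with one to six # characters). The Chunk fields and the chunk_by_heading function are illustrative names, and the sketch does not skip # lines inside fenced code blocks.

```python
import re
from dataclasses import dataclass

HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    artifact_id: str
    version: str
    heading_path: tuple[str, ...]  # headings from the document root down to this chunk
    text: str


def chunk_by_heading(markdown: str, artifact_id: str, version: str) -> list[Chunk]:
    """Split a document at headings, keeping the heading path for every chunk."""
    chunks: list[Chunk] = []
    path: list[str] = []    # titles of the currently open headings
    levels: list[int] = []  # heading levels matching the entries in `path`
    body: list[str] = []

    def flush() -> None:
        text = "\n".join(body).strip()
        if text:
            chunks.append(Chunk(artifact_id, version, tuple(path), text))
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # emit the text collected under the previous heading path
            level = len(match.group(1))
            while levels and levels[-1] >= level:
                levels.pop()
                path.pop()
            levels.append(level)
            path.append(match.group(2))
        else:
            body.append(line)
    flush()
    return chunks


doc = "# Model\nIntro text.\n## Chunks\nChunk rules.\n# Roadmap\nPhases.\n"
for chunk in chunk_by_heading(doc, "itc-infospace", "RC1-seed"):
    print(chunk.heading_path, "|", chunk.text)
```

Each chunk carries the artifact id, version, and heading path, so a retrieval result can be traced back to its place in the source document.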
23.3 Summary Without Source
Summaries detached from the source artifacts they summarize.
23.4 Link Rot Inside the Repo
Internal links break because anchors and file paths are not validated.
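A check along the following lines could catch this early. The sketch assumes inline Markdown links of the form [text](target), repository-relative POSIX paths, and a precomputed mapping from each file path to the set of anchors it defines. How anchors are derived from headings depends on the renderer and is left out; broken_links and anchors_by_path are hypothetical names.

```python
import posixpath
import re

LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
EXTERNAL = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")  # e.g. https:, mailto:


def broken_links(source_path: str, markdown: str,
                 anchors_by_path: dict[str, set[str]]) -> list[str]:
    """Return internal link targets whose file or anchor cannot be resolved."""
    broken = []
    for target in LINK.findall(markdown):
        if EXTERNAL.match(target):
            continue  # external references are out of scope for this check
        path, _, anchor = target.partition("#")
        if path:
            base = posixpath.dirname(source_path)
            resolved = posixpath.normpath(posixpath.join(base, path))
        else:
            resolved = source_path  # a bare "#anchor" points into the same file
        if resolved not in anchors_by_path:
            broken.append(target)
        elif anchor and anchor not in anchors_by_path[resolved]:
            broken.append(target)
    return broken


known = {
    "standards/model.md": {"purpose"},
    "standards/concepts/chunk.md": set(),
}
text = ("[chunk](concepts/chunk.md) [gone](missing.md) [top](#purpose) "
        "[bad](#nope) [ext](https://example.org)")
print(broken_links("standards/model.md", text, known))
```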
23.5 View as Source
Generated indexes or diagrams are edited manually and diverge from canonical artifacts.
23.6 Embedding Without Provenance
Embeddings are stored without model, source hash, chunking strategy, or creation time.
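By contrast, a record that carries its provenance might look like the sketch below. The field names, the use of SHA-256 for the source hash, and the helper functions are assumptions for illustration; the standard only names what should be referenced (model, source hash, chunking strategy, creation time).

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


def hash_source(text: str) -> str:
    """Hash the exact text that was embedded (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class EmbeddingRecord:
    retrieval_unit_id: str
    source_hash: str
    embedding_model: str
    chunking_strategy: str
    created_at: str  # ISO 8601 timestamp in UTC
    vector: tuple[float, ...]


def make_record(unit_id: str, text: str, vector: list[float],
                model: str, strategy: str) -> EmbeddingRecord:
    """Bundle a vector with the provenance needed to audit or rebuild it."""
    created = datetime.now(timezone.utc).isoformat()
    return EmbeddingRecord(unit_id, hash_source(text), model, strategy,
                           created, tuple(vector))


def is_stale(record: EmbeddingRecord, current_text: str) -> bool:
    """True if the source text has changed since the embedding was created."""
    return record.source_hash != hash_source(current_text)


record = make_record("ru-001", "Chunk rules.", [0.12, -0.07], "example-model", "heading-v1")
print(is_stale(record, "Chunk rules."), is_stale(record, "Chunk rules, revised."))
```

Storing the hash alongside the vector lets an index rebuild skip unchanged retrieval units and re-embed only the stale ones.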
23.7 Concept Drift by Duplication
The same concept is defined in multiple places without canonical ownership.
23.8 Agent Brief as Replacement
Agents rely on compact briefs in place of the full standards, even when the briefs are stale or inconsistent with them.
23.9 Retrieval Without Evaluation
Search and RAG are used without tests for relevance, freshness, and citation correctness.
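A small relevance test is often enough to get started. The sketch below measures a hit rate at k over hand-written query cases. It assumes a search callable that returns ranked artifact ids; that callable, the cases, and the name hit_rate_at_k are hypothetical. Freshness and citation correctness would need separate checks that this example does not cover.

```python
from typing import Callable


def hit_rate_at_k(cases: list[tuple[str, str]],
                  search: Callable[[str], list[str]], k: int = 5) -> float:
    """Fraction of queries whose expected artifact id appears in the top k results."""
    if not cases:
        return 0.0
    hits = sum(1 for query, expected in cases if expected in search(query)[:k])
    return hits / len(cases)


# Stand-in for a real retrieval backend, used only to make the sketch runnable.
def fake_search(query: str) -> list[str]:
    results = {
        "what is a chunk": ["concept-chunk", "concept-index"],
        "what is an index": ["concept-chunk", "concept-summary"],
    }
    return results.get(query, [])


cases = [("what is a chunk", "concept-chunk"), ("what is an index", "concept-index")]
print(hit_rate_at_k(cases, fake_search, k=2))
```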
23.10 External Standard Copy-Paste
External standards are copied into the information space without mapping, assimilation, or source boundaries.
24. Initial Repository Placement
Recommended repository layout:
info-tech-canon/
  standards/
    information-space/
      InfoTechCanonInformationSpaceModel.md
      agent-brief.md
      concepts/
      relationships/
      patterns/
      profiles/
      mappings/
      assimilation/
      examples/
      validation/
Seed files:
standards/information-space/InfoTechCanonInformationSpaceModel.md
standards/information-space/agent-brief.md
standards/information-space/concepts/information-space.md
standards/information-space/concepts/knowledge-artifact.md
standards/information-space/concepts/retrieval-unit.md
standards/information-space/concepts/chunk.md
standards/information-space/concepts/index.md
standards/information-space/concepts/agent-brief.md
standards/information-space/concepts/provenance-record.md
standards/information-space/patterns/markdown-with-structured-front-matter.md
standards/information-space/patterns/concept-page-per-canonical-concept.md
standards/information-space/patterns/chunk-with-parent-context.md
standards/information-space/patterns/agent-brief.md
standards/information-space/profiles/infotechcanon-repository-profile.md
standards/information-space/profiles/markdown-infospace-profile.md
standards/information-space/profiles/agent-retrievable-standards-profile.md
standards/information-space/profiles/assimilation-workspace-profile.md
standards/information-space/profiles/rag-corpus-profile.md
standards/information-space/mappings/skos.yaml
standards/information-space/mappings/fair.yaml
standards/information-space/mappings/prov-o.yaml
standards/information-space/mappings/dublin-core.yaml
25. Roadmap
Phase 1: Seed Stabilization
- Establish this standard as InfoTechCanonInformationSpaceModel.
- Add seed concepts, relationship vocabulary, patterns, and profiles.
- Define validation rules.
- Align with Core, Tagging, Data, Governance, DevSecOps, Observability, Security, and Access Control.
Phase 2: First Assimilations
Recommended first assimilations:
SKOS
FAIR principles
PROV-O
Dublin Core / Singapore Framework
CommonMark / Markdown conventions
Obsidian / wiki-link practice
Zettelkasten note practice
DITA topic architecture
RAG corpus and chunking patterns
Phase 3: Profile Maturation
- Mature InfoTechCanon Repository Profile.
- Mature Markdown Infospace Profile.
- Mature Agent-Retrievable Standards Profile.
- Mature Assimilation Workspace Profile.
- Mature Sharded Wiki Profile.
- Mature RAG Corpus Profile.
Phase 4: Tooling Integration
- Generate concept indexes.
- Generate agent briefs.
- Generate chunk manifests.
- Generate machine-readable YAML/JSON exports.
- Add validation scripts.
- Add broken-link checks.
- Add stale-content checks.
- Add retrieval-quality tests.
- Integrate with markitect-tool, kontextual-engine, shard-wiki, llm-connect, and phase-memory.
Phase 5: Knowledge Intelligence Loop
- Track retrieval failures.
- Track stale concepts.
- Track conflicting definitions.
- Track missing mappings.
- Track assimilation backlog.
- Generate improvement tasks.
- Use agent feedback to refine chunks, briefs, indexes, and profiles.
26. Summary
The InfoTechCanon Information Space Model is the seed standard for representing markdown-first, human-readable, machine-retrievable, provenance-aware, interconnected knowledge spaces.
Its most important commitments are:
- Separate domain meaning from knowledge-space packaging.
- Treat documents, sections, chunks, retrieval units, links, citations, indexes, summaries, agent briefs, provenance, and mappings as first-class artifacts.
- Make markdown useful for both humans and agents through structured metadata, stable identifiers, chunking rules, source references, and validation.
- Map to SKOS, FAIR, PROV-O, Dublin Core, Markdown, and RAG practices without surrendering internal semantic autonomy.
- Use profiles to make the model practical for the InfoTechCanon repository, markdown infospaces, sharded wikis, assimilation workspaces, and agent retrieval.
This makes the Information Space Model the structural substrate for turning InfoTechCanon from a collection of documents into a living, reusable, agent-operable knowledge system.