# InfoTechCanon Information Space Model

**Short Name:** `ITC-INFOSPACE`

**Document Status:** Seed Standard Release Candidate 1

**Version:** RC1-seed

**Date:** 2026-05-23

**Repository Context:** `info-tech-canon`

**Document Type:** InfoTechCanon Domain Standard

**Intended Audience:** Knowledge-system builders, markdown-infospace maintainers, standards authors, AI-agent tool builders, documentation architects, information architects, ontology/taxonomy maintainers, software architects, platform builders, and retrieval-system designers.

---

# 1. Purpose

The **InfoTechCanon Information Space Model** defines a canonical seed model for representing markdown-first, human-readable, machine-retrievable, provenance-aware, interconnected information spaces.

It exists to provide the structural and semantic foundation for using InfoTechCanon as an evolving reference body for humans, agents, tools, and services.

This standard owns the concepts required to make a body of knowledge:

- navigable,
- retrievable,
- reusable,
- linkable,
- citable,
- chunkable,
- versionable,
- mappable,
- provenance-aware,
- profile-aware,
- and suitable for both human reading and agentic use.

It provides a canonical vocabulary for:

- information spaces,
- knowledge artifacts,
- markdown documents,
- concept pages,
- standard documents,
- chunks,
- sections,
- anchors,
- identifiers,
- links,
- backlinks,
- citations,
- references,
- indexes,
- summaries,
- agent briefs,
- retrieval units,
- embeddings,
- metadata,
- provenance,
- mappings,
- assimilation records,
- views,
- navigation structures,
- and knowledge-quality signals.

---

# 2. Position in InfoTechCanon

The Information Space Model is a **domain standard** within InfoTechCanon.

It should serve as the structural substrate for how all other standards are stored, retrieved, navigated, linked, mapped, and reused.
```text
InfoTechCanon
├── InfoTechCanonCore
├── InfoTechCanonInformationSpaceModel   <-- this standard
├── InfoTechCanonLandscapeModel
├── InfoTechCanonOrganizationModel
├── InfoTechCanonGovernanceModel
├── InfoTechCanonTaskModel
├── InfoTechCanonTaggingStandard
├── InfoTechCanonAccessControlModel
├── InfoTechCanonSecurityModel
├── InfoTechCanonDataModel
├── InfoTechCanonDevSecOpsModel
├── InfoTechCanonNetworkModel
├── InfoTechCanonObservabilityModel
├── InfoTechCanonPatternLanguage
└── Application Profiles
```

The dependency role is:

```text
Domain standards define meaning.
Information Space defines how meaning is packaged, linked, indexed, retrieved, cited, and reused.
```

---

# 3. Boundary with Adjacent Standards

## 3.1 Boundary with Core

InfoTechCanonCore should own generic canon mechanisms:

```text
Concept
Standard
Pattern
Profile
Mapping
Assimilation
Versioning
Conformance
Canonical Owner
```

The Information Space Model owns the storage, retrieval, navigation, chunking, and documentation structures used to operationalize those mechanisms.

## 3.2 Boundary with Tagging

The Tagging Standard owns tag identity, schemes, namespaces, assignments, and validation.

The Information Space Model uses tags for navigation and retrieval but does not define tag semantics.

## 3.3 Boundary with Data

The Data Model owns datasets, schemas, data products, lineage, data contracts, and data quality.

The Information Space Model owns knowledge artifacts, documents, chunks, indexes, and retrieval units. A corpus of Markdown files may be treated as data by the Data Model, but the information-space semantics are owned here.

## 3.4 Boundary with Governance

Governance owns policies, controls, decisions, exceptions, evidence, and assurance.

The Information Space Model owns how governance documents, evidence references, citations, and versioned documentation artifacts are structured and retrieved.
## 3.5 Boundary with DevSecOps

DevSecOps owns source repositories, commits, pipelines, releases, deployments, SBOMs, and attestations.

The Information Space Model may be implemented in Git and linked to DevSecOps records, but it owns the knowledge-space structure.

## 3.6 Boundary with Observability

Observability owns telemetry, signals, metrics, logs, traces, alerts, and operational evidence.

The Information Space Model owns knowledge artifacts and retrieval structures, not runtime telemetry.

---

# 4. Research Basis and External Alignment

This seed standard draws on several knowledge organization and metadata traditions.

## 4.1 SKOS

SKOS defines a common data model for sharing and linking knowledge organization systems such as thesauri, taxonomies, classification schemes, and subject-heading systems. It provides useful concepts such as concept schemes, preferred labels, alternative labels, broader/narrower relations, related relations, and mapping relations.

## 4.2 FAIR Principles

The FAIR principles emphasize that digital assets should be Findable, Accessible, Interoperable, and Reusable. They are especially relevant because InfoTechCanon must be useful to both humans and machines.

## 4.3 PROV-O

PROV-O models provenance through entities, activities, and agents. This is central for tracking how knowledge artifacts, mappings, assimilations, and standards evolved.

## 4.4 Dublin Core and Application Profiles

Dublin Core and the Singapore Framework for application profiles distinguish reusable metadata vocabularies from application-specific profiles. This directly supports InfoTechCanon’s distinction between canonical concepts and concrete profiles.

## 4.5 Zettelkasten, Wikis, and Hypertext

Zettelkasten and wiki traditions emphasize durable notes, links, backlinks, local context, reuse, and emergent structure. They are useful for the human side of markdown-first information spaces.
## 4.6 Documentation Systems and Static Site Generators

Modern documentation systems emphasize stable headings, cross-links, front matter, sidebars, indexes, search, versioned docs, and generated navigation. These are practical implementation targets.

## 4.7 Retrieval-Augmented Generation

RAG systems require chunking, metadata, embeddings, summaries, stable identifiers, source references, and retrieval-quality evaluation. The Information Space Model should make retrieval a first-class design concern rather than an afterthought.

---

# 5. Seed Standard Design Stance

This standard is a **seed standard**, not a complete CMS, ontology, or retrieval-engine specification.

It shall:

1. define canonical information-space semantics,
2. remain markdown-first,
3. support human navigation and agent retrieval,
4. support stable identifiers and anchors,
5. support chunking and retrieval units,
6. support citations, references, provenance, and versioning,
7. support indexes, views, summaries, and agent briefs,
8. support mappings and assimilation records,
9. map to external standards without becoming subordinate to them,
10. support future integration with markdown infobase tools and services.

---

# 6. Scope

## 6.1 In Scope

This standard covers canonical representation of:

- information spaces,
- knowledge bases,
- infospaces,
- repositories as knowledge spaces,
- markdown documents,
- concept pages,
- standard documents,
- pattern documents,
- profile documents,
- mapping documents,
- assimilation reports,
- decision records,
- sections,
- headings,
- anchors,
- stable identifiers,
- chunks,
- retrieval units,
- summaries,
- agent briefs,
- front matter,
- metadata records,
- links,
- backlinks,
- citations,
- references,
- external references,
- indexes,
- navigation views,
- document collections,
- shard references,
- source references,
- provenance records,
- version records,
- quality signals,
- embedding records,
- retrieval queries,
- retrieval results,
- and reuse contexts.

## 6.2 Out of Scope

This standard does not fully define:

- all semantic-web ontology modeling,
- all RDF/OWL reasoning,
- full CMS implementation,
- full search-engine ranking,
- all embedding algorithms,
- full vector-database implementation,
- all document authoring standards,
- all bibliography formats,
- all legal citation systems,
- all software repository structures,
- or every markdown dialect.

Those may be mapped, assimilated, profiled, or handled by adjacent standards.

---

# 7. Normative Language

The following terms are used normatively:

- **SHALL** indicates a mandatory rule for conformance.
- **SHOULD** indicates a recommended practice.
- **MAY** indicates an optional capability.
- **MUST NOT** indicates a prohibited practice.
- **SEED** marks a concept defined provisionally here but open to later refinement.
- **EXTRACT** marks a concept that may later move to a more specialized standard.

---

# 8. Core Principles

## 8.1 Markdown-First, Not Markdown-Only

The canonical working format SHOULD be Markdown with structured metadata, but the model SHOULD allow export to JSON, YAML, RDF, JSON-LD, graph databases, search indexes, and vector stores.
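
As an illustration of the export idea, the following is a minimal Python sketch that splits a Markdown artifact into metadata and body and emits a JSON record. It is not part of the standard. The function names `split_front_matter` and `export_json`, the JSON layout, and the restriction to flat `key: value` front matter are assumptions made for this example; real front matter with lists or nesting would need a proper YAML parser.

```python
import json


def split_front_matter(text: str) -> tuple[dict[str, str], str]:
    """Split a Markdown document into (metadata, body).

    Assumption: front matter is a block of flat `key: value` lines between
    two `---` lines at the top of the file. Nested YAML is not handled.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    meta: dict[str, str] = {}
    for i, line in enumerate(lines[1:], start=1):
        if line.strip() == "---":
            # Closing delimiter found: everything after it is the body.
            return meta, "\n".join(lines[i + 1:])
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    # No closing delimiter: treat the whole text as body.
    return {}, text


def export_json(text: str) -> str:
    """Export one Markdown artifact as a JSON record (hypothetical layout)."""
    meta, body = split_front_matter(text)
    return json.dumps({"metadata": meta, "body": body.strip()}, sort_keys=True)
```

Because the Markdown file stays the source and the JSON record is derived from it, the same function could feed a search index or a vector store without changing how authors write.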
## 8.2 Human-Readable and Machine-Retrievable

Every important artifact SHOULD be readable by humans and retrievable by machines.

## 8.3 Stable Identity Is Mandatory for Reuse

Artifacts, sections, concepts, mappings, profiles, and retrieval units SHOULD have stable identifiers.

## 8.4 Chunking Is a Design Concern

Documents SHOULD be structured so that retrieval chunks preserve meaning, context, and source traceability.

## 8.5 Links Are First-Class

Links, backlinks, references, citations, and mappings SHOULD be explicit and queryable where possible.

## 8.6 Provenance Is First-Class

Knowledge artifacts SHOULD preserve source, author, generator, change activity, review state, and influence where meaningful.

## 8.7 Views Are Not the Model

Indexes, navigation pages, diagrams, generated views, and dashboards are views over the information space, not the underlying knowledge itself.

## 8.8 Retrieval Must Be Evaluated

Useful information spaces SHOULD support retrieval-quality checks, stale-content detection, broken-link detection, and duplicate/conflict detection.

## 8.9 External Standards Are Mapped, Not Obeyed

The Information Space Model MAY map to SKOS, FAIR, PROV-O, Dublin Core, DCAT, Markdown, DITA, RDF, JSON-LD, static-site generators, and RAG tooling patterns. It MUST NOT subordinate its internal semantics to any single external model.

---

# 9. Canonical Seed Metadata

Every information-space artifact SHOULD support structured metadata.

Recommended front matter:

```yaml
---
id: itc-infospace:KnowledgeArtifact
type: concept
standard: InfoTechCanonInformationSpaceModel
standard_version: RC1-seed
status: candidate
canonical_owner: InfoTechCanonInformationSpaceModel
preferred_label: Knowledge Artifact
related:
  - itc-infospace:InformationSpace
  - itc-infospace:Document
  - itc-infospace:RetrievalUnit
  - itc-infospace:ProvenanceRecord
mappings:
  - itc-map:knowledge-artifact-to-prov-entity
---
```

Recommended artifact statuses:

```text
idea
draft
candidate
release-candidate
adopted
stable
deprecated
retired
```

Recommended content statuses:

```text
raw
captured
draft
reviewed
candidate
canonical
deprecated
superseded
archived
```

---

# 10. Root Information Space Taxonomy

```text
InformationSpaceEntity
├── SpaceEntity
│   ├── InformationSpace
│   ├── KnowledgeBase
│   ├── Infospace
│   ├── RepositorySpace
│   ├── Shard
│   ├── Collection
│   └── Corpus
├── ArtifactEntity
│   ├── KnowledgeArtifact
│   ├── Document
│   ├── MarkdownDocument
│   ├── ConceptPage
│   ├── StandardDocument
│   ├── PatternDocument
│   ├── ProfileDocument
│   ├── MappingDocument
│   ├── AssimilationReport
│   ├── DecisionRecord
│   └── AgentBrief
├── StructureEntity
│   ├── Section
│   ├── Heading
│   ├── Anchor
│   ├── Block
│   ├── Chunk
│   ├── RetrievalUnit
│   ├── Summary
│   └── Excerpt
├── LinkEntity
│   ├── Link
│   ├── Backlink
│   ├── CrossReference
│   ├── Citation
│   ├── SourceReference
│   ├── ExternalReference
│   ├── MappingReference
│   └── DependencyReference
├── MetadataEntity
│   ├── FrontMatter
│   ├── MetadataRecord
│   ├── Identifier
│   ├── Namespace
│   ├── Label
│   ├── Alias
│   ├── Status
│   └── VersionRecord
├── RetrievalEntity
│   ├── Index
│   ├── SearchIndex
│   ├── VectorIndex
│   ├── EmbeddingRecord
│   ├── RetrievalQuery
│   ├── RetrievalResult
│   ├── RetrievalContext
│   └── RetrievalEvaluation
├── ProvenanceEntity
│   ├── ProvenanceRecord
│   ├── Source
│   ├── Activity
│   ├── AgentReference
│   ├── Generation
│   ├── Revision
│   ├── Influence
│   └── ReviewRecord
├── ViewEntity
│   ├── NavigationView
│   ├── TopicIndex
│   ├── ConceptIndex
│   ├── RelationshipIndex
│   ├── MapView
│   ├── GraphView
│   └── UsePath
└── QualityEntity
    ├── BrokenLink
    ├── DuplicateContent
    ├── StaleContent
    ├── ConflictingDefinition
    ├── MissingMetadata
    ├── LowRetrievalQuality
    └── OrphanArtifact
```

---

# 11. Core Concepts

## 11.1 InformationSpace

An **InformationSpace** is a bounded, navigable, retrievable, and evolving body of knowledge.

Examples:

```text
InfoTechCanon repository
project wiki
standards library
markdown knowledge base
research corpus
documentation space
agent-readable context repository
```

---

## 11.2 KnowledgeBase

A **KnowledgeBase** is an information space organized to preserve, retrieve, and reuse knowledge.

---

## 11.3 Infospace

An **Infospace** is a structured knowledge environment optimized for navigation, recombination, retrieval, and reuse.

In InfoTechCanon, an infospace is expected to be markdown-first and machine-indexable.

---

## 11.4 RepositorySpace

A **RepositorySpace** is an information space backed by a source repository.

---

## 11.5 Shard

A **Shard** is an independently maintained portion of an information space that can be attached, federated, cached, or overlaid with other shards.

This concept supports shard-wiki-like federation.

---

## 11.6 Collection

A **Collection** is a curated group of knowledge artifacts.

---

## 11.7 Corpus

A **Corpus** is a body of documents or artifacts used for search, retrieval, analysis, or training-like reference.

---

## 11.8 KnowledgeArtifact

A **KnowledgeArtifact** is any identifiable artifact that carries reusable knowledge.

Examples:

```text
standard document
concept page
pattern document
profile
mapping file
assimilation report
decision record
agent brief
example
schema
diagram
```

---

## 11.9 Document

A **Document** is a knowledge artifact primarily represented as ordered textual or structured content.

---

## 11.10 MarkdownDocument

A **MarkdownDocument** is a document represented in Markdown or a Markdown-compatible dialect.

---

## 11.11 ConceptPage

A **ConceptPage** is a document or section that defines and explains one canonical concept.

---

## 11.12 StandardDocument

A **StandardDocument** is a document defining a standard, its scope, concepts, relationships, validation rules, mappings, and profiles.

---

## 11.13 PatternDocument

A **PatternDocument** is a document describing a recurring problem, forces, solution, resulting context, variants, and related patterns.

---

## 11.14 ProfileDocument

A **ProfileDocument** is a document defining constraints and implementation guidance for a specific context.

---

## 11.15 MappingDocument

A **MappingDocument** defines mappings between InfoTechCanon concepts and external standards, regulations, tools, vocabularies, or product schemas.

---

## 11.16 AssimilationReport

An **AssimilationReport** documents the analysis of an external body of knowledge, including extracted concepts, gaps, conflicts, mappings, and proposed canon changes.

---

## 11.17 DecisionRecord

A **DecisionRecord** records a decision, context, options, rationale, consequences, and review trigger.

---

## 11.18 AgentBrief

An **AgentBrief** is a compact, retrieval-optimized document summarizing a standard, profile, pattern, or subsystem for AI-agent use.

Recommended content:

```text
purpose
scope
owned concepts
imported concepts
do / do not rules
common mistakes
minimal examples
mapping targets
validation hints
```

---

## 11.19 Section

A **Section** is a named portion of a document.

---

## 11.20 Heading

A **Heading** is a section title used for human navigation and machine chunking.

---

## 11.21 Anchor

An **Anchor** is a stable target within an artifact.

Anchors SHOULD remain stable across non-breaking edits.

---

## 11.22 Block

A **Block** is a structurally meaningful piece of content such as a paragraph, list, table, code block, callout, or diagram block.

---

## 11.23 Chunk

A **Chunk** is a segment of content prepared for retrieval, indexing, embedding, citation, or context assembly.
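
One way to prepare such segments is to split a document at its headings and keep the heading path with each piece. The Python sketch below illustrates that idea only; the `Chunk` fields and the `chunk_by_heading` function are hypothetical names for this example, and the sketch deliberately ignores cases such as `#` lines inside fenced code blocks.

```python
import re
from dataclasses import dataclass


@dataclass
class Chunk:
    artifact_id: str                 # id of the parent artifact
    heading_path: tuple[str, ...]    # headings from document root to this section
    text: str                        # section content without the heading line


# ATX headings only: one to six '#' characters followed by the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_by_heading(artifact_id: str, markdown: str) -> list[Chunk]:
    """Split Markdown into one chunk per heading-delimited section."""
    chunks: list[Chunk] = []
    path: list[str] = []
    buffer: list[str] = []

    def flush() -> None:
        text = "\n".join(buffer).strip()
        if text:
            chunks.append(Chunk(artifact_id, tuple(path), text))
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section under its own path
            level = len(match.group(1))
            del path[level - 1:]  # drop headings at the same or deeper level
            path.append(match.group(2).strip())
        else:
            buffer.append(line)
    flush()
    return chunks
```

Each chunk carries its parent artifact id and heading path, so a retrieved chunk can be traced back to its place in the source document.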

Canonical rule:

```text
A Chunk SHOULD preserve enough context to be meaningful when retrieved independently.
```

---

## 11.24 RetrievalUnit

A **RetrievalUnit** is a retrievable unit of knowledge.

A retrieval unit may be:

```text
document
section
chunk
concept page
pattern
profile
mapping
example
agent brief
```

---

## 11.25 Summary

A **Summary** is a compressed representation of an artifact or retrieval unit.

---

## 11.26 Excerpt

An **Excerpt** is a quoted or extracted part of a source.

Excerpts SHOULD preserve source reference and usage constraints.

---

## 11.27 Link

A **Link** is a directed reference from one artifact or section to another.

---

## 11.28 Backlink

A **Backlink** is an inverse view of a link.

---

## 11.29 CrossReference

A **CrossReference** is a link between related artifacts, concepts, patterns, profiles, or sections.

---

## 11.30 Citation

A **Citation** is a reference to a source used to support a claim, definition, mapping, or statement.

---

## 11.31 SourceReference

A **SourceReference** identifies the source from which information was derived, quoted, summarized, mapped, or assimilated.

---

## 11.32 ExternalReference

An **ExternalReference** points to a source outside the information space.

---

## 11.33 MappingReference

A **MappingReference** points from an artifact to a mapping record or external concept mapping.

---

## 11.34 DependencyReference

A **DependencyReference** indicates that one artifact depends on another for meaning, validity, or interpretation.

---

## 11.35 FrontMatter

**FrontMatter** is structured metadata embedded at the beginning of a Markdown document.

---

## 11.36 MetadataRecord

A **MetadataRecord** is structured data describing an artifact, section, chunk, index entry, or retrieval unit.

---

## 11.37 Identifier

An **Identifier** is a stable reference string for an artifact or entity.
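
The identifiers used in this document's examples, such as `itc-infospace:KnowledgeArtifact`, follow a `namespace:local-name` shape. The Python sketch below shows how a tool might parse that shape. The exact grammar in `ID_PATTERN`, the `CanonId` type, and the `parse_id` function are assumptions for illustration; the standard does not prescribe an identifier syntax here.

```python
import re
from typing import NamedTuple


class CanonId(NamedTuple):
    namespace: str
    local_name: str


# Assumed grammar: lowercase namespace, a colon, then a local name made of
# letters, digits, and hyphens. This is a guess based on the examples only.
ID_PATTERN = re.compile(r"^(?P<ns>[a-z][a-z0-9-]*):(?P<name>[A-Za-z][A-Za-z0-9-]*)$")


def parse_id(value: str) -> CanonId:
    """Parse 'namespace:LocalName' or raise ValueError if it does not match."""
    match = ID_PATTERN.match(value)
    if match is None:
        raise ValueError(f"not a namespaced identifier: {value!r}")
    return CanonId(match["ns"], match["name"])
```

For example, `parse_id("itc-infospace:KnowledgeArtifact")` yields the namespace `itc-infospace` and the local name `KnowledgeArtifact`, which lets a validator check uniqueness within each namespace.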

Recommended properties:

```text
stable
unique within namespace
human-readable when practical
machine-parseable
version-aware where needed
```

---

## 11.38 Namespace

A **Namespace** is a naming scope used to prevent identifier collisions.

Examples:

```text
itc-core
itc-land
itc-org
itc-gov
itc-task
itc-tag
itc-access
itc-sec
itc-data
itc-devsecops
itc-net
itc-obs
itc-infospace
```

---

## 11.39 Label

A **Label** is a human-readable name for an artifact or concept.

---

## 11.40 Alias

An **Alias** is an alternative label or name.

---

## 11.41 Status

A **Status** indicates lifecycle or review state.

---

## 11.42 VersionRecord

A **VersionRecord** records artifact version, change, compatibility, and supersession information.

---

## 11.43 Index

An **Index** is a structured access path into an information space.

Examples:

```text
concept index
standard index
pattern index
mapping index
profile index
source index
status index
external standard index
```

---

## 11.44 SearchIndex

A **SearchIndex** supports lexical or semantic search.

---

## 11.45 VectorIndex

A **VectorIndex** supports embedding-based retrieval.

---

## 11.46 EmbeddingRecord

An **EmbeddingRecord** stores or references an embedding for a retrieval unit.

Recommended attributes:

```yaml
retrieval_unit:
embedding_model:
embedding_version:
created_at:
source_hash:
chunking_strategy:
```

---

## 11.47 RetrievalQuery

A **RetrievalQuery** is a query used to find relevant artifacts or retrieval units.

---

## 11.48 RetrievalResult

A **RetrievalResult** is the result of a retrieval query.

It SHOULD preserve rank, source, retrieval unit, score, and snippet/context where possible.

---

## 11.49 RetrievalContext

A **RetrievalContext** is the assembled set of retrieval results, summaries, and metadata used for human or agent work.

---

## 11.50 RetrievalEvaluation

A **RetrievalEvaluation** assesses retrieval quality.

Examples:

```text
relevance
coverage
freshness
precision
recall
source diversity
citation correctness
staleness
```

---

## 11.51 ProvenanceRecord

A **ProvenanceRecord** documents how an artifact, concept, mapping, or retrieval unit came to exist or change.

---

## 11.52 Source

A **Source** is an origin of information.

Examples:

```text
external standard
uploaded document
web page
internal decision
agent-generated draft
manual authoring
assimilation report
```

---

## 11.53 Activity

An **Activity** is an action that generated, modified, reviewed, mapped, assimilated, or published an artifact.

---

## 11.54 AgentReference

An **AgentReference** points to a human, software agent, organization, or tool responsible for or involved in an activity.

---

## 11.55 Generation

A **Generation** records creation of an artifact or retrieval unit.

---

## 11.56 Revision

A **Revision** records modification of an artifact.

---

## 11.57 Influence

An **Influence** records that one source, artifact, activity, or agent influenced another.

---

## 11.58 ReviewRecord

A **ReviewRecord** records review activity, reviewer, outcome, and comments.

---

## 11.59 NavigationView

A **NavigationView** is a human-oriented view for browsing an information space.

---

## 11.60 TopicIndex

A **TopicIndex** organizes artifacts by topic.

---

## 11.61 ConceptIndex

A **ConceptIndex** organizes concept pages or concept definitions.

---

## 11.62 RelationshipIndex

A **RelationshipIndex** organizes relationships among artifacts or concepts.

---

## 11.63 MapView

A **MapView** visualizes or lists relationships, dependencies, domains, or concept mappings.

---

## 11.64 GraphView

A **GraphView** represents the information space as nodes and edges.

---

## 11.65 UsePath

A **UsePath** is a guided path through the information space for a common user intent.

Examples:

```text
I want to model a new subsystem.
I want to map an external standard.
I want to create a task-tag profile.
I want to onboard an agent.
I want to check conformance.
```

---

## 11.66 BrokenLink

A **BrokenLink** is a link whose target cannot be resolved.

---

## 11.67 DuplicateContent

**DuplicateContent** is overlapping or repeated content that may create drift.

---

## 11.68 StaleContent

**StaleContent** is content whose age, supersession state, or source drift reduces trust.

---

## 11.69 ConflictingDefinition

A **ConflictingDefinition** is a contradiction between definitions or concept usage.

---

## 11.70 MissingMetadata

**MissingMetadata** is required or expected metadata that is absent.

---

## 11.71 LowRetrievalQuality

**LowRetrievalQuality** indicates poor retrieval performance for a query, use path, or artifact set.

---

## 11.72 OrphanArtifact

An **OrphanArtifact** is an artifact with no incoming links, no index membership, no owner, or no retrieval path.

---

# 12. Core Relationship Vocabulary

Recommended root relationship types:

```text
contains
part_of
defines
describes
summarizes
references
cites
links_to
backlinks_to
maps_to
depends_on
derived_from
generated_by
revised_by
reviewed_by
influenced_by
supersedes
deprecated_by
chunks_into
indexed_by
retrieved_by
embedded_as
has_view
belongs_to_space
belongs_to_collection
evidenced_by
```

Relationship records SHOULD support:

```yaml
id:
relationship_type:
source_entity:
target_entity:
scope:
valid_from:
valid_to:
source_system:
confidence:
evidence:
rationale:
```

---

# 13. Information Space State Models

## 13.1 Artifact States

```text
raw
captured
draft
reviewed
candidate
canonical
deprecated
superseded
archived
deleted
```

## 13.2 Retrieval Unit States

```text
active
stale
invalidated
deprecated
superseded
excluded
needs_rechunking
```

## 13.3 Link States

```text
active
broken
redirected
deprecated
ambiguous
external_unverified
```

## 13.4 Review States

```text
not_reviewed
under_review
reviewed
changes_requested
approved
rejected
needs_revalidation
```

## 13.5 Index States

```text
fresh
stale
partial
rebuilding
failed
deprecated
```

---

# 14. Information Space Patterns

## 14.1 Pattern: Markdown with Structured Front Matter

**Context:** Humans need readable documents, while tools need structured metadata.

**Problem:** Pure prose is hard to index and validate; pure data is hard to author.

**Solution:** Use Markdown for content and front matter for structured metadata.

---

## 14.2 Pattern: Concept Page per Canonical Concept

**Context:** Concepts need stable definitions.

**Problem:** Definitions drift when scattered across documents.

**Solution:** Create one canonical concept page per important concept and link other documents to it.

---

## 14.3 Pattern: Chunk with Parent Context

**Context:** Agents retrieve chunks of documents.

**Problem:** Retrieved chunks lose meaning if separated from document context.

**Solution:** Each chunk should preserve parent artifact, section path, heading, concept identifiers, source, and version.

---

## 14.4 Pattern: Agent Brief

**Context:** Agents need compact guidance.

**Problem:** Full standards are too large for routine retrieval.

**Solution:** Provide agent briefs summarizing scope, owned concepts, imports, do/do-not rules, patterns, and examples.

---

## 14.5 Pattern: Use Path Navigation

**Context:** Humans and agents approach the canon with tasks, not just topics.

**Problem:** A concept index alone does not explain where to start.

**Solution:** Provide UsePath documents that guide common activities through relevant standards, patterns, profiles, and examples.

---

## 14.6 Pattern: Source-Carrying Summary

**Context:** Summaries are useful but may detach from evidence.

**Problem:** Unsourced summaries become untrustworthy.

**Solution:** Summaries SHOULD retain source references, provenance, generation activity, and review state.

---

## 14.7 Pattern: Mapping as Linkable Artifact

**Context:** External standards and internal concepts must stay aligned.

**Problem:** Mapping notes buried in prose cannot be maintained.

**Solution:** Represent mappings as first-class artifacts with source concept, target concept, mapping type, scope, confidence, rationale, and version.

---

## 14.8 Pattern: Assimilation Folder

**Context:** New external knowledge bodies must be digested.

**Problem:** Research notes disappear after standards are updated.

**Solution:** Each assimilation should produce a folder with source summary, extracted concepts, comparison matrix, mappings, proposed changes, and open questions.

---

## 14.9 Pattern: View Not Source

**Context:** Generated indexes and diagrams are useful.

**Problem:** Teams edit generated views as if they were canonical source.

**Solution:** Mark generated views clearly and regenerate them from canonical artifacts.

---

## 14.10 Pattern: Retrieval Quality Loop

**Context:** Agents depend on retrieval.

**Problem:** Retrieval failures cause hallucination, contradiction, or stale answers.

**Solution:** Track retrieval queries, expected results, misses, stale hits, duplicate hits, and quality fixes.

---

# 15. Information Space Profiles

## 15.1 Profile Format

An Information Space Profile SHALL declare:

```yaml
id:
profile_name:
status:
implements:
  - InfoTechCanonInformationSpaceModel
target_context:
included_concepts:
required_metadata:
required_indexes:
chunking_rules:
source_of_truth_rules:
mapping_files:
validation_rules:
examples:
known_deviations:
```

---

## 15.2 Seed Profile: InfoTechCanon Repository Profile

Purpose:

```text
Define the expected structure for the info-tech-canon repository.
```

Required top-level files:

```text
README.md
INTENT.md
SCOPE.md
canon.yaml
```

Recommended directories:

```text
standards/
patterns/
profiles/
mappings/
assimilation/
schemas/
views/
agent/
examples/
validation/
```

Required indexes:

```text
by-standard
by-concept
by-pattern
by-profile
by-mapping-target
by-status
use-paths
```

---

## 15.3 Seed Profile: Markdown Infospace Profile

Purpose:

```text
Define a general profile for markdown-first knowledge spaces.
```

Required concepts:

```text
MarkdownDocument
FrontMatter
Section
Anchor
Link
Backlink
Index
RetrievalUnit
ProvenanceRecord
```

Recommended front matter:

```yaml
id:
title:
type:
status:
owner:
created_at:
updated_at:
tags:
related:
sources:
```

---

## 15.4 Seed Profile: Agent-Retrievable Standards Profile

Purpose:

```text
Make standards retrievable and usable by AI agents.
```

Required artifacts:

```text
standard.md
agent-brief.md
concept index
relationship index
profile index
mapping index
examples
validation rules
```

Chunking rules:

```text
chunk by major section
preserve heading path
preserve artifact id
preserve concept ids
include summary chunks
exclude generated noise
```

---

## 15.5 Seed Profile: Assimilation Workspace Profile

Purpose:

```text
Define how external bodies of knowledge are analyzed and assimilated.
```

Required files:

```text
ASSIMILATION.md
source-summary.md
extracted-concepts.yaml
comparison-matrix.md
mappings.yaml
proposed-changes.md
open-questions.md
```

---

## 15.6 Seed Profile: Sharded Wiki Profile

Purpose:

```text
Support federated markdown knowledge spaces where multiple shards attach around shared root entities.
```

Included concepts:

```text
Shard
ShardRoot
Overlay
RemoteReference
PatchProposal
MergeRequestReference
CachedArtifact
ShardBoundary
```

Known deviations:

```text
Shard synchronization and merge mechanics are implementation-specific.
```

---

## 15.7 Seed Profile: RAG Corpus Profile

Purpose:

```text
Prepare an information space for retrieval-augmented generation.
```

Included concepts:

```text
Corpus
RetrievalUnit
Chunk
EmbeddingRecord
SearchIndex
VectorIndex
RetrievalQuery
RetrievalResult
RetrievalEvaluation
```

Required metadata:

```text
source id
artifact id
section path
chunk id
version
source hash
embedding model
created_at
```

---

# 16. Mapping Model for the Information Space Standard

Mappings relate InfoTechCanon information-space concepts to external standards, frameworks, and tools.
## 16.1 Mapping Types

Recommended mapping types:

```text
exactMatch
closeMatch
broadMatch
narrowMatch
relatedMatch
conflictMatch
gapMatch
derivedFrom
regulatoryReference
toolEquivalent
```

## 16.2 Mapping Record

Example:

```yaml
id: itc-map:concept-page-to-skos-concept
source_concept: itc-infospace:ConceptPage
target_body: SKOS
target_version: "2009"
target_concept: skos:Concept
mapping_type: relatedMatch
scope:
  - knowledge organization and concept documentation
not_valid_for:
  - all SKOS semantic constraints
rationale: >
  A ConceptPage documents an InfoTechCanon concept, while skos:Concept
  represents a conceptual resource in a concept scheme. They are related
  but not identical: the page is a documentation artifact, the concept
  is the meaning being documented.
confidence: medium
status: candidate
owner: InfoTechCanonInformationSpaceModel
```

## 16.3 Seed Mapping Targets

The Information Space Model SHOULD maintain mappings to:

```text
SKOS
FAIR principles
PROV-O
Dublin Core
Singapore Framework for Dublin Core Application Profiles
DCAT
RDF / JSON-LD
Markdown / CommonMark
YAML front matter conventions
Git repository concepts
static site generator concepts
Obsidian / wiki-link conventions
Zettelkasten note patterns
DITA topic concepts
schema.org CreativeWork / Dataset
RAG / vector index tool schemas
```

---

# 17. Assimilation Hooks

The Information Space Model SHALL be able to receive new knowledge-organization, metadata, documentation, retrieval, and wiki systems through the InfoTechCanon assimilation process.
## 17.1 Assimilation Triggers

Assimilation may be triggered by:

```text
new metadata standard
new knowledge organization model
new wiki engine
new markdown convention
new documentation generator
new RAG architecture
new retrieval evaluation method
new citation model
new provenance standard
new agent context-management pattern
```

## 17.2 Information Space Assimilation Output

An information-space assimilation SHOULD produce:

```text
source summary
extracted information-space concepts
concept comparison matrix
gap list
conflict list
mapping file
candidate new concepts
candidate relationship changes
candidate pattern changes
candidate profile changes
open questions
```

## 17.3 Recommended First Assimilation Candidates

```text
SKOS
FAIR principles
PROV-O
Dublin Core / Singapore Framework
CommonMark / Markdown conventions
Obsidian / wiki-link practice
Zettelkasten note practice
DITA topic architecture
RAG corpus and chunking patterns
static site generator metadata conventions
```

---

# 18. Integration with Other InfoTechCanon Standards

## 18.1 Core

Information Space uses Core concepts for:

```text
Concept
Standard
Pattern
Profile
Mapping
Assimilation
Version
Conformance
CanonicalOwner
```

## 18.2 Tagging

Information Space uses tags for:

```text
topic
status
artifact type
domain
mapping target
retrieval group
```

## 18.3 Data

Data treats corpora, indexes, embeddings, and retrieval results as data assets where needed.
## 18.4 Governance

Governance applies to:

```text
review state
approval
evidence
publication status
deprecation
retention
access policy
```

## 18.5 DevSecOps

DevSecOps tracks:

```text
repository changes
build generation
publication pipelines
index generation
release of standards
```

## 18.6 Observability

Observability tracks:

```text
retrieval quality
index freshness
broken links
agent usage
search failures
```

## 18.7 Security and Access Control

Security and Access Control apply to:

```text
sensitive documents
restricted knowledge
credentials in documentation
agent access to knowledge
index access
retrieval audit
```

---

# 19. Canon Interface Card Usage

Subsystems that implement or produce information-space knowledge SHOULD publish a Canon Interface Card.

Example:

```yaml
subsystem: markitect-tool
implements:
  - InfoTechCanonInformationSpaceModel
  - MarkdownInfospaceProfile
produces:
  - MarkdownDocument
  - FrontMatter
  - Index
  - RetrievalUnit
  - Link
  - ValidationResult
consumes:
  - StandardDocument
  - ConceptPage
  - MappingDocument
relations:
  - MarkdownDocument chunks_into RetrievalUnit
  - RetrievalUnit indexed_by SearchIndex
  - Link references KnowledgeArtifact
source_of_truth:
  markdown_artifacts: git_repository
known_deviations:
  - embedding storage may be external
  - generated indexes may be rebuilt from source
```

---

# 20. Retrieval Requirements

The Information Space Model is itself designed for retrieval.

## 20.1 Required Retrieval Properties

Every major artifact SHOULD provide:

- stable identifier,
- stable title,
- artifact type,
- status,
- owner or steward,
- source references,
- related artifacts,
- headings,
- anchors,
- summary,
- front matter,
- and retrievable sections.
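The metadata-level properties above lend themselves to an automated check. The sketch below is illustrative only: the front matter key names (`id`, `title`, `artifact_type`, and so on) are assumptions, since this standard does not fix them here, and body-level properties such as headings, anchors, and retrievable sections are left out because they require parsing the document itself.

```python
# Hypothetical front matter keys; the standard lists the properties
# but does not prescribe these exact key names.
REQUIRED_KEYS = (
    "id", "title", "artifact_type", "status",
    "source_references", "related_artifacts", "summary",
)


def _is_blank(value) -> bool:
    """Treat None and whitespace-only strings as absent; empty lists are allowed."""
    return value is None or (isinstance(value, str) and not value.strip())


def missing_retrieval_properties(front_matter: dict) -> list[str]:
    """List the required retrieval properties that the front matter lacks."""
    missing = [key for key in REQUIRED_KEYS if _is_blank(front_matter.get(key))]
    # 20.1 asks for an owner or a steward, so either key satisfies the check.
    if _is_blank(front_matter.get("owner")) and _is_blank(front_matter.get("steward")):
        missing.append("owner or steward")
    return missing


concept_page = {
    "id": "itc-infospace:RetrievalUnit",
    "title": "Retrieval Unit",
    "artifact_type": "ConceptPage",
    "status": "candidate",
    "steward": "InfoTechCanonInformationSpaceModel",
    "source_references": [],
    "related_artifacts": ["itc-infospace:Chunk"],
}

print(missing_retrieval_properties(concept_page))  # ['summary']
```

An empty list is treated as a valid value so that an artifact can state explicitly that it has no source references or related artifacts yet.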
## 20.2 Agent Brief

A mature Information Space Model SHOULD include an `agent-brief.md` file with:

```text
purpose
scope
owned concepts
imported concepts
artifact types
front matter rules
chunking rules
retrieval rules
do / do not rules
common mistakes
profile list
mapping list
```

## 20.3 Indexes

The information space SHOULD provide indexes by:

```text
artifact
concept
standard
pattern
profile
mapping
source
status
owner
tag
external reference
retrieval unit
use path
```

---

# 21. Conformance Levels

## 21.1 Reference-Conformant

A repository or document set is reference-conformant if it uses Information Space terminology consistently but does not implement structured metadata or validation rules.

## 21.2 Metadata-Conformant

A repository or document set is metadata-conformant if major artifacts have structured metadata and stable identifiers.

## 21.3 Link-Conformant

A repository or document set is link-conformant if internal links, backlinks, citations, and references are represented and checkable.

## 21.4 Retrieval-Conformant

A repository or document set is retrieval-conformant if artifacts are chunked, indexed, and retrievable with stable source context.

## 21.5 Provenance-Conformant

A repository or document set is provenance-conformant if artifacts and important changes preserve source, activity, agent, and review records.

## 21.6 Profile-Conformant

A repository or document set is profile-conformant if it implements a declared Information Space Profile and passes its validation rules.

## 21.7 Assimilation-Conformant

A repository or document set is assimilation-conformant if it can represent assimilation workspaces and produce mappings, gaps, conflicts, and proposed changes.

---

# 22. Validation Rules

Initial validation rules:

```text
VAL-INFOSPACE-001: Every major KnowledgeArtifact SHOULD have a stable id.
VAL-INFOSPACE-002: Every StandardDocument SHOULD declare status, version, owner, and scope.
VAL-INFOSPACE-003: Every ConceptPage SHOULD define exactly one primary concept.
VAL-INFOSPACE-004: Generated views SHOULD be marked as generated or derived.
VAL-INFOSPACE-005: Internal links SHOULD resolve to existing artifacts or anchors.
VAL-INFOSPACE-006: External references SHOULD include source, access date or source version where relevant.
VAL-INFOSPACE-007: RetrievalUnit SHOULD preserve artifact id, section path, version, and source context.
VAL-INFOSPACE-008: EmbeddingRecord SHOULD reference source hash, embedding model, and chunking strategy.
VAL-INFOSPACE-009: Summary SHOULD reference the artifact or retrieval unit it summarizes.
VAL-INFOSPACE-010: AgentBrief SHOULD be derived from or reviewed against the full artifact.
VAL-INFOSPACE-011: AssimilationReport SHOULD include source summary, extracted concepts, comparison matrix, mappings, proposed changes, and open questions.
VAL-INFOSPACE-012: MappingDocument SHOULD declare source concept, target body, target concept, mapping type, scope, confidence, and rationale.
VAL-INFOSPACE-013: Deprecated artifacts SHOULD reference replacements where available.
VAL-INFOSPACE-014: Orphan artifacts SHOULD be reviewed for indexing, linking, archiving, or deletion.
VAL-INFOSPACE-015: Conflicting definitions SHOULD create review work or mapping notes.
VAL-INFOSPACE-016: Sensitive knowledge artifacts SHOULD reference Access Control, Security, Data, or Governance constraints where relevant.
VAL-INFOSPACE-017: Tags MUST NOT replace stable identifiers, links, mappings, or metadata.
VAL-INFOSPACE-018: Profiles MUST NOT redefine canonical concepts. They may constrain them.
```

---

# 23. Anti-Patterns

## 23.1 Markdown Pile

A folder full of Markdown files without stable IDs, indexes, links, or metadata.

## 23.2 Chunk Soup

Chunks created for retrieval without preserving document context, heading path, source, or version.

## 23.3 Summary Without Source

Summaries detached from the source artifacts they summarize.
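The three anti-patterns above share a cause: derived units lose track of where they came from. As a hedged illustration of the alternative (and of VAL-INFOSPACE-007), the sketch below splits Markdown at headings and keeps the artifact id, version, and heading path on every unit. The `RetrievalUnit` field names and the `chunk_by_heading` function are hypothetical, and the splitter is deliberately simplified: it ignores fenced code blocks, so a `#` line inside a fence would be misread as a heading.

```python
import hashlib
import re
from dataclasses import dataclass

# ATX headings only ("# Title" through "###### Title").
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class RetrievalUnit:
    artifact_id: str
    version: str
    section_path: tuple[str, ...]  # headings from the top level down to this section
    text: str
    text_hash: str  # SHA-256 of the chunk text, usable for staleness checks


def chunk_by_heading(markdown: str, artifact_id: str, version: str) -> list[RetrievalUnit]:
    units: list[RetrievalUnit] = []
    path: list[tuple[int, str]] = []  # stack of (heading level, heading title)
    body: list[str] = []

    def flush() -> None:
        """Emit the text collected under the current heading path, if any."""
        text = "\n".join(body).strip()
        if text:
            units.append(RetrievalUnit(
                artifact_id=artifact_id,
                version=version,
                section_path=tuple(title for _, title in path),
                text=text,
                text_hash=hashlib.sha256(text.encode("utf-8")).hexdigest(),
            ))
        body.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            level = len(match.group(1))
            # Drop headings at the same or deeper level before descending again.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2)))
        else:
            body.append(line)
    flush()
    return units


sample = (
    "# 23. Anti-Patterns\n\n"
    "## 23.2 Chunk Soup\n\nChunks created without context.\n\n"
    "## 23.3 Summary Without Source\n\nSummaries detached from sources.\n"
)
for unit in chunk_by_heading(sample, "itc-infospace:standard", "RC1-seed"):
    print(" > ".join(unit.section_path), "|", unit.text)
```

Carrying the heading path on the unit itself means a retrieved chunk can be cited and traced back without a second lookup, which is the property the Chunk Soup anti-pattern lacks.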
## 23.4 Link Rot Inside the Repo

Internal links break because anchors and file paths are not validated.

## 23.5 View as Source

Generated indexes or diagrams are edited manually and diverge from canonical artifacts.

## 23.6 Embedding Without Provenance

Embeddings are stored without model, source hash, chunking strategy, or creation time.

## 23.7 Concept Drift by Duplication

The same concept is defined in multiple places without canonical ownership.

## 23.8 Agent Brief as Replacement

Agents use compact briefs that are stale or inconsistent with full standards.

## 23.9 Retrieval Without Evaluation

Search and RAG are used without tests for relevance, freshness, and citation correctness.

## 23.10 External Standard Copy-Paste

External standards are copied into the information space without mapping, assimilation, or source boundaries.

---

# 24. Initial Repository Placement

Recommended repository layout:

```text
info-tech-canon/
  standards/
    information-space/
      InfoTechCanonInformationSpaceModel.md
      agent-brief.md
      concepts/
      relationships/
      patterns/
      profiles/
      mappings/
      assimilation/
      examples/
      validation/
```

Seed files:

```text
standards/information-space/InfoTechCanonInformationSpaceModel.md
standards/information-space/agent-brief.md
standards/information-space/concepts/information-space.md
standards/information-space/concepts/knowledge-artifact.md
standards/information-space/concepts/retrieval-unit.md
standards/information-space/concepts/chunk.md
standards/information-space/concepts/index.md
standards/information-space/concepts/agent-brief.md
standards/information-space/concepts/provenance-record.md
standards/information-space/patterns/markdown-with-structured-front-matter.md
standards/information-space/patterns/concept-page-per-canonical-concept.md
standards/information-space/patterns/chunk-with-parent-context.md
standards/information-space/patterns/agent-brief.md
standards/information-space/profiles/infotechcanon-repository-profile.md
standards/information-space/profiles/markdown-infospace-profile.md
standards/information-space/profiles/agent-retrievable-standards-profile.md
standards/information-space/profiles/assimilation-workspace-profile.md
standards/information-space/profiles/rag-corpus-profile.md
standards/information-space/mappings/skos.yaml
standards/information-space/mappings/fair.yaml
standards/information-space/mappings/prov-o.yaml
standards/information-space/mappings/dublin-core.yaml
```

---

# 25. Roadmap

## Phase 1: Seed Stabilization

- Establish this standard as `InfoTechCanonInformationSpaceModel`.
- Add seed concepts, relationship vocabulary, patterns, and profiles.
- Define validation rules.
- Align with Core, Tagging, Data, Governance, DevSecOps, Observability, Security, and Access Control.

## Phase 2: First Assimilations

Recommended first assimilations:

```text
SKOS
FAIR principles
PROV-O
Dublin Core / Singapore Framework
CommonMark / Markdown conventions
Obsidian / wiki-link practice
Zettelkasten note practice
DITA topic architecture
RAG corpus and chunking patterns
```

## Phase 3: Profile Maturation

- Mature InfoTechCanon Repository Profile.
- Mature Markdown Infospace Profile.
- Mature Agent-Retrievable Standards Profile.
- Mature Assimilation Workspace Profile.
- Mature Sharded Wiki Profile.
- Mature RAG Corpus Profile.

## Phase 4: Tooling Integration

- Generate concept indexes.
- Generate agent briefs.
- Generate chunk manifests.
- Generate machine-readable YAML/JSON exports.
- Add validation scripts.
- Add broken-link checks.
- Add stale-content checks.
- Add retrieval-quality tests.
- Integrate with markitect-tool, kontextual-engine, shard-wiki, llm-connect, and phase-memory.

## Phase 5: Knowledge Intelligence Loop

- Track retrieval failures.
- Track stale concepts.
- Track conflicting definitions.
- Track missing mappings.
- Track assimilation backlog.
- Generate improvement tasks.
- Use agent feedback to refine chunks, briefs, indexes, and profiles.

---

# 26. Summary

The InfoTechCanon Information Space Model is the seed standard for representing markdown-first, human-readable, machine-retrievable, provenance-aware, interconnected knowledge spaces.

Its most important commitments are:

```text
Separate domain meaning from knowledge-space packaging.

Treat documents, sections, chunks, retrieval units, links, citations, indexes, summaries, agent briefs, provenance, and mappings as first-class artifacts.

Make markdown useful for both humans and agents through structured metadata, stable identifiers, chunking rules, source references, and validation.

Map to SKOS, FAIR, PROV-O, Dublin Core, Markdown, and RAG practices without surrendering internal semantic autonomy.

Use profiles to make the model practical for the InfoTechCanon repository, markdown infospaces, sharded wikis, assimilation workspaces, and agent retrieval.
```

This makes the Information Space Model the structural substrate for turning InfoTechCanon from a collection of documents into a living, reusable, agent-operable knowledge system.