# CMIS Compliance Assessment
Date: 2026-05-06

Status: planning baseline for CMIS compliance and access-point implementation.
## Reference Standard
Target CMIS version: OASIS Content Management Interoperability Services
Version 1.1, OASIS Standard, approved 23 May 2013, including approved errata
where applicable.

CMIS defines a domain model plus Web Services, AtomPub, and Browser JSON
bindings for one or more content repositories. The standard explicitly allows a
CMIS endpoint to expose more than one repository and does not require every
underlying content-management feature to be represented through CMIS.
## Reusable Validation Foundation
Primary reusable validation candidate: Apache Chemistry OpenCMIS TCK and CMIS
Workbench.

OpenCMIS provides client libraries, server frameworks, development tools,
InMemory/FileShare reference repositories, and TCK artifacts. The project pages
now indicate that the project is retired, so we should treat OpenCMIS as a
frozen legacy compatibility-validation tool rather than an actively maintained
dependency. The Maven artifact
`org.apache.chemistry.opencmis:chemistry-opencmis-test-tck:1.1.0` remains
available and should be used as the first external conformance harness.

Practical strategy:
- Build local, deterministic example fixtures grouped by CMIS service
capability.
- Build internal contract tests that validate our mapper and profile behavior
without Java tooling.
- Add an optional external TCK harness that can run OpenCMIS TCK against a
running CMIS access point when Java/Maven are available.
- Keep TCK execution optional in the default Python suite to avoid turning the
engine into a Java project.
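
One way to keep the external harness optional is to skip it whenever the Java tooling is missing. The sketch below uses only the standard library; `tck_tooling_available`, `ExternalTckHarness`, and the `CMIS_TCK_COMMAND` environment variable are hypothetical names introduced for illustration, and the actual TCK invocation is left to local configuration rather than assumed here.

```python
import os
import shlex
import shutil
import subprocess
import unittest


def tck_tooling_available() -> bool:
    """Return True only when both `java` and `mvn` are found on PATH."""
    return all(shutil.which(tool) is not None for tool in ("java", "mvn"))


@unittest.skipUnless(tck_tooling_available(), "Java/Maven not found; OpenCMIS TCK skipped")
class ExternalTckHarness(unittest.TestCase):
    def test_tck_command_succeeds(self):
        # CMIS_TCK_COMMAND is a hypothetical variable holding the full,
        # locally configured TCK invocation; no real CLI is assumed here.
        command = os.environ.get("CMIS_TCK_COMMAND")
        if not command:
            self.skipTest("CMIS_TCK_COMMAND is not set")
        result = subprocess.run(shlex.split(command), capture_output=True, text=True)
        # A non-zero exit code is treated as a conformance failure.
        self.assertEqual(result.returncode, 0, msg=result.stdout + result.stderr)
```

With this shape, the default Python suite reports the TCK case as skipped on machines without Java or Maven, and only fails when the harness was actually run.
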
## Capability Assessment
| CMIS capability | Current engine availability | Gap | Implementation demand |
| --- | --- | --- | --- |
| Repository service | Service health/version, runtime repository state, capability catalogs. | Need CMIS repository info, repository IDs, root folder IDs, capability flags, type summaries. | Low |
| Type definitions | Asset classifications, metadata schemas, relationship target kinds. | Need CMIS base types, property definitions, type mutability flags, secondary type projection. | Medium |
| Navigation service | Relationships and context graph exist, but no folder tree model. | Need root folder, folder children, descendants/tree, parent relationships, path semantics. | High |
| Object service read | Assets, metadata, representations, content refs, audit, versions exist. | Need CMIS object envelopes, allowable actions, path/object-id lookup, property filters, rendition/content stream response shape. | Medium |
| Object service write | Asset create, metadata add, lifecycle transition, relationship create, ingestion. | Need createDocument/createFolder/updateProperties/deleteObject/moveObject mapping and CMIS change tokens. | High |
| Content streams | Source, normalized, derived representations store content hashes and storage refs. | Need getContentStream/setContentStream/deleteContentStream/appendContentStream semantics and streaming endpoints. | Medium-High |
| Versioning | Asset versions and transformation/workflow lineage exist. | Need CMIS checkout, PWC, checkin, cancelCheckout, version series semantics, latest/major flags. | High |
| Discovery/query | Governed retrieval, lexical search, filters, relationships. | Need CMIS SQL-like query grammar or supported subset, query result shape, joins/capability flags. | High |
| Relationships | Core relationships exist. | Need CMIS relationship object mapping and relationship type capability exposure. | Medium |
| ACL service | Policy gateway and authorization decisions exist. | Need CMIS ACL model, principals, direct/inherited ACEs, applyACL, exact capability flags. | High |
| Policy service | Policy decisions and governance reports exist. | Need CMIS policy objects/applyPolicy/removePolicy/getAppliedPolicies mapping or explicit unsupported profile. | Medium |
| Change log | Audit events and correlation IDs exist. | Need CMIS change events, change tokens, object change entries, paging. | Medium |
| Multi-filing/unfiling | Not modeled directly. | Need folder membership model or profile-level unsupported flags. | High if full support, Low if unsupported |
| Renditions | Representations exist, no rendition taxonomy. | Need rendition metadata and stream mapping for thumbnails/previews. | Medium |
| Retention and hold | Metadata/governance hooks exist, no first-class legal hold model. | Need retention/hold capabilities, apply/remove hold, retention date semantics. | High for full support |
| Bulk update | Metadata update pathways exist. | Need bulkUpdateProperties semantics, partial failure reporting, change tokens. | Medium |
| Browser JSON binding | FastAPI JSON service already exists. | Need CMIS Browser Binding routes, selectors/actions, multipart/content stream behavior. | High |
| AtomPub binding | No AtomPub/XML binding. | Need XML/Atom feed generation and protocol semantics. | Very High |
| Web Services binding | No SOAP stack. | Need WSDL/SOAP implementation. | Very High |
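
To make the "Object service read" row more concrete, the following is a minimal sketch of projecting an engine asset onto CMIS properties with a property filter. The `Asset` record, its field names, and the `kx:` type-ID prefix are assumptions made for illustration, not the engine's actual model. The `cmis:` property IDs, the `succinctProperties` envelope, and the millisecond timestamps follow the CMIS 1.1 Browser Binding.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Optional


@dataclass(frozen=True)
class Asset:
    """Hypothetical engine asset record; field names are assumptions."""
    asset_id: str
    title: str
    classification: str
    created_at: datetime
    content_length: Optional[int] = None
    mime_type: Optional[str] = None


def to_cmis_succinct(asset: Asset, property_filter: str = "*") -> dict[str, Any]:
    """Project an asset onto CMIS properties, honoring a comma-separated filter."""
    props: dict[str, Any] = {
        "cmis:objectId": asset.asset_id,
        "cmis:baseTypeId": "cmis:document",
        # "kx:" is an assumed prefix for engine-defined type IDs.
        "cmis:objectTypeId": f"kx:{asset.classification}",
        "cmis:name": asset.title,
        # The Browser Binding encodes DateTime as milliseconds since the epoch.
        "cmis:creationDate": int(asset.created_at.timestamp() * 1000),
        "cmis:contentStreamLength": asset.content_length,
        "cmis:contentStreamMimeType": asset.mime_type,
    }
    if property_filter.strip() != "*":
        wanted = {name.strip() for name in property_filter.split(",")}
        props = {key: value for key, value in props.items() if key in wanted}
    return {"succinctProperties": props}


example = Asset("a-1", "report.pdf", "document", datetime(1970, 1, 1, tzinfo=timezone.utc))
filtered = to_cmis_succinct(example, "cmis:objectId, cmis:name")
```

Allowable actions, change tokens, and rendition data are left out of this sketch; they would sit alongside the properties in the full object envelope.
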
## Recommended Compliance Profile Strategy
Start with a constrained CMIS 1.1 Browser Binding profile:
- Supported: repository, type, object read, content stream read, query subset,
relationships, change log, and navigation over a synthetic root/folder
projection.
- Explicitly unsupported or read-only: AtomPub, Web Services, full ACL mutation,
retention/hold, multifiling/unfiling, and full CMIS SQL joins.
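
A constrained profile like this is ultimately advertised through the repository capability flags. The sketch below shows one plausible set of values and a small validator; the specific values chosen (for example `metadataonly` for query and `discover` for ACL) are assumptions for illustration and not decisions recorded in this assessment. The capability names and enum values are those defined by CMIS 1.1. `capability_problems` is a hypothetical helper that returns an empty list when every flag holds a legal value.

```python
from typing import Any

# Allowed enum values per the CMIS 1.1 repository capability definitions.
CAPABILITY_ENUMS: dict[str, set[str]] = {
    "capabilityACL": {"none", "discover", "manage"},
    "capabilityChanges": {"none", "objectidsonly", "properties", "all"},
    "capabilityContentStreamUpdatability": {"none", "anytime", "pwconly"},
    "capabilityJoin": {"none", "inneronly", "innerandouter"},
    "capabilityQuery": {
        "none", "metadataonly", "fulltextonly", "bothseparate", "bothcombined",
    },
    "capabilityRenditions": {"none", "read"},
}

# Assumed flag values for the constrained profile; illustrative choices only.
CONSTRAINED_PROFILE: dict[str, Any] = {
    "capabilityACL": "discover",                    # read-only ACL, no applyACL
    "capabilityChanges": "objectidsonly",
    "capabilityContentStreamUpdatability": "none",  # content stream read only
    "capabilityJoin": "none",
    "capabilityQuery": "metadataonly",
    "capabilityRenditions": "none",
    "capabilityGetDescendants": True,
    "capabilityGetFolderTree": True,
    "capabilityMultifiling": False,
    "capabilityUnfiling": False,
    "capabilityVersionSpecificFiling": False,
}


def capability_problems(capabilities: dict[str, Any]) -> list[str]:
    """Return a message for every flag whose value is not legal for its kind."""
    problems = []
    for name, value in capabilities.items():
        allowed = CAPABILITY_ENUMS.get(name)
        if allowed is None:
            # Flags without an enum entry are treated as booleans in this sketch.
            if not isinstance(value, bool):
                problems.append(f"{name}: expected a boolean, got {value!r}")
        elif not isinstance(value, str) or value not in allowed:
            problems.append(f"{name}: {value!r} is not one of {sorted(allowed)}")
    return problems
```
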
Then expand by profile:
- `readonly-browser`: safe read-only repository and content access.
- `governed-authoring`: selected object creation/update/content stream changes
through engine policy and audit.
- `admin-export`: broad export and governance inspection, restricted to
service accounts.
- `compat-tck`: profile tuned to pass a selected OpenCMIS TCK capability subset.
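
The profile names above could drive a simple operation gate at the access point. This is a sketch under stated assumptions: the assignment of operations to profiles is illustrative, `require_operation` and `OperationNotSupported` are hypothetical names, and `compat-tck` is omitted because its operation set depends on the TCK subset that is eventually selected. The operation names themselves are standard CMIS 1.1 service operations.

```python
# Read operations assumed to be shared by every profile.
READ_OPERATIONS = frozenset({
    "getRepositoryInfo", "getTypeDefinition", "getObject",
    "getChildren", "getContentStream", "query",
})

# Illustrative mapping of profile name to permitted CMIS operations.
PROFILE_OPERATIONS: dict[str, frozenset[str]] = {
    "readonly-browser": READ_OPERATIONS,
    "governed-authoring": READ_OPERATIONS | {
        "createDocument", "updateProperties", "setContentStream",
    },
    "admin-export": READ_OPERATIONS | {
        "getContentChanges", "getACL", "getAppliedPolicies",
    },
}


class OperationNotSupported(Exception):
    """Hypothetical error, intended to surface as CMIS `notSupported`."""


def require_operation(profile: str, operation: str) -> None:
    """Raise unless the named profile permits the requested operation."""
    allowed = PROFILE_OPERATIONS.get(profile)
    if allowed is None:
        raise KeyError(f"unknown profile: {profile}")
    if operation not in allowed:
        raise OperationNotSupported(f"{operation} is not available in {profile}")
```

A gate of this kind would complement the engine's policy and audit checks rather than replace them; it only decides whether an operation is exposed at all under a given profile.
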
## Risk Summary
The engine already has strong foundations for asset identity, metadata,
representations, relationships, versions, audit, policy, retrieval, and
service APIs. The hard parts are not storage; they are CMIS protocol semantics:
folder/path behavior, versioning/PWC semantics, CMIS query grammar, ACL shape,
content stream actions, and binding-specific compatibility.

Best estimate of implementation effort, by scope:
- Internal mapper and examples: medium.
- Browser Binding MVP profile: medium-high.
- TCK subset harness: medium.
- Broad CMIS 1.1 Browser compliance: high.
- AtomPub and Web Services compliance: very high and probably not justified
until a real client demands those bindings.