markitect-main/markitect/core/parser.py
tegwick 9b12875681 feat(spaces): implement Phase 0-1 of Information Space Service
Phase 0 - Project Organization:
- Create docs/PROJECT_STRUCTURE.md documenting codebase layout
- Create markitect/core/ with parser, serializer, document_manager, workspace
- Create markitect/schema/ consolidating 6 schema_*.py modules
- Create markitect/storage/ with database module
- Maintain backward compatibility via re-exports from original locations
- Add docs/roadmap/information-space-service/ with README and WORKPLAN

Phase 1 - Foundation (Weeks 1-3):
- Week 1: Core domain models (InformationSpace, SpaceDocument, SpaceConfig,
  SpaceMetadata, SpaceVariable, TransclusionReference, SpaceStatus)
- Week 2: Repository layer with interfaces (ISpaceRepository,
  IDocumentAssociationRepository, IVariableRepository, IReferenceRepository)
  and SQLite implementations with foreign key cascade deletes
- Week 3: SpaceService orchestration layer with full CRUD, document,
  variable, and reference tracking operations

Test coverage: 124 tests (25 model + 63 repository + 36 integration)

Capabilities delivered:
- CAP-001: InformationSpace entity with lifecycle management
- CAP-002: SpaceRepository CRUD with SQLite backing
- CAP-003: Document-Space associations with path-based organization
- CAP-004: Space metadata and configuration schemas
- CAP-005: Database schema with migrations

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-08 02:02:46 +01:00


"""
Markdown AST Parser.

This module provides functionality to parse markdown content into an
Abstract Syntax Tree (AST) using the markdown-it-py library.
"""
from markdown_it import MarkdownIt


def parse_markdown_to_ast(md_content: str):
    """
    Parse markdown content into a JSON-serializable AST.

    Args:
        md_content: Markdown text to parse

    Returns:
        List of token dictionaries representing the AST

    Example:
        ast = parse_markdown_to_ast("# Hello\\n\\nWorld")
    """
    # Start from the CommonMark preset and enable the table rule on top of it
    md = MarkdownIt("commonmark", {"tables": True}).enable(['table'])
    tokens = md.parse(md_content)

    # Convert to a JSON-serializable list of dicts
    def token_to_dict(token):
        d = {
            'type': token.type,
            'tag': token.tag,
            'attrs': token.attrs,
            'map': token.map,
            'nesting': token.nesting,
            'level': token.level,
            # Inline tokens carry child tokens; convert them recursively
            'children': [token_to_dict(child) if child else None for child in token.children] if token.children else None,
            'content': token.content,
            'markup': token.markup,
            'info': token.info,
            'meta': token.meta,
            'block': token.block,
            'hidden': token.hidden
        }
        return {k: v for k, v in d.items() if v is not None}  # Remove None values

    return [token_to_dict(token) for token in tokens]
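
The list returned by `parse_markdown_to_ast` is a flat sequence of token dicts in which block openers such as `heading_open` are followed by an `inline` token holding the text. As a rough sketch of how a caller might consume that output, the snippet below collects headings from it. The helper name `extract_headings` and the `SAMPLE_AST` data are hypothetical and not part of the module; the sample is hand-written and abbreviated so the snippet runs without markdown-it-py installed.

```python
import json

# Hand-written, abbreviated stand-in for the token dicts that
# parse_markdown_to_ast("# Hello\n\nWorld") is expected to return.
# Real output carries more keys (markup, info, meta, block, hidden, ...).
SAMPLE_AST = [
    {"type": "heading_open", "tag": "h1", "map": [0, 1], "nesting": 1, "level": 0},
    {
        "type": "inline", "tag": "", "map": [0, 1], "nesting": 0, "level": 1,
        "content": "Hello",
        "children": [
            {"type": "text", "tag": "", "nesting": 0, "level": 0, "content": "Hello"},
        ],
    },
    {"type": "heading_close", "tag": "h1", "nesting": -1, "level": 0},
    {"type": "paragraph_open", "tag": "p", "map": [2, 3], "nesting": 1, "level": 0},
    {
        "type": "inline", "tag": "", "map": [2, 3], "nesting": 0, "level": 1,
        "content": "World",
        "children": [
            {"type": "text", "tag": "", "nesting": 0, "level": 0, "content": "World"},
        ],
    },
    {"type": "paragraph_close", "tag": "p", "nesting": -1, "level": 0},
]


def extract_headings(ast):
    """Hypothetical helper: return (level, text) pairs for each heading token."""
    headings = []
    for index, token in enumerate(ast):
        if token.get("type") != "heading_open":
            continue
        # The tag is "h1".."h6"; the digit is the heading level.
        level = int(token["tag"][1:])
        # The heading text lives in the inline token that follows the opener.
        following = ast[index + 1] if index + 1 < len(ast) else {}
        text = following.get("content", "") if following.get("type") == "inline" else ""
        headings.append((level, text))
    return headings


if __name__ == "__main__":
    print(extract_headings(SAMPLE_AST))
    # The dicts contain only plain JSON types, so they serialize directly.
    print(json.dumps(SAMPLE_AST[0]))
```

With the real parser, the same helper would be called as `extract_headings(parse_markdown_to_ast(text))`, assuming the token shapes match the sample above.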