MarkiTectCli
Bernd Worsch edited this page 2025-09-21 21:07:00 +00:00

Guidance

Abstractor: MarkiTect Python CommandLineInterface

We will build a command-line tool that makes the MarkiTect Python library convenient to use. Here's a breakdown of the technologies to use and how they map to our requirements.

Core Technologies

Database: For a temporary database, SQLite is the best choice. It ships with Python as the built-in sqlite3 module and needs no separate server process. SQLite is normally file-based, but connecting to the special name :memory: keeps the whole database in RAM, which suits a throwaway store.
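As a minimal sketch of the in-memory setup, the snippet below creates a table, stores an AST as JSON text, and reads it back. The table name markdown_file, its columns, and the sample AST are assumptions made for illustration.

```python
import json
import sqlite3

# ":memory:" keeps the database in RAM; it vanishes when the connection
# closes. Pass a file path instead to persist the data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE markdown_file ("
    " id INTEGER PRIMARY KEY,"
    " path TEXT UNIQUE NOT NULL,"
    " ast_json TEXT NOT NULL)"
)

# Hypothetical AST payload, stored as serialized JSON text.
ast = [{"type": "heading", "level": 1, "title": "Guidance"}]
with conn:  # commits on success, rolls back on an exception
    conn.execute(
        "INSERT INTO markdown_file (path, ast_json) VALUES (?, ?)",
        ("Guidance.md", json.dumps(ast)),
    )

row = conn.execute(
    "SELECT ast_json FROM markdown_file WHERE path = ?", ("Guidance.md",)
).fetchone()
loaded = json.loads(row[0])
```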

GraphQL: The graphene library is a powerful and widely used choice for building GraphQL APIs in Python. Graphene itself is storage-agnostic: you define your GraphQL schema and write resolvers that read from and write to your data store, here SQLite. Companion packages such as graphene-sqlalchemy can reduce the glue code if you use an ORM.

Markdown Parsing: The markdown-it-py library is a fast and extensible Markdown parser. It produces a flat token stream, which its SyntaxTreeNode helper can turn into a nested syntax tree (AST). You can process this AST to extract the structural information you need for your schema. Another option is commonmark.
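To make the parsing step concrete, here is a small sketch that parses Markdown with markdown-it-py and collects the headings. The helper names parse_markdown and list_headings are hypothetical; only MarkdownIt and SyntaxTreeNode come from the library.

```python
from markdown_it import MarkdownIt
from markdown_it.tree import SyntaxTreeNode


def parse_markdown(text: str) -> SyntaxTreeNode:
    """Parse Markdown text into a nested syntax tree."""
    tokens = MarkdownIt().parse(text)  # flat token stream
    return SyntaxTreeNode(tokens)      # tree built from the tokens


def list_headings(root: SyntaxTreeNode) -> list[tuple[int, str]]:
    """Return (level, title) for every heading in document order."""
    headings = []
    for node in root.walk():
        if node.type == "heading":
            level = int(node.tag[1:])  # tag is "h1" .. "h6"
            # A heading holds one "inline" child that carries its text.
            title = node.children[0].content if node.children else ""
            headings.append((level, title))
    return headings


sample = "# Guidance\n\nIntro text.\n\n## Core Technologies\n"
headings = list_headings(parse_markdown(sample))
```

A flat list of (level, title) pairs is often enough for structural checks; the full tree is still available when the body content matters.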

Schema Validation: You can use the jsonschema library to validate your generated ASTs against your defined schemas. This is a robust and standardized approach.

Thoughts about Architecture

Here's some orientation on how to structure the functionality:

  1. Database and ORM

Instead of a raw DB interface, using an Object-Relational Mapper (ORM) like SQLAlchemy is highly recommended. SQLAlchemy will allow you to define your data models (e.g., MarkdownFile, SchemaFile, and their AST content) as Python classes, which makes it easier to work with than raw SQL. This will also simplify the integration with your GraphQL layer.
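The models could look roughly like the sketch below. The column names and the choice of a JSON column for the AST are assumptions; MarkdownFile and SchemaFile are the model names suggested above.

```python
from sqlalchemy import JSON, Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class MarkdownFile(Base):
    __tablename__ = "markdown_file"
    id = Column(Integer, primary_key=True)
    path = Column(String, unique=True, nullable=False)
    ast = Column(JSON, nullable=False)  # SQLAlchemy serializes this to JSON


class SchemaFile(Base):
    __tablename__ = "schema_file"
    id = Column(Integer, primary_key=True)
    path = Column(String, unique=True, nullable=False)
    content = Column(JSON, nullable=False)


# In-memory SQLite engine; use "sqlite:///some_file.db" to persist.
engine = create_engine("sqlite:///:memory:")
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(
        MarkdownFile(path="Guidance.md", ast=[{"type": "heading", "level": 1}])
    )
    session.commit()
    stored = session.query(MarkdownFile).filter_by(path="Guidance.md").one()
    stored_ast = stored.ast
```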

  2. GraphQL RW Interface

    Models: Define graphene.ObjectType models that correspond to your SQLAlchemy classes. This will expose your database content through GraphQL.

    Queries: Declare fields on a root Query type (graphene.Field, graphene.List) and implement a resolve_<field> method for each one to read data from the database. You'll need queries to retrieve a single Markdown file's AST, a schema's content, or a list of all files.

    Mutations: Implement graphene.Mutation to handle write operations, such as adding a new Markdown file or a schema to the database.

  3. Read/Write Operations

    Read Markdown/Schema to DB:

     For a Markdown file, you'll use markdown-it-py to parse the file and generate a JSON representation of the AST. This JSON will then be stored in a dedicated field in your MarkdownFile table.
    
     For a Schema file, you'll simply read the JSON content and store it in your SchemaFile table.
    

    Write Markdown/Schema from DB:

     This is the reverse process: read the content from the database and write it to a file. A schema can be written out exactly as stored. For Markdown, you would need a separate function to convert the AST back into Markdown text; since the AST is the core focus of this project, that renderer may not be needed.
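For the schema side, the read and write directions might be sketched like this, using only the standard library. The function names import_schema and export_schema, and the schema_file table layout, are hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def import_schema(conn: sqlite3.Connection, path: Path) -> None:
    """Read a JSON Schema file and store its content in the database."""
    text = path.read_text(encoding="utf-8")
    json.loads(text)  # fail early if the file is not valid JSON
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO schema_file (name, content) VALUES (?, ?)",
            (path.name, text),
        )


def export_schema(conn: sqlite3.Connection, name: str, target: Path) -> None:
    """Write a stored schema back to a file, unchanged."""
    row = conn.execute(
        "SELECT content FROM schema_file WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no schema named {name!r}")
    target.write_text(row[0], encoding="utf-8")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE schema_file (name TEXT PRIMARY KEY, content TEXT NOT NULL)"
)

# Round trip through a temporary directory to show both directions.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "page.schema.json"
    source.write_text('{"type": "object"}', encoding="utf-8")
    import_schema(conn, source)
    copy = Path(tmp) / "copy.schema.json"
    export_schema(conn, source.name, copy)
    round_trip = json.loads(copy.read_text(encoding="utf-8"))
```

Markdown files would follow the same pattern, with the parse step inserted before the INSERT.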
    
  4. Schema Generation and Validation

    Generate Schema:

     This is a key part of your project. You'll write a function that takes a Markdown AST and an integer for the desired nesting depth.
    
     You'll traverse the AST and build a JSON Schema. For each heading whose level is within the specified depth (for depth 2, that means # and ##), you'll create a corresponding object in the schema with a title and properties to enforce the structure. Deeper headings are ignored.
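One possible shape for this function is sketched below. It assumes the AST has already been reduced to a flat list of (level, title) pairs, and it encodes each heading as a required object property named after the heading; both choices are assumptions, not a fixed design.

```python
def generate_schema(headings, max_depth):
    """Build a JSON Schema from (level, title) pairs up to max_depth."""
    root = {"type": "object", "title": "document"}
    stack = [(0, root)]  # (heading level, schema node); root never pops
    for level, title in headings:
        if level > max_depth:
            continue  # deeper headings are not part of the schema
        # Climb back up until the top of the stack is a true ancestor.
        while stack[-1][0] >= level:
            stack.pop()
        parent = stack[-1][1]
        props = parent.setdefault("properties", {})
        if title not in props:  # repeated titles share one entry
            props[title] = {"type": "object", "title": title}
            parent.setdefault("required", []).append(title)
        stack.append((level, props[title]))
    return root


headings = [
    (1, "Guidance"),
    (2, "Core Technologies"),
    (3, "Database"),
    (2, "Thoughts about Architecture"),
]
schema = generate_schema(headings, max_depth=2)
```

With max_depth=2 the level-3 heading "Database" is skipped, so "Core Technologies" ends up as a leaf object without properties of its own.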
    

    Generate Markdown Stubs:

     This function will read a JSON Schema and build a new Markdown file based on it. It will add all the required headings and other structural elements from the schema, leaving empty content or a placeholder "stub" for each.
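A stub generator could be sketched as follows, assuming a schema in which each required property of an object stands for one heading and nested objects stand for sub-headings. That schema layout, the function names, and the "TODO" placeholder are illustrative assumptions.

```python
def _stub_lines(schema, level, placeholder):
    """Yield Markdown lines for the required headings of one schema node."""
    properties = schema.get("properties", {})
    for title in schema.get("required", []):
        yield "#" * level + " " + title
        yield ""
        yield placeholder
        yield ""
        # Recurse into the sub-schema to emit nested headings.
        yield from _stub_lines(properties.get(title, {}), level + 1, placeholder)


def generate_stub(schema, placeholder="TODO"):
    """Return Markdown text with one stubbed section per required heading."""
    return "\n".join(_stub_lines(schema, 1, placeholder))


schema = {
    "type": "object",
    "properties": {
        "Guidance": {
            "type": "object",
            "properties": {"Core Technologies": {"type": "object"}},
            "required": ["Core Technologies"],
        }
    },
    "required": ["Guidance"],
}
stub = generate_stub(schema)
```

For the sample schema this yields a "# Guidance" section followed by a "## Core Technologies" section, each containing only the placeholder.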
    

    Check if Markdown matches Schema:

     First, you'll generate the AST from the Markdown file.
    
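     Then, you'll use a jsonschema validator to check the AST against the JSON Schema you've loaded from the database. Note that jsonschema.validate() raises a ValidationError on the first problem rather than returning a result; to get a boolean plus a list of all errors, create a validator object and use its is_valid() and iter_errors() methods. This will be the core of your validation logic.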
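The check itself might look like the sketch below. It assumes the AST has been reduced to a nested mapping keyed by heading title (called an outline here) and that the schema marks expected headings as required properties; check_outline is a hypothetical helper, and Draft7Validator is one of several validator classes the library provides.

```python
from jsonschema import Draft7Validator


def check_outline(outline: dict, schema: dict) -> tuple[bool, list[str]]:
    """Return (matches, error messages) for an outline against a schema."""
    validator = Draft7Validator(schema)
    errors = []
    for error in validator.iter_errors(outline):
        # absolute_path locates the failing element inside the outline.
        location = "/".join(str(part) for part in error.absolute_path)
        errors.append((location or "<root>") + ": " + error.message)
    return (not errors, errors)


schema = {
    "type": "object",
    "properties": {
        "Guidance": {"type": "object", "required": ["Core Technologies"]}
    },
    "required": ["Guidance"],
}

ok, ok_errors = check_outline({"Guidance": {"Core Technologies": {}}}, schema)
bad, bad_errors = check_outline({"Guidance": {}}, schema)
```

Collecting every error instead of stopping at the first one lets the CLI report all missing sections in a single run.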