Case Study: TeachClaw Deployment

One-Line Version

I built and deployed TeachClaw, a real AI assistant for UK teachers that turns chat messages into teaching artifacts, feedback, and marking support through a live OpenClaw runtime.

Problem

Teachers do not need another dashboard. They need low-friction help inside the workflow they already use: quick resource generation, feedback, marking, and next-lesson planning without fighting a new product surface.

System

TeachClaw is a chat-native AI teaching system across Telegram and browser/site surfaces. It uses OpenClaw as the runtime layer, with task-routing plugins, deterministic builder scripts, and isolated teacher VPS deployments.
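The task-routing layer can be sketched as a small dispatch table. This is an illustrative sketch only: OpenClaw's actual plugin interface is not shown in this document, so the route names and keyword rules below are assumptions, not the real implementation.

```python
import re
from typing import Dict

# Hypothetical routing table mapping teacher messages to task routes.
# Real routing in TeachClaw runs inside an OpenClaw plugin; the patterns
# and route names here are illustrative assumptions.
ROUTES: Dict[str, re.Pattern] = {
    "lesson":    re.compile(r"\b(lesson|deck|slides?)\b", re.I),
    "worksheet": re.compile(r"\bworksheets?\b", re.I),
    "marking":   re.compile(r"\b(mark(ing)?|grade|feedback)\b", re.I),
}

def route_message(text: str) -> str:
    """Map a free-form teacher message to a task route; fall back to chat."""
    for name, pattern in ROUTES.items():
        if pattern.search(text):
            return name
    return "chat"
```

A message like "Can you build a worksheet on fractions?" would land on the worksheet route, while anything unmatched falls through to ordinary chat.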

Main user-facing capabilities:

Architecture

```mermaid
flowchart TD
    Teacher["Teacher via Telegram or Browser"] --> Gateway["OpenClaw Gateway"]
    Gateway --> Context["Agent Context: AGENTS, SOUL, USER, MEMORY"]
    Gateway --> Router["Oak Content Plugin / Task Router"]
    Router --> Lesson["Lesson / Deck Route"]
    Router --> Worksheet["Worksheet Route"]
    Router --> Marking["Marking Router"]
    Router --> Memory["Teacher Memory Adapter"]
    Lesson --> PPT["build-pptx.py"]
    Worksheet --> DOCX["build-docx.py"]
    Marking --> Pipeline["OCR / Transcript / Judgement / Report"]
    Memory --> Audit["Memory Events + Scoped Cards"]
    PPT --> Delivery["File Delivery"]
    DOCX --> Delivery
    Pipeline --> Delivery
    Delivery --> Teacher
```

What I Owned

Hard Problems

Runtime Drift

TeachClaw has several layers of truth: repo source, runtime contract mirror, local gateway payload, and live teacher VPS payload. A bug can appear fixed locally while the gateway still loads stale code. I handled this by verifying which layer was actually loaded before declaring a fix complete.
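The "check the loaded layer" habit can be sketched as a hash comparison between the repo copy of a module and the file the running interpreter actually imported. The module name and paths here are hypothetical; the point is only that a fix is not real until the loaded bytes match the repo bytes.

```python
import hashlib
import importlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Content hash of a file on disk."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def loaded_matches_repo(module_name: str, repo_path: Path) -> bool:
    """True only if the imported module's file is byte-identical to the repo copy.

    Guards against runtime drift: a gateway process can keep serving stale
    code even after the repo source has been patched.
    """
    module = importlib.import_module(module_name)
    loaded_path = Path(module.__file__)
    return sha256_of(loaded_path) == sha256_of(repo_path)
```

In a deployment check, a mismatch means the gateway (or a teacher VPS) is still running stale code and the "fix" has not actually landed.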

Artifact Quality

For teaching artifacts, schema-valid output is not enough. A .pptx can pass mechanical checks but still be pedagogically weak. The validation model separates runtime success from teacher-facing quality judgement.
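The two-tier validation idea can be sketched as a report that keeps mechanical validity and quality judgement as separate signals. The field names and the threshold are illustrative assumptions, not TeachClaw's actual schema.

```python
from dataclasses import dataclass

@dataclass
class ArtifactReport:
    """Separates runtime success from teacher-facing quality judgement."""
    schema_valid: bool        # did the .pptx/.docx pass mechanical checks?
    quality_score: float      # 0-1 judgement from an eval pass or reviewer
    quality_threshold: float = 0.7  # illustrative cut-off, not a real value

    @property
    def runtime_success(self) -> bool:
        return self.schema_valid

    @property
    def teacher_ready(self) -> bool:
        # A schema-valid but pedagogically weak artifact still fails here.
        return self.schema_valid and self.quality_score >= self.quality_threshold
```

The useful property is that a deck can report `runtime_success=True` and `teacher_ready=False` at the same time, which is exactly the failure mode that schema checks alone miss.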

Deployment Safety

Live teacher lanes are pinned and guarded. Risky changes go through local tests, then a local test gateway, then agentic evals, and only then a guarded live smoke run when explicitly approved.
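The promotion ladder above can be sketched as an ordered list of gates where a change advances only as far as its first failure, and the live gate additionally requires explicit approval. Gate names mirror the prose; the result-passing shape is an assumption.

```python
# Ordered promotion gates; a change must clear each before the next.
GATES = ["local_tests", "local_test_gateway", "agentic_evals", "live_smoke"]

def promote(results: dict, live_approved: bool = False) -> str:
    """Return the furthest gate a change has cleared.

    `results` maps gate name -> bool. The live smoke gate is skipped
    entirely unless it has been explicitly approved, mirroring the
    guarded-live-lane rule.
    """
    cleared = "none"
    for gate in GATES:
        if gate == "live_smoke" and not live_approved:
            break  # never touch a live teacher lane without approval
        if not results.get(gate, False):
            break  # stop at the first failing gate
        cleared = gate
    return cleared
```

Even a fully green change stops at agentic evals unless someone approves the live smoke run, which keeps the guarded lanes safe by default.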

Proof

Deployment Lessons

Role Relevance

This maps directly to AI Deployment / Forward Deployed Engineering: