
@possibilities
Created March 11, 2026 16:03
codexctl render-markdown example — Codex design discussion rendered as markdown
session-id:   019cd963-cb29-7cf2-becb-4777b249ee70
backend:      codex
model:        gpt-5.4
cli-version:  0.113.0
cwd:          /Users/mike/code/arthack
git-branch:   main

User

Design Challenge: Dynamic Context Injection System for Claude Code

Background

I have a polyglot monorepo (Python CLIs + JS/TS apps) with a Claude Code plugin ecosystem. Claude Code supports hooks that fire at different lifecycle points and can inject text into the agent's context. The current system uses:

  • Command templates (Jinja2 .md.tmpl files) that render to slash commands. These compose via partials and variants (e.g., /develop includes a worker partial with all CLI docs, /develop-web adds web-specific context)
  • Skills (SKILL.md files) that agents can load on-demand for multi-phase workflows
  • Hooks that fire on SessionStart, UserPromptSubmit, PreToolUse, PostToolUse, Stop, etc.
  • knowctl - semantic search over 60+ versioned documentation topics
  • claudectl - search past Claude Code sessions
  • ~20 CLIs each with --help and --agent-help text

The Problem

Right now, command templates are HUGE monolithic prompts (the /develop command alone is ~4000 tokens). They frontload everything because there's no dynamic system to inject the right context at the right time. We need a snippet-based system that:

  1. Composes customized advice per prompt - When a user submits a prompt, analyze it and inject only the relevant snippets
  2. Streams advice during conversation - As the agent works, inject more context via hook calls (PostToolUse, PreToolUse) based on what it's doing
  3. Scales across many dimensions - CLI help text, knowctl topics, code context, session history, project conventions
  4. Supports graduated disclosure - For a knowctl topic, maybe just show the name; or the name + summary; or full --agent-help; depending on relevance
  5. Uses LLM evaluation for non-obvious matches - Some snippets trigger on file patterns, but others need semantic evaluation
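To make requirements 1 and 4 concrete, a snippet record could carry its triggers, priority, and per-level bodies together. A minimal sketch in Python (all field names and the token heuristic are assumptions, not an existing schema):

```python
# Hypothetical snippet record -- field names are illustrative, not an existing schema.
SNIPPET = {
    "id": "python-lint-ruff",
    "events": ["UserPromptSubmit", "PostToolUse"],
    "priority": 70,
    "triggers": {
        "keywords": ["lint", "ruff", "format"],
        "file_globs": ["*.py"],
    },
    "levels": {  # graduated disclosure: name-only through full help text
        "compact": "Run `ruff check` after editing Python files.",
        "standard": "Run `ruff check --fix` after editing Python files; CI fails on lint errors.",
        "full": "Ruff is the linter for this repo; see its --agent-help for rule configuration.",
    },
}

def pick_level(snippet: dict, score: float, remaining_tokens: int) -> str:
    """Choose the richest disclosure level that relevance and budget allow."""
    min_score = {"full": 0.8, "standard": 0.5, "compact": 0.0}
    for level in ("full", "standard", "compact"):
        body = snippet["levels"][level]
        # Rough token estimate: ~4 characters per token.
        if score >= min_score[level] and len(body) // 4 <= remaining_tokens:
            return level
    return "compact"  # fall back to the cheapest level
```

The same record serves all detail levels, so the composer can downgrade rather than drop a snippet when the budget tightens.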

Injection Points (Claude Code Hook Events)

  • SessionStart: General advice, project setup
  • UserPromptSubmit: The big one - analyze the user's prompt and compose a custom context payload
  • PostToolUse: After any tool call, can inject advice (e.g., after writing a .py file, remind about ruff)
  • PreToolUse: Before tool execution, can modify or advise
  • Stop: Before the agent stops, can inject final checklist
  • PreCompact: Before context compression, can preserve critical info

Key Constraint

Hook scripts must be fast. UserPromptSubmit hooks block the prompt from reaching the agent. So we need:

  • Fast keyword/pattern matching for obvious triggers
  • Optional async LLM evaluation for semantic matching (maybe pre-computed at session start or on a queue)
  • Pre-rendered snippet libraries that can be composed quickly
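The fast keyword/pattern path can be a table of regexes compiled once at startup, so the blocking UserPromptSubmit check does no I/O. A sketch (snippet ids and patterns are invented for illustration):

```python
import re

# Hypothetical trigger table: snippet id -> precompiled regex alternation.
TRIGGERS = {
    "ruff-advice": re.compile(r"\b(lint|ruff|format)\b", re.IGNORECASE),
    "knowctl-json-render": re.compile(r"\bjson-render\b", re.IGNORECASE),
}

def match_fast(prompt: str) -> list[str]:
    """Blocking-path matcher: pure regex, no I/O, microseconds per prompt."""
    return [sid for sid, pattern in TRIGGERS.items() if pattern.search(prompt)]
```

Anything this stage cannot decide deterministically would be queued for the async semantic pass rather than evaluated inline.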

The Context Tool (Not Yet Built)

A tool modeled after Repoprompt that analyzes a codebase and produces a compact context representation. This would:

  • Identify key files, patterns, and architecture
  • Run early (parallel with other analysis) since its output informs everything else
  • Produce structured output (file tree, key symbols, dependency graph fragments)
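The cheapest version of that structured output can be derived from a file listing alone, before any symbol extraction. A sketch of deterministic repo facts (the output shape is an assumption, not Repoprompt's format):

```python
from collections import Counter
from pathlib import PurePosixPath

def scan_repo(paths: list[str]) -> dict:
    """Derive compact, deterministic repo facts from a file listing alone."""
    suffix_counts = Counter(PurePosixPath(p).suffix for p in paths
                            if PurePosixPath(p).suffix)
    return {
        "languages": sorted(s.lstrip(".") for s in suffix_counts),
        "top_dirs": sorted({PurePosixPath(p).parts[0] for p in paths if "/" in p}),
        "file_count": len(paths),
    }
```

Because the output is sorted and derived only from paths, repeated runs on an unchanged tree produce byte-identical results, which keeps downstream snippet-selection caches valid.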

What I Need From You

Design a complete system architecture for this. Think about:

  1. Snippet format - What does a snippet definition look like? YAML? What metadata (triggers, priority, disclosure levels, dependencies)?
  2. Trigger system - How do we decide which snippets fire? File patterns, keyword matching, semantic similarity, LLM evaluation?
  3. Composition pipeline - How do snippets get assembled into a final payload? Token budgets, priority ordering, dedup?
  4. Lifecycle - What happens at each hook point? What's the flow from SessionStart through multiple UserPromptSubmit events?
  5. The context tool integration - How does repo context feed into snippet selection?
  6. LLM-in-the-loop - Where does LLM evaluation fit without adding too much latency?
  7. Configuration - Directory structure, config files, how to add new snippets
  8. Graduated disclosure - How to serve the same content at different detail levels

Consider using json-render (a Vercel framework for generating UI from JSON specs) as an "assembly language" for composing snippet payloads.

Be concrete. Show example YAML configs, directory structures, and pseudocode for the pipeline. Think about what existing tools (knowctl, promptctl, etc.) can be reused vs what's new.

Thought (0 lines)

Shell pwd && rg --files -g 'AGE...aude/**/*.yaml' -g 'claude/**/*.yml' | sed 's#^#./#' (159 lines)

Shell sed -n '1,220p' claude/plugins/arthack/commands/develop.md (226 lines)

Thought (0 lines)

Shell rg -n "promptctl|context ...|json-render|repoprompt|knowctl|claudectl" claude -S (226 lines)

Shell sed -n '1,220p' claude/plugins/arthack/hooks/hooks.json (62 lines)

Shell sed -n '1,220p' claude/pl...ills/create-hook/references/hook-events-reference.md (226 lines)

Shell sed -n '1,220p' claude/pl...skills/create-hook/references/hook-output-schemas.md (226 lines)

Thought (3 lines)

Shell rg --files | rg 'promptct...partials|skills/.*/SKILL\.md$|plugin\.json|manifest' (190 lines)

Shell sed -n '1,220p' claude/plugins/arthack/CLAUDE.md (17 lines)

Shell sed -n '1,220p' claude/plugins/CLAUDE.md (31 lines)

Thought (3 lines)

Shell sed -n '1,240p' claude/plugins/arthack/hooks/session_start.py (91 lines)

Shell sed -n '1,240p' apps/promptctl/promptctl/cli.py (86 lines)

Shell sed -n '1,260p' apps/promptctl/promptctl/helpers.py (146 lines)

Shell sed -n '1,260p' apps/claudectl/claudectl/run_watch_prompts.py (266 lines)

Thought (1 line)

Shell sed -n '1,220p' apps/promptctl/promptctl/run_render_prompt.py (33 lines)

Shell sed -n '1,260p' apps/knowctl/knowctl/run_semantic_search.py (85 lines)

Shell sed -n '1,260p' apps/claudectl/claudectl/run_semantic_search.py (75 lines)

Shell sed -n '1,220p' apps/knowctl/CLAUDE.md (28 lines)

Thought (3 lines)

Shell sed -n '1,240p' apps/knowctl/config/topics/json-render.yaml (22 lines)

Shell sed -n '1,240p' apps/knowctl/config/topics/repoprompt.yaml (47 lines)

Shell knowctl semantic-search json-render "overview architecture JSON spec" --limit 3 (6 lines)

Shell knowctl semantic-search r...ase context output structure tree symbols" --limit 3 (6 lines)

Thought (0 lines)

write_stdin (69 lines)

write_stdin (52 lines)

Thought (0 lines)

Shell knowctl show-content json-render-095 (6 lines)

Shell knowctl show-content json-render-098 (6 lines)

Shell knowctl show-content repoprompt-030 (6 lines)

Thought (0 lines)

write_stdin (209 lines)

write_stdin (305 lines)

write_stdin (162 lines)

Thought (3 lines)

Shell sed -n '1,220p' claude/pl...kills/create-hook/examples/session-context-loader.md (226 lines)

Shell sed -n '1,260p' claude/pl...create-hook/references/decision-control-reference.md (266 lines)

Shell sed -n '1,220p' apps/knowctl/knowctl/helpers.py (226 lines)

Shell sed -n '1,220p' apps/claudectl/claudectl/search_helpers.py (226 lines)

Thought (3 lines)

Shell sed -n '1,220p' claude/plugins/arthack/command-templates/develop.md.tmpl (23 lines)

Shell sed -n '1,260p' claude/pl...s/arthack/command-templates/_partials/worker.md.tmpl (202 lines)

Shell sed -n '1,260p' claude/pl.../arthack/command-templates/_partials/develop.md.tmpl (202 lines)

Thought (3 lines)

Shell sed -n '1,220p' claude/pl...ls/create-skill/references/progressive-disclosure.md (226 lines)

Shell sed -n '1,220p' claude/pl.../skills/create-skill/references/context-detection.md (226 lines)

Shell sed -n '1,220p' claude/plugins/arthack/skills/compress-prompts/SKILL.md (226 lines)

Thought (0 lines)

Shell sed -n '1,220p' claude/plugins/claude/skills/create-hook/SKILL.md (226 lines)

Shell sed -n '1,220p' claude/plugins/arthack/skills/create-pipeline/SKILL.md (226 lines)

Thought (0 lines)

Shell find . -maxdepth 3 -type ...ath './claude/plugins/*/command-templates' \) | sort (10 lines)

Shell rg -n "promptctl|knowctl|...oprompt" apps config claude -g '!**/node_modules/**' (422 lines)

Thought (3 lines)

Shell sed -n '1,220p' apps/vectorctl/vectorctl/cli.py (76 lines)

Shell sed -n '1,220p' apps/cli_common/cli_common/vectorctl_client.py (167 lines)

Shell sed -n '1,200p' apps/claudectl/CLAUDE.md (49 lines)

Thought (57 lines)

Assistant

{
  "status": "planning_complete",
  "step_number": 6,
  "total_steps": 6,
  "metadata": {
    "objective": "Replace monolithic command templates with hook-driven, token-budgeted context injection.",
    "skills_used": [
      { "name": "claude:create-hook", "reason": "Hook event schemas, lifecycle semantics, and output constraints." },
      { "name": "compress-prompts", "reason": "Token-budget stratification, invariants, and graduated disclosure." }
    ],
    "inspected_files": [
      "[File: claude/plugins/arthack/command-templates/develop.md.tmpl]",
      "[File: claude/plugins/arthack/command-templates/_partials/worker.md.tmpl]",
      "[File: claude/plugins/arthack/command-templates/_partials/develop.md.tmpl]",
      "[File: claude/plugins/arthack/hooks/hooks.json]",
      "[File: claude/plugins/arthack/hooks/session_start.py]",
      "[File: claude/plugins/claude/skills/create-hook/SKILL.md]",
      "[File: claude/plugins/claude/skills/create-hook/references/hook-events-reference.md]",
      "[File: claude/plugins/claude/skills/create-hook/references/hook-output-schemas.md]",
      "[File: apps/promptctl/promptctl/helpers.py]",
      "[File: apps/knowctl/knowctl/helpers.py]",
      "[File: apps/claudectl/claudectl/search_helpers.py]",
      "[File: apps/cli_common/cli_common/vectorctl_client.py]"
    ],
    "proposed_architecture": {
      "runtime": "New apps/contextctl CLI/service; hooks stay as thin adapters that call contextctl hook --event <Event>.",
      "snippet_format": "YAML metadata plus Markdown/Jinja bodies with compact, standard, and full disclosure levels.",
      "assembly_ir": "Use json-render as an intermediate section/block spec, then flatten to markdown/plaintext for Claude hook additionalContext.",
      "state": "Compiled registry in cache, per-session injection state in ~/.local/state/contextctl/, vector-backed semantic indexes via existing vectorctl daemon.",
      "reuse": [
        "Reuse promptctl rendering and token counting for snippet compilation.",
        "Reuse vectorctl IPC/search path for semantic retrieval.",
        "Reuse knowctl and claudectl as content/session providers instead of rebuilding those stores."
      ]
    },
    "proposed_layout": [
      "apps/contextctl/",
      "config/context/policies/events.yaml",
      "config/context/providers.yaml",
      "config/context/snippets///snippet.yaml",
      "config/context/snippets///levels/{compact,standard,full}.md",
      "claude/plugins/arthack/hooks/context_session_start.py",
      "claude/plugins/arthack/hooks/context_user_prompt_submit.py",
      "claude/plugins/arthack/hooks/context_tool_lifecycle.py"
    ],
    "default_event_budgets": {
      "SessionStart": 400,
      "UserPromptSubmit": 1200,
      "PreToolUse": 250,
      "PostToolUse": 300,
      "Stop": 200,
      "PreCompact": 300
    },
    "latency_targets": {
      "UserPromptSubmit_fast_path_p95_ms": 250,
      "PreToolUse_p95_ms": 150,
      "async_semantic_followup_target_ms": 2000
    },
    "assumptions": [
      "A new local CLI can be added and invoked from hooks.",
      "Hook payloads must remain strings, so json-render is internal only."
    ]
  },
  "files_required_to_continue": {
    "needed": false,
    "files": [],
    "reason": "Relevant prompt, hook, and retrieval infrastructure was inspected."
  },
  "plan": [
    {
      "step_number": 1,
      "title": "Freeze the contextctl contract and snippet authoring model",
      "depends_on": [],
      "goal": "Define one runtime boundary, one snippet schema, and one policy format before any hook work.",
      "deliverables": [
        "CLI surface: compile, hook, scan-repo, dry-run, doctor.",
        "Snippet schema fields: id, family, events, triggers, priority, dependencies, conflicts, dedupe_key, cooldown, budgets, disclosure_levels, sources, render_hints.",
        "Authoring layout under config/context/snippets/ plus schema validation and token counting at compile time."
      ],
      "validation_gate": "A sample snippet set compiles into a normalized registry with per-level token counts and no schema ambiguities.",
      "alternatives": [
        "Keep extending promptctl/Jinja only; reject because runtime scoring becomes ad hoc.",
        "Skip json-render and concatenate markdown directly; acceptable fallback if IR complexity is not paying for itself."
      ],
      "risks": [
        { "risk": "Schema sprawl makes snippets hard to author.", "mitigation": "Keep v1 fields minimal and push exotic logic into providers/policies." }
      ],
      "next_actions": [
        "Write the schema and a compiler first.",
        "Create 3 reference snippets: CLI-help, knowctl-topic, repo-convention."
      ]
    },
    {
      "step_number": 2,
      "title": "Build a three-stage trigger and scoring engine",
      "depends_on": [1],
      "goal": "Select candidates fast on the blocking path and reserve heavier reasoning for gray-area matches.",
      "deliverables": [
        "Stage 1 deterministic rules: keywords, regexes, file globs, tool matchers, repo facets, changed-file matchers.",
        "Stage 2 vector recall over snippet summaries and provider summaries via vectorctl.",
        "Stage 3 optional LLM rerank only for uncertain top-k candidates, with cache keys based on prompt hash plus repo profile hash."
      ],
      "validation_gate": "Warm-cache UserPromptSubmit stays under target latency while returning stable top candidates on a replay set.",
      "alternatives": [
        "Keyword-only v1 for the first rollout.",
        "Pure LLM classification; reject for latency and cost."
      ],
      "risks": [
        { "risk": "Vector or LLM scoring adds blocking latency.", "mitigation": "Fast path returns deterministic plus cached vector results; rerank runs only inside a confidence band or asynchronously." },
        { "risk": "False positives flood the agent.", "mitigation": "Family caps, confidence thresholds, and per-event cooldowns." }
      ],
      "next_actions": [
        "Define score weights and thresholds.",
        "Create a replay corpus from prior prompts and hook events."
      ]
    },
    {
      "step_number": 3,
      "title": "Add provider adapters and the repo context scanner",
      "depends_on": [1, 2],
      "goal": "Turn existing assets into fast, structured features that the scorer can consume.",
      "deliverables": [
        "Provider adapters for CLI --agent-help, knowctl topic metadata/snippets, claudectl session matches, and project conventions.",
        "New repo scanner that emits compact JSON: file tree, key configs, package graph, symbol stubs, test/lint commands, language facets.",
        "Provider cache invalidation keyed by git HEAD plus file fingerprints."
      ],
      "validation_gate": "Scanner output is deterministic across repeated runs and updates only when relevant repo facts change.",
      "alternatives": [
        "Shell out to RepoPrompt as an external spike.",
        "Start without symbol extraction and add AST/codemap support later."
      ],
      "risks": [
        { "risk": "Large monorepos make repo scanning too slow.", "mitigation": "Incremental scans, depth limits, and language-specific summarizers." }
      ],
      "next_actions": [
        "Ship repo scanner with cheap facts first: tree, configs, package manifests, task commands.",
        "Treat AST codemaps as a second pass."
      ]
    },
    {
      "step_number": 4,
      "title": "Implement composition, dedupe, and graduated disclosure",
      "depends_on": [1, 2, 3],
      "goal": "Assemble the smallest useful payload for each event without repeating families or over-explaining.",
      "deliverables": [
        "Selection pipeline: expand dependencies, resolve conflicts, dedupe by dedupe_key, apply family caps, then fit to event budget.",
        "Disclosure chooser that upgrades or downgrades compact/standard/full based on score, novelty, and remaining tokens.",
        "json-render IR with section types like Callout, Checklist, Reference, ToolHint, then markdown/plaintext renderers for hooks."
      ],
      "validation_gate": "Snapshot tests show deterministic output ordering and hard budget enforcement for identical inputs.",
      "alternatives": [
        "Markdown-only builder if json-render adds runtime weight without enough structure benefit."
      ],
      "risks": [
        { "risk": "Payload oscillates between turns and feels noisy.", "mitigation": "Track previously injected snippets and use hysteresis/cooldowns before re-injecting or expanding detail." }
      ],
      "next_actions": [
        "Define the IR and renderer contract.",
        "Add golden tests for budget trimming and disclosure escalation."
      ]
    },
    {
      "step_number": 5,
      "title": "Wire the full hook lifecycle around shared session state",
      "depends_on": [2, 3, 4],
      "goal": "Make each hook a thin, event-specific view over the same engine and session ledger.",
      "deliverables": [
        "SessionStart warms provider caches, runs repo scan in parallel, and injects only base repo/setup snippets.",
        "UserPromptSubmit uses the fast path, records pending async enrichments, and appends the composed payload.",
        "PreToolUse/PostToolUse watch tool intent and file writes for reactive advice; Stop and PreCompact emit compact checklists/preserved facts.",
        "Per-session state tracks injected snippets, source hashes, open async jobs, last event fingerprints, and preserved compaction facts."
      ],
      "validation_gate": "Hook outputs match Claude schemas exactly, are idempotent on retries, and do not create reinjection loops.",
      "alternatives": [
        "Roll out only SessionStart plus UserPromptSubmit first, then add reactive hooks."
      ],
      "risks": [
        { "risk": "Blocking hooks degrade UX.", "mitigation": "Keep hooks thin, move heavy work behind caches/queues, and provide deterministic fallbacks when async work is pending." }
      ],
      "next_actions": [
        "Implement contextctl hook --event UserPromptSubmit in shadow mode first.",
        "Add shared session-state read/write helpers before multiple event handlers."
      ]
    },
    {
      "step_number": 6,
      "title": "Run shadow evaluation, then migrate /develop into snippets",
      "depends_on": [1, 2, 3, 4, 5],
      "goal": "Prove relevance and latency before replacing legacy prompts, then decompose the monolith without losing critical guidance.",
      "deliverables": [
        "Shadow-mode logs: fired candidates, scores, chosen disclosure levels, token counts, latency, and skipped snippets.",
        "Replay harness using claudectl session history plus prompt corpora for regression checks.",
        "Migration ledger that maps current /develop sections to snippet families and preserves invariants called out by the compression workflow."
      ],
      "validation_gate": "Shadow results show acceptable latency and relevance, and migrated snippets preserve the critical warnings/rules now embedded in /develop.",
      "alternatives": [
        "Keep legacy templates as a fallback command while the new engine runs in parallel."
      ],
      "risks": [
        { "risk": "Important guidance disappears during decomposition.", "mitigation": "Create an invariants ledger from the current prompt and require every removed section to map to a snippet, provider, or hook." }
      ],
      "next_actions": [
        "Start with log-only UserPromptSubmit plus PostToolUse evaluation.",
        "Migrate the highest-value /develop families first: CLI index, tool-selection rules, repo conventions, and reactive lint/test reminders."
      ]
    }
  ],
  "plan_summary": "Create a new apps/contextctl runtime and keep hooks thin. The system should compile YAML-plus-Markdown snippets into a normalized registry with token counts, disclosure levels, dependencies, conflicts, and event scopes. Retrieval should run in three stages: deterministic rules first, vector recall second via the existing vectorctl daemon, and optional cached LLM reranking only for ambiguous cases. Repo context should come from a new cached scanner that emits compact structured facts about the monorepo; existing knowctl, claudectl, and CLI --agent-help outputs become providers rather than being pasted wholesale into prompts.\n\nComposition should be budgeted per hook event, dedupe by family, and choose compact/standard/full detail levels based on score, novelty, and remaining tokens. Use json-render as an internal assembly language for sections and ordering, then flatten to markdown/plaintext because Claude hooks only accept string additionalContext fields. UserPromptSubmit must stay on a fast path; deeper semantic enrichment should be async and show up on later hook events or on the next turn.\n\nBiggest risks are hook latency, over-injection, and knowledge loss while decomposing /develop. Mitigate with precompiled registries, warm vector caches, family caps, cooldowns, shadow-mode logging, and an invariants ledger for every critical rule now living in the monolith. Immediate next actions: finalize the schema and event budgets, scaffold contextctl with compile and dry-run, implement a shadow-mode UserPromptSubmit hook, then add the repo scanner and replay harness before migrating prompt content."
}
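The composition step the plan describes (one snippet per family, fit to the event budget, downgrade disclosure instead of dropping) can be sketched greedily; the candidate records and budgets below are illustrative, not the proposed schema:

```python
def compose(candidates: list[dict], budget: int) -> list[dict]:
    """Greedy assembly: highest priority first, one snippet per family,
    downgrading disclosure level rather than dropping when budget is tight."""
    chosen, seen_families, used = [], set(), 0
    for snip in sorted(candidates, key=lambda s: -s["priority"]):
        if snip["family"] in seen_families:
            continue  # family cap: dedupe repeated advice
        for level in ("full", "standard", "compact"):
            cost = snip["tokens"][level]
            if used + cost <= budget:
                chosen.append({"id": snip["id"], "level": level})
                seen_families.add(snip["family"])
                used += cost
                break  # richest level that fits; else snippet is skipped
    return chosen
```

Because selection is a pure function of candidates plus budget, identical inputs yield identical payloads, which is what the plan's snapshot-test validation gate checks.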
