
@avargaskun
Last active March 9, 2026 17:47
ralph — an agentic loop infrastructure for Claude Code
`.gitignore`:

logs/
projects/*/.base-branch

Ralph — Phase-by-Phase Plan Executor

Automated loop that feeds Claude Code a plan and executes it one phase at a time.

Setup

Copy the files from this repo into a folder named ralph/ at the root of the repo where you want to use the loop.

Project structure

Each project lives in its own folder under ralph/projects/:

ralph/
├── loop.sh              # Shell script that runs the Claude Code loop
├── PROMPT.md            # Prompt template ({{DESIGN_FILE}}, {{PLAN_FILE}} are substituted)
├── DESIGN.md            # Guide for writing design documents
├── PLAN.md              # Guide for writing execution plans
├── REVIEW.md            # Guide for post-execution code reviews
├── ADDRESS.md           # Guide for fixing review findings
├── logs/                # Auto-created directory with per-iteration logs
└── projects/
    └── <project-name>/
        ├── design.md    # Design specification (created first during planning)
        ├── plan.md      # Execution plan with phases and checkboxes
        └── review.md    # Post-execution review (written after all phases complete)

SDLC Workflow

Ralph projects follow a four-phase lifecycle: Design → Execute → Review → Address. Each phase has a guide document and produces an artifact in the project folder.

1. Design (DESIGN.md → design.md)

Create the design specification first. Use ralph/DESIGN.md as the guide:

Create a design for <feature description>. Use @ralph/DESIGN.md as context.

The design captures the what and why — architecture, locking analysis, edge cases, and trade-offs. It ends with Open Questions for the user to resolve interactively. The design is ready when all questions are marked _Resolved_.

Output: ralph/projects/<name>/design.md

2. Plan & Execute (PLAN.md → plan.md)

Create the execution plan from the finalized design. Use ralph/PLAN.md as the guide:

Create a plan for @ralph/projects/<name>/design.md. Use @ralph/PLAN.md as context.

The plan captures the how — phased steps with checkboxes, code sketches, and test tasks. Then run it:

./ralph/loop.sh <project-name>             # Run the loop

The loop executes phases autonomously, recording observations and committing after each one.

Output: ralph/projects/<name>/plan.md (with all checkboxes marked and observations filled)

3. Review (REVIEW.md → review.md)

After all phases complete, request a review. Use ralph/REVIEW.md as the guide:

Review the changes performed as part of the @ralph/projects/<name>/plan.md.
Use @ralph/REVIEW.md as context.

The review reads the design, plan, observations, all changed files, and git diffs. It produces a verdict with findings categorized as Critical, Important, or Trivial, plus a design compliance checklist and test coverage assessment.

Output: ralph/projects/<name>/review.md

4. Address Findings (ADDRESS.md → updates review.md)

After the review, fix any findings that need attention. Use ralph/ADDRESS.md as the guide:

Fix the findings in @ralph/projects/<name>/review.md. Use @ralph/ADDRESS.md as context.

Claude reads each finding, implements the fix, builds to verify, and appends a > **Resolution:** blockquote to the finding in the review document — preserving the original finding text. Findings are processed sequentially to avoid file conflicts.

Output: Updated ralph/projects/<name>/review.md (with resolutions appended to each finding) + code fixes

When to skip steps

  • No design doc needed: Bug fixes, simple configuration changes, UI-only tweaks. Put a "Design Decisions" section directly in the plan.
  • No plan needed: One-off tasks that don't need phased execution.
  • Always review: Any project that used the automated loop should be reviewed before merging.

Setup

  1. Create your project under ralph/projects/<name>/ with at least design.md and plan.md.

  2. Run the loop:

    ./ralph/loop.sh <project>            # Run until all phases complete (max 50)
    ./ralph/loop.sh <project> 10         # Run at most 10 iterations

    Example:

    ./ralph/loop.sh partitioned-history-db

How it works

Branching

The loop manages a feature branch for each project:

  1. Start: Creates a ralph/<project-name> branch from the current branch
  2. During execution: Each phase is committed on the feature branch with a Phase N: prefix
  3. Completion: Merges back to the starting branch with --no-ff, preserving the full phase-by-phase history in git log --graph

If interrupted, re-run the same command — the loop detects the existing feature branch and resumes. The base branch is stored in ralph/projects/<name>/.base-branch (gitignored).

main ─────────────────────────────────●── merge commit ──▶
                                     ╱
ralph/my-project ── Phase 0 ── Phase 1 ── Phase 2

Each iteration

  1. The loop substitutes project paths into PROMPT.md and pipes it to Claude Code
  2. Claude reads the plan and finds the next unchecked phase
  3. Executes that single phase (code changes, builds, tests)
  4. Records discoveries in the Observations section of the plan
  5. Marks the phase complete and commits (prefixed with Phase N:)
  6. Outputs RALPH_PHASE_COMPLETE (loop continues) or RALPH_ALL_COMPLETE (loop stops)
  7. On RALPH_ALL_COMPLETE: merges the feature branch back with --no-ff

Safety

  • --dangerously-skip-permissions is required for autonomous operation
  • Max iterations acts as a safety net
  • Ctrl+C to abort at any time
  • Logs are saved per project and iteration for review

Ralph — Addressing Review Findings

Use this document as context when fixing findings from a completed review. This is the step between reviewing and merging — it turns review findings into code changes and records what was done.

How to Request

Point Claude at the completed review and ask it to address findings:

Fix the findings in @ralph/projects/<project-name>/review.md. Use @ralph/ADDRESS.md as context.

Claude will read the review, fix each finding, and update the review document with resolutions.

Instructions

1. Read the review

Read the review file (ralph/projects/<project-name>/review.md). Identify all findings under the Findings section (Critical, Important, and Trivial).

2. Read project conventions

Read the CLAUDE.md in the repository root and any relevant submodule roots before making changes.

3. Address findings sequentially

Process each finding one at a time, in severity order (Critical first, then Important, then Trivial). For each finding:

  1. Read the relevant code referenced in the finding.
  2. Implement the fix. Follow the review's suggested fix if one is provided, or use your judgment for the best approach.
  3. Build and test after each fix to verify nothing is broken: npm run gulp -- build --agent && npm run gulp -- run:unit --agent
  4. Update the review document — append a > **Resolution:** blockquote immediately after the finding's description, summarizing what was done.

4. Preserve the original finding text

Critical: When updating the review document, do NOT replace or remove the original finding description. The finding text (problem statement, code snippets, file references, suggested fix) must remain intact. Only add the resolution blockquote after it and append — **FIXED** to the finding's title line.

The result should look like:

**I1. Description of the finding** — **FIXED**

File: `Path/To/File.cs:42`

Original description of the problem, including any code snippets
and analysis from the reviewer...

> **Resolution:** Brief description of what was changed to fix the finding.

5. Findings that don't need fixing

Not every finding requires a code change. If a finding is acceptable as-is (especially Trivial findings), update the review with a resolution explaining why:

> **Resolution:** Accepted as-is — <reason>.

6. Use sub-agents for each finding

CRITICAL: To preserve context, use a separate sub-agent for each finding. Run sub-agents sequentially, not in parallel — they may modify the same files and will conflict otherwise.

Each sub-agent should:

  • Read the relevant source files
  • Implement the fix
  • Build to verify (when applicable)
  • Update the review document with the resolution blockquote

7. Commit when done

After all findings are addressed, commit the changes with a message like:

chore: Address review findings for <project-name>

What This Step Is NOT

  • Not a second review. The goal is to fix what was found, not to re-review the entire codebase. Don't go looking for new issues.
  • Not optional for Critical/Important findings. Critical findings must be fixed before merging. Important findings should be fixed. Trivial findings are at the user's discretion.
  • Not a rewrite. Make minimal, targeted changes that address each finding. Don't refactor surrounding code.

Ralph — Design Document Guide

Use this document as context when creating a new design specification for a Ralph project.

Design File Location

ralph/projects/<project-name>/design.md

The design document is written before the execution plan. It captures the what and why — architecture, trade-offs, and resolved questions. The plan (plan.md) captures the how — phased steps to implement the design.

Document Structure

1. Header

# <Project Title> — Design Specification

> **Status:** Draft
> **Date:** YYYY-MM-DD
> **Predecessor:** [link to prior design or analysis doc if applicable]

2. Goal

One to three sentences stating what this design achieves and why it matters. Focus on the user-visible or system-level outcome, not implementation details.

3. Current State

Describe the relevant parts of the system as they exist today. Include:

  • What exists: Tables, classes, configurations, data flows involved.
  • Problems: Numbered list of concrete issues this design solves. Each problem should be observable (e.g., "VACUUM causes 30s+ lock holds") rather than abstract (e.g., "the architecture is suboptimal").

Use tables for structured comparisons (e.g., current vs. new settings, table schemas, access patterns).

4. Design Sections

The core of the document. Organize by feature, component, or concern — whatever makes the design easiest to follow. Common patterns:

  • One section per feature when the design covers multiple related changes.
  • One section per component when building a new subsystem (e.g., PartitionManager, PartitionedConnection).
  • One section per concern when cross-cutting topics need dedicated treatment (e.g., locking, migration, startup/shutdown).

Each section should include enough detail for an agent to implement without guessing intent. Specifically:

  • Data flow or lifecycle diagrams in pseudocode or ASCII. Show the sequence of operations, lock acquisitions, and state transitions.
  • Locking/concurrency analysis when the change touches shared state. State which locks are needed, why, and the ordering. Explicitly call out locks that are not needed and why.
  • Edge cases and error handling. What happens when a file is corrupt? When a connection is null? When two operations race?
  • Code sketches for non-obvious implementations — class signatures, SQL statements, PRAGMA sequences. The agent can deviate but needs a starting point.
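
As an illustration, a lock dance for a hypothetical partition-cleanup operation might be sketched in pseudocode like this (the operation and lock names are illustrative, not prescribed):

```
CleanupExpiredPartitions():
  acquire partitionQueryLock       # blocks new queries during the swap
    close read-only connection
    delete expired partition files
    rebuild ATTACH list
  release partitionQueryLock
  # dbWriteLock NOT needed: writes only ever touch the live partition
```

A sketch at this level tells the agent the ordering and the explicit non-requirements without constraining the implementation.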

5. Interaction with Existing Code

When the design modifies existing components, document:

  • What changes in each affected file/method.
  • What stays the same — explicitly noting unchanged behavior prevents the agent from accidentally refactoring working code.
  • Migration path if there's a transition from old to new behavior (data migration, setting migration, rollback).

6. Files Changed

Summary table of all files that will be created or modified:

## Files Changed

| File | Change |
|------|--------|
| `Plugin/Data/Database.cs` | Add new method, update existing method |
| `Plugin/Data/NewClass.cs` | **New file.** Description of purpose |

7. Open Questions

After the first draft, list questions and concerns that need the user's input before the design is finalized. These drive an interactive conversation — the user resolves them, and resolutions are recorded inline.

Use this format:

## Open Questions

1. **Should we migrate existing data?**
   _Resolved:_ No — both tables are ephemeral. Accept data loss on upgrade.

2. **WAL files for read-only partitions?**
   _Resolved:_ Set `PRAGMA journal_mode=OFF` after ATTACH. Safe because
   SQLite replays any unreconciled WAL during the ATTACH operation.

3. **Should the retry limit be configurable?**
   _(Open)_

Guidelines for generating good questions:

  • Surface trade-offs. When there are multiple valid approaches, present the options concisely and ask which the user prefers rather than choosing silently.
  • Flag data loss or breaking changes. Anything that could lose user data or break backward compatibility deserves an explicit question.
  • Call out assumptions. If the design assumes something about behavior, performance, or user expectations that hasn't been confirmed, ask.
  • Don't ask what you can answer. Questions about implementation mechanics (e.g., "does SQLite support X?") should be researched and answered in the design, not deferred to the user.

The design is ready for plan creation when all questions are marked _Resolved_ and their resolutions are reflected in the design sections above.

Writing Guidelines

  • Be specific, not abstract. "Acquire partitionQueryLock, close the connection, delete expired files, rebuild" is better than "clean up and refresh."
  • Show the lock dance. For any operation that touches shared state, write out the lock acquire/release sequence. This is the most common source of bugs and the hardest thing to reconstruct from code alone.
  • Name things early. Decide on method names, property names, and file names in the design. The plan and the agent will use these names directly.
  • Document what you decided not to do. If you considered an approach and rejected it, a brief note prevents the agent from rediscovering and re-evaluating it.
  • Use tables for comparisons. Current vs. new, option A vs. option B, before vs. after — tables make these scannable.
  • Keep the audience in mind. The primary reader is a code agent executing one phase at a time with no memory between iterations. Everything it needs to make correct architectural decisions must be in this document or the plan's observations.

Scope Calibration

Not every feature needs a full design document. Use these guidelines:

  • Full design doc: New subsystems, cross-cutting changes (locking, data model, lifecycle), changes touching 5+ files, anything with concurrency implications.
  • Design section in plan.md: Smaller features where a "Design Decisions" section in the plan provides enough context. The PLAN.md guide describes this option.
  • No design doc: Bug fixes, simple configuration changes, UI-only tweaks with no architectural decisions.
#!/bin/bash
# Ralph Loop — Phase-by-phase plan executor for Claude Code
#
# Usage: ./ralph/loop.sh <project> [max_iterations]
# Examples:
#   ./ralph/loop.sh partitioned-history-db       # Run until all phases complete
#   ./ralph/loop.sh partitioned-history-db 10    # Run at most 10 iterations
#
# Project structure:
# ralph/projects/<project>/design.md — design specification
# ralph/projects/<project>/plan.md — execution plan with phases
# ralph/projects/<project>/review.md — post-execution review (created later)
#
# Branching:
# The loop creates a feature branch (ralph/<project>) and commits each phase
# there. When all phases complete, the branch is merged back to the starting
# branch with --no-ff to preserve the phase-by-phase commit history.
#
# Prerequisites:
# - Claude Code CLI installed and authenticated
# - Project directory exists with at least design.md and plan.md
# - Run from the repository root directory
#
# Safety:
# - Uses --dangerously-skip-permissions (runs in sandbox)
# - Max iterations as a safety net
# - Ctrl+C to abort at any time
set -euo pipefail
# ── Arguments ─────────────────────────────────────────────────
if [ $# -lt 1 ]; then
  echo "Usage: ./ralph/loop.sh <project> [max_iterations]"
  echo " project: Name of the project folder under ralph/projects/"
  echo " Example: ./ralph/loop.sh partitioned-history-db"
  exit 1
fi
PROJECT_NAME="$1"
MAX_ITERATIONS=${2:-50}
# ── Configuration ──────────────────────────────────────────────
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
REPO_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
PROMPT_TEMPLATE="$SCRIPT_DIR/PROMPT.md"
PROJECT_DIR="$SCRIPT_DIR/projects/$PROJECT_NAME"
DESIGN_FILE="$PROJECT_DIR/design.md"
PLAN_FILE="$PROJECT_DIR/plan.md"
LOG_DIR="$SCRIPT_DIR/logs"
SLEEP_BETWEEN=5 # seconds between iterations
# Paths relative to repo root (for the prompt)
REL_DESIGN="ralph/projects/$PROJECT_NAME/design.md"
REL_PLAN="ralph/projects/$PROJECT_NAME/plan.md"
# Exit signals the agent will use
SIGNAL_PHASE_DONE="RALPH_PHASE_COMPLETE"
SIGNAL_ALL_DONE="RALPH_ALL_COMPLETE"
# Branching
FEATURE_BRANCH="ralph/$PROJECT_NAME"
BASE_BRANCH_FILE="$PROJECT_DIR/.base-branch"
# ── Preflight checks ──────────────────────────────────────────
if [ ! -f "$PROMPT_TEMPLATE" ]; then
  echo "Error: Prompt template not found at $PROMPT_TEMPLATE"
  exit 1
fi

if [ ! -d "$PROJECT_DIR" ]; then
  echo "Error: Project directory not found at $PROJECT_DIR"
  echo "Create it with design.md and plan.md first."
  exit 1
fi

if [ ! -f "$PLAN_FILE" ]; then
  echo "Error: Plan file not found at $PLAN_FILE"
  echo "Create your plan first, then re-run."
  exit 1
fi

if [ ! -f "$DESIGN_FILE" ]; then
  echo "Error: Design file not found at $DESIGN_FILE"
  exit 1
fi

if ! command -v claude &> /dev/null; then
  echo "Error: 'claude' CLI not found. Install Claude Code first."
  exit 1
fi
# ── Feature branch setup ─────────────────────────────────────
CURRENT_BRANCH=$(git -C "$REPO_DIR" branch --show-current)
if [ "$CURRENT_BRANCH" = "$FEATURE_BRANCH" ]; then
  # Already on the feature branch — resuming a previous run
  if [ ! -f "$BASE_BRANCH_FILE" ]; then
    echo "Error: On branch '$FEATURE_BRANCH' but .base-branch state file is missing."
    echo "Cannot determine which branch to merge back to."
    echo "Create the file manually: echo '<base-branch-name>' > $BASE_BRANCH_FILE"
    exit 1
  fi
  BASE_BRANCH=$(cat "$BASE_BRANCH_FILE")
  echo "Resuming on feature branch: $FEATURE_BRANCH (base: $BASE_BRANCH)"
elif git -C "$REPO_DIR" rev-parse --verify "$FEATURE_BRANCH" &>/dev/null; then
  # Feature branch exists but we're not on it
  echo "Error: Feature branch '$FEATURE_BRANCH' already exists but you are on '$CURRENT_BRANCH'."
  echo ""
  echo " To resume: git checkout $FEATURE_BRANCH"
  echo " To start over: git branch -D $FEATURE_BRANCH"
  exit 1
else
  # Fresh start — create the feature branch
  BASE_BRANCH="$CURRENT_BRANCH"
  echo "$BASE_BRANCH" > "$BASE_BRANCH_FILE"
  git -C "$REPO_DIR" checkout -b "$FEATURE_BRANCH"
  echo "Created feature branch: $FEATURE_BRANCH (base: $BASE_BRANCH)"
fi
# ── Setup ──────────────────────────────────────────────────────
mkdir -p "$LOG_DIR"
TIMESTAMP=$(date +%Y%m%d_%H%M%S)
# Build the prompt by substituting project paths into the template
PROMPT=$(sed \
  -e "s|{{DESIGN_FILE}}|$REL_DESIGN|g" \
  -e "s|{{PLAN_FILE}}|$REL_PLAN|g" \
  "$PROMPT_TEMPLATE")
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo " Ralph Loop — Phase-by-Phase Executor"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo " Project: $PROJECT_NAME"
echo " Design: $REL_DESIGN"
echo " Plan: $REL_PLAN"
echo " Branch: $FEATURE_BRANCH"
echo " Base: $BASE_BRANCH"
echo " Max: $MAX_ITERATIONS iterations"
echo " Logs: $LOG_DIR/"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""
# ── Main loop ──────────────────────────────────────────────────
ITERATION=0
while [ $ITERATION -lt $MAX_ITERATIONS ]; do
  ITERATION=$((ITERATION + 1))
  LOG_FILE="$LOG_DIR/${PROJECT_NAME}_${TIMESTAMP}_iteration_${ITERATION}.log"

  echo "┌──────────────────────────────────────────────"
  echo "│ Iteration $ITERATION / $MAX_ITERATIONS"
  echo "│ $(date '+%Y-%m-%d %H:%M:%S')"
  echo "└──────────────────────────────────────────────"

  # Run Claude Code with the prompt
  #   -p: headless/non-interactive mode
  #   --dangerously-skip-permissions: autonomous operation (use sandbox!)
  #   --model: use opus for complex reasoning
  #   --verbose: detailed logging
  OUTPUT=$(echo "$PROMPT" | claude -p \
    --dangerously-skip-permissions \
    --model claude-opus-4-6 \
    --verbose \
    2>&1 | tee "$LOG_FILE")

  # Check for all-done signal (stop the loop)
  if echo "$OUTPUT" | grep -q "$SIGNAL_ALL_DONE"; then
    echo ""
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
    echo " All phases complete! Finished after $ITERATION iterations."
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"

    # Merge feature branch back to base with --no-ff
    echo ""
    echo " Merging $FEATURE_BRANCH into $BASE_BRANCH..."
    PLAN_TITLE=$(head -1 "$PLAN_FILE" | sed 's/^#\+ //' | sed 's/ — .*//')
    git -C "$REPO_DIR" checkout "$BASE_BRANCH"
    git -C "$REPO_DIR" merge --no-ff "$FEATURE_BRANCH" \
      -m "ralph: $PLAN_TITLE"
    rm -f "$BASE_BRANCH_FILE"
    echo " Merged successfully. Feature branch '$FEATURE_BRANCH' preserved."
    echo ""
    echo " To delete the feature branch: git branch -d $FEATURE_BRANCH"
    echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
    exit 0
  fi

  # Check for phase-done signal (continue to next)
  if echo "$OUTPUT" | grep -q "$SIGNAL_PHASE_DONE"; then
    echo ""
    echo " Phase complete. Moving to next phase..."
    echo ""
  else
    # Neither signal found — something unexpected happened
    echo ""
    echo " WARNING: No completion signal detected in output."
    echo " Check log: $LOG_FILE"
    echo " Continuing to next iteration..."
    echo ""
  fi

  # Brief pause between iterations
  sleep $SLEEP_BETWEEN
done
echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo " Reached max iterations ($MAX_ITERATIONS). Stopping."
echo " Still on feature branch: $FEATURE_BRANCH"
echo " Re-run to continue, or merge manually."
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
exit 0

Ralph — Plan Creation Guide

Use this document as context when creating a new execution plan for the Ralph loop.

Plan File Location

ralph/projects/<project-name>/plan.md

A project also has a design.md when there is a standalone design specification. For smaller features, the plan itself may contain all the design context in a "Design Decisions" section.

Plan Structure

Every plan must include these sections in order:

1. Header

# <Project Title> — Execution Plan

> **Design document:** [design.md](./design.md)
> **Predecessor:** [link to prior plan if applicable]
> **Status:** Not started
> **Current phase:** Phase 0

2. How to Use This Plan (copy verbatim)

---

## How to Use This Plan

This plan is designed for the Ralph playbook (shell loop). Each phase:

1. Has **checkboxes** for every discrete task — mark `[x]` when done.
2. Has an **Observations** section — write notes, surprises, or decisions made during that iteration.
3. Is scoped so one phase fits comfortably in a single loop iteration.
4. Includes unit tests for all new logic introduced in that phase.
5. Ends with a **build + test gate** — confirm the project builds and all tests pass before moving on.

**After each loop iteration:** update the "Current phase" field at the top and record observations.

**Build + test gate (mandatory at the end of every phase):**
\`\`\`bash
npm run gulp -- build --agent && npm run gulp -- run:unit --agent
\`\`\`
A phase is **not complete** until both commands succeed and **all** unit tests pass (including pre-existing tests, not just new ones). If any test fails, fix it before marking the phase done. Do not proceed to the next phase with failing tests.

**Important:** When adding new `.cs` files, they must be registered in the `.csproj` as `<Compile Include="...">` entries (legacy-style project). Both `Plugin.csproj` and `UnitTests.csproj` need this.

3. Summary (optional)

Brief 2-3 sentence overview of what the plan accomplishes. Useful when the plan covers multiple related features.

4. Design Decisions / Resolved Questions (optional)

Document key technical choices, trade-offs, and lock analysis. If there's a separate design.md, this section captures decisions made during planning that aren't in the design doc.

5. Phases

Each phase follows this template:

---

## Phase N: <Short Title>

**Goal:** One sentence describing the phase's outcome.

### Tasks

- [ ] **N.1** <Task title>
  - File: `Plugin/Path/To/File.cs`
  - Description of what to do, with code snippets if helpful

- [ ] **N.2** <Next task>
  ...

- [ ] **N.X** Build + test gate: `npm run gulp -- build --agent && npm run gulp -- run:unit --agent` — all tests pass

### Observations

(To be filled during execution)

6. Files Changed Summary

---

## Files Changed Summary

This section tracks all files created or modified across all phases (to be updated during execution).

### New Files
| File | Phase | Purpose |
|------|-------|---------|

### Modified Files
| File | Phases | Changes |
|------|--------|---------|

Phase Design Guidelines

  • One phase = one loop iteration. Each phase must fit within a single Claude Code context window. Err on the side of smaller phases.
  • Build + test gate is the last task in every phase. Never skip it.
  • Tests belong in the same phase as the code they test. Don't defer tests to a later phase.
  • Each phase should leave the codebase in a green state. All existing tests continue to pass, all new code compiles.
  • Task numbering: <phase>.<task> (e.g., 2.3 = Phase 2, Task 3). Helps with referencing in observations.
  • Commit message prefix: Each phase's commit is prefixed with Phase N: (e.g., Phase 0: feat: Add retention cleanup). This is enforced by the PROMPT.md template — no action needed in the plan itself.
  • File paths in tasks: Always include the file path so the agent doesn't have to search for it.
  • Code snippets in tasks: Include implementation sketches for non-obvious logic. The agent can deviate if needed but has a starting point.
  • The final phase should always include a review/verification pass: code review, logging review, design compliance checklist, and a final build + test gate.

What Makes a Good Phase Boundary

Split on these natural boundaries:

  • Interface then implementation: Define the interface/contract in one phase, implement it in the next.
  • Core then consumers: Build the new class/module first, then migrate callers.
  • Infrastructure then features: Test helpers, configuration changes, or new abstractions first.
  • One concern per phase: Don't mix unrelated changes. If a phase touches both the data layer and the UI, consider splitting.

Common Pitfalls

  • Phase too large: If a phase has more than ~10 tasks, it's probably too big. Split it.
  • Missing test tasks: Every phase that adds logic needs test tasks. Don't assume the agent will write tests unprompted.
  • Implicit dependencies: If Phase 2 depends on a specific decision in Phase 1, document it explicitly. The agent has no memory between iterations — only the plan file and observations carry context forward.
  • Forgetting .csproj: New .cs files need <Compile Include> entries in the legacy-style .csproj. Include this as a reminder in the task or as part of the file creation task.
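
For reference, the registration entries look roughly like this (file names here are hypothetical, and the exact ItemGroup layout varies by project):

```xml
<!-- In Plugin.csproj (and likewise in UnitTests.csproj for test files) -->
<ItemGroup>
  <Compile Include="Data\PartitionManager.cs" />
  <Compile Include="Data\PartitionedConnection.cs" />
</ItemGroup>
```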

You are executing a phased implementation plan one phase at a time. Each time you are invoked, you complete exactly ONE phase, then stop.

Instructions

1. Read the plan and design document

Read these two files:

  • {{PLAN_FILE}} — the execution plan with phases, tasks, and observations
  • {{DESIGN_FILE}} — the design specification with full architectural details

The plan contains phases (e.g., "Phase 0", "Phase 1"), each with a Tasks section containing checkboxes (- [ ] = pending, - [x] = done) and an Observations section for notes.

2. Read prior observations

Each phase has an Observations section. Read ALL observations from completed phases — they contain discoveries, deviations, and context from prior iterations that you MUST account for.

3. Pick the next phase

Find the first phase that has any unchecked task (- [ ]). This is the phase you will work on. If every task in every phase is checked (- [x]), skip to step 7.
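
The agent does this by reading the plan, but mechanically "first phase with an unchecked task" just means the phase containing the file's first `- [ ]` line. A sketch of the equivalent lookup, demonstrated on an inline sample plan:

```shell
# Build a tiny sample plan, then locate its first unchecked task.
plan=$(mktemp)
cat > "$plan" <<'EOF'
## Phase 0: Setup
- [x] **0.1** Create files
- [x] **0.2** Build + test gate
## Phase 1: Core
- [ ] **1.1** Implement class
- [ ] **1.2** Build + test gate
EOF
# -F: literal match, -n: prefix line number, -m1: stop at first hit
next=$(grep -n -m1 -F -- '- [ ]' "$plan")
echo "$next"    # → 5:- [ ] **1.1** Implement class
rm -f "$plan"
```

The line number identifies where Phase 1 resumes; everything above it is already done.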

4. Execute the phase

Work through each unchecked task in the phase, in order. Check off each task (- [x]) as you complete it. Follow these rules:

  • Read before writing. Always read existing code before modifying it. Understand what's there.
  • Follow project conventions. Read CLAUDE.md files in relevant directories before making changes.
  • Consult the design document. The design spec in {{DESIGN_FILE}} has detailed implementation guidance — refer to it for architectural decisions, schema details, and code patterns.
  • Build and test. After making changes, build the project to verify compilation. Run unit tests if applicable.
  • One phase only. Do not work on subsequent phases. If you discover work that belongs to a later phase, note it in Observations instead.
  • No placeholders. Implement functionality completely. Stubs and TODOs waste future iterations.

5. Record observations

After completing all tasks in the phase, write notes in that phase's Observations section (replace the <!-- Agent: write notes here during execution --> comment). Include:

  • What was done (brief summary)
  • Any deviations from the plan and why
  • Discoveries that affect future phases
  • Files added or modified

6. Mark phase status and commit

  • Update the Status and Current phase fields at the top of the plan file
  • Verify git is on a named branch (not detached HEAD) — if detached, STOP and do not commit
  • Stage and commit all changes with a descriptive message prefixed with the phase number, e.g.: Phase 0: feat: Add on-demand retention cleanup when DaysToKeep changes
  • Do NOT push to remote

Then output the following signal on its own line:

RALPH_PHASE_COMPLETE

7. All phases done

If every task in every phase is already checked off (- [x]), output:

RALPH_ALL_COMPLETE

Do NOT make any changes. Just output the signal and stop.

Important context

  • Add any project-specific context here (build commands, conventions, files to avoid editing, etc.)

Ralph — Post-Execution Review Guide

Use this document as context when reviewing a completed Ralph plan execution. The review is the quality gate between autonomous execution and merging/deploying.

Review File Location

ralph/projects/<project-name>/review.md

The review is written after all plan phases are complete. It reads the design (design.md), the plan with its observations (plan.md), and the actual code changes to produce a verdict.

How to Request a Review

Point Claude at the completed plan and ask for a review:

Review the changes performed as part of the @ralph/projects/<project-name>/plan.md

Claude will read the design, plan, observations, all changed files, and the git diffs, then write the review to ralph/projects/<project-name>/review.md.

Review Methodology

The reviewer must perform these steps in order:

1. Read the inputs

  • Design document (design.md) — the source of truth for architectural intent, locking analysis, edge cases, and resolved questions.
  • Execution plan (plan.md) — task descriptions, code sketches, and the agent's observations from each phase.
  • All changed files — every file listed in the plan's "Files Changed Summary", both new and modified. Read the full file, not just diffs.
  • Git diffs — `git diff <before>...<after>` for modified files, to see exactly what changed relative to the pre-project state.

2. Verify design compliance

For each requirement, constraint, or design decision in the design document, verify the implementation matches. Check:

  • Lock ordering and acquisition — are locks acquired in the documented order? Are the right locks used (and unnecessary locks avoided)?
  • Conditional logic — do null checks, state guards, and edge case handling match the design's analysis?
  • Idempotency — are operations that should be idempotent actually safe to call multiple times?
  • Interaction between features — when the design discusses how two features interact (e.g., retention cleanup while suspended), verify both orderings produce correct results.
  • Startup and shutdown — do lifecycle methods handle all states the design describes?

Produce a Design Compliance Checklist table with one row per design requirement.

3. Review the code

Beyond design compliance, review for general code quality:

  • Correctness — logic errors, off-by-one, race conditions not covered by the design.
  • Error handling — are exceptions caught at the right level? Are resources cleaned up in finally blocks?
  • Logging — are log levels appropriate (Info for significant state changes, Debug for routine, Warn/Error for failures)?
  • Security — no SQL injection, no path traversal, no secrets in logs.
  • Consistency — does the new code follow existing patterns in the codebase?

4. Assess test coverage

Produce a Test Coverage Assessment table mapping each feature/behavior to its unit and integration tests. Identify any gaps — features with no test coverage or only partial coverage.

Look for:

  • Happy path — is the primary use case tested?
  • Edge cases — are boundary conditions and error paths tested?
  • Interactions — are feature combinations tested (e.g., both features changing in a single save)?
  • Test quality — do assertions test meaningful state, or are they tautological?
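To illustrate the test-quality point above, here is a hypothetical contrast between a tautological assertion and a meaningful one (the computed value is arbitrary):

```shell
#!/usr/bin/env bash
# Tautological vs. meaningful assertions, in miniature.
set -e
value="$(printf 'hello' | tr 'a-z' 'A-Z')"

[ "$value" = "$value" ]   # tautological: compares the result to itself,
                          # so it passes no matter what the code produced
[ "$value" = "HELLO" ]    # meaningful: pins the result to an expectation
                          # derived independently of the code under test
echo "both checks passed"
```

A reviewer should flag assertions of the first kind: they inflate coverage numbers without verifying any behavior.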

5. Review agent observations

Read the Observations sections in the plan. Look for:

  • Deviations from the plan — did the agent change the approach? Was the deviation justified?
  • Surprises or workarounds — do these indicate a design gap that should be documented?
  • Skipped tasks — was anything marked done but not actually implemented?

Review Document Structure

# <Project Title> — Code Review

> **Design document:** [design.md](./design.md)
> **Plan document:** [plan.md](./plan.md)
> **Reviewer:** <model name>
> **Date:** YYYY-MM-DD
> **Scope:** All changes from commits `<first>` through `<last>` (N commits)
> **Verdict:** <one-sentence summary>

---

## Summary

2-3 paragraphs summarizing what was implemented and whether it matches the design.

---

## Findings

### Critical

Findings that must be fixed before merging. Issues that could cause data loss,
crashes, security vulnerabilities, or fundamentally incorrect behavior.

### Important

Findings that should be fixed but don't block merging. Bugs that only manifest
in edge cases, incorrect behavior that doesn't affect users in practice,
or issues that could cause test flakiness.

### Trivial

Style issues, misleading names, minor code duplication, imprecise log messages.
Fix if convenient, skip if not worth the churn.

---

## Design Compliance Checklist

| Design requirement | Status | Notes |
|---|---|---|
| ... | Correct / Incorrect / Partial | ... |

## Test Coverage Assessment

| Feature | Unit tests | Integration tests |
|---|---|---|
| ... | test name(s) or — | test name(s) or — |

Severity Guidelines

Critical

  • Data loss or corruption under normal usage
  • Deadlock or livelock possibility
  • Security vulnerability (injection, path traversal, credential exposure)
  • Fundamentally wrong algorithm (e.g., lock not acquired where design requires it)
  • Build or test failures not caught by the agent

Important

  • Bug that manifests only under specific timing or edge conditions
  • Missing error handling that could cause an unhandled exception in production
  • Test that passes but doesn't actually verify what it claims
  • Timezone, locale, or platform-specific issues that cause flaky tests
  • Design deviation that wasn't justified in observations

Trivial

  • Misleading variable/method/test names
  • Duplicated helper code across test files
  • Log messages that are slightly inaccurate in edge cases
  • Missing assertions in tests where the core behavior is still verified
  • Minor style inconsistencies with existing code

What the Review Is NOT

  • Not a re-design. The review checks whether the implementation matches the design. If the design itself was wrong, that's a separate conversation — note it as a finding but don't redesign in the review.
  • Not exhaustive fuzzing. The review is a human-level code review, not a formal verification. Focus on the most likely failure modes.
  • Not a blocker for trivial findings. Trivial findings are documented for completeness but don't require action. The user decides what to fix.