@DMontgomery40
Created March 13, 2026 09:27
Frontend review loop skill bundle
interface:
  display_name: "Frontend Code Review"
  short_description: "Catch UI bugs, races, and weak frontend tests"
  default_prompt: "Use $frontend-code-review to review the frontend diff for concrete UI correctness issues, state or async races, interaction bugs, accessibility regressions, and fake or weak tests."

Browser And Runtime Proof

Netlify / Remote Truth Rule

If the repo contains netlify.toml and the changed behavior depends on Netlify Functions, redirects, auth/cookies, serverless env, deployed routing, or other platform wiring, localhost is not sufficient proof by default.

  • Require at least one deployed preview or production URL verification path when available.
  • Localhost remains acceptable for pure client-rendering, static UI, or isolated component checks that do not depend on Netlify runtime behavior.
  • If a deployed URL is required but not discoverable from the repo or environment, say that local verification is incomplete rather than over-claiming confidence.
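The rule above can be sketched as a small decision helper. This is a minimal TypeScript sketch, not part of the skill's tooling: `ChangeTraits` is a hypothetical summary of the diff supplied by the reviewer, and only the `netlify.toml` check mirrors real repo state.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Hypothetical summary of what the changed behavior depends on.
type ChangeTraits = {
  platformWiring: boolean; // functions, redirects, auth/cookies, serverless env, deployed routing
};

// Localhost is insufficient proof only when the repo is Netlify-backed AND the
// change touches platform wiring; pure client rendering stays local-verifiable.
function requiresDeployedProof(repoRoot: string, traits: ChangeTraits): boolean {
  const hasNetlify = fs.existsSync(path.join(repoRoot, "netlify.toml"));
  return hasNetlify && traits.platformWiring;
}
```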

Browser / Tool Surface Matrix

  • Repo Playwright CLI: default when the repo already has Playwright config or scripts and the goal is durable regression coverage that should live in the repo or CI.
  • Codex native Playwright MCP: default for quick reproduction, DOM inspection, screenshots, and one-off browser smoke checks during investigation.
  • Playwright via Codemode MCP / code execution: use when the browser flow needs programmable orchestration, loops, or combined automation across multiple tool surfaces.
  • Playwright CLI via Codemode: use when the repo's own Playwright command is still the source of truth, but execution is being orchestrated from a Codemode workflow.
  • macOS automation via Codemode: use only for OS-level gaps that Playwright cannot honestly cover, such as native file pickers, permission dialogs, downloads, app switching, or browser profile handling.
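The matrix above can be restated as a selection function. This is an illustrative sketch only: `FlowNeeds` is a hypothetical summary of the verification task, not a real API of any of the tools named.

```typescript
// Hypothetical flags describing what the browser flow needs.
type FlowNeeds = {
  osLevel: boolean;           // native dialogs, downloads, app switching
  programmable: boolean;      // loops or multi-tool orchestration
  durableRegression: boolean; // coverage that should live in repo/CI
};

// Most specialized need wins; quick one-off repro is the fallthrough default.
function pickBrowserSurface(needs: FlowNeeds): string {
  if (needs.osLevel) return "macOS automation via Codemode";
  if (needs.programmable) return "Playwright via Codemode MCP";
  if (needs.durableRegression) return "repo Playwright CLI";
  return "Codex native Playwright MCP";
}
```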

Verification Ordering

Use the lightest honest proof surface first:

  1. local smoke for quick reproduction or static UI validation
  2. repo-native regression command for durable coverage
  3. deployed preview or production verification when platform/runtime behavior matters

Do not sign off on a platform-bound UI or function bug with only component tests or localhost smoke if the real risk lives at the deployed edge.
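The ordering can be sketched as a fallthrough ladder. The flags here are assumptions about the change being reviewed, not values derived from any real tooling:

```typescript
// Build the proof ladder for a change: lightest honest surface first,
// heavier surfaces appended only when the change actually needs them.
function proofLadder(change: { staticUiOnly: boolean; hasRepoSuite: boolean; platformBound: boolean }): string[] {
  const steps = ["local smoke"];
  if (change.hasRepoSuite) steps.push("repo-native regression command");
  if (change.platformBound && !change.staticUiOnly) steps.push("deployed preview or production verification");
  return steps;
}
```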

Test Expansion And Verification

Generalized Test Requirement

When you fix a frontend bug you found during review, the default expectation is bug repro test + category test, not either/or.

  • Start with the nearest existing behavioral suite and extend it before creating a one-off new file.
  • Prefer broader coverage shapes such as invariant, matrix, state-transition, property, idempotency, contract, or multi-case interaction tests when they fit the stack.
  • If the repo is thin on tests, add the smallest honest test in the repo's existing stack rather than introducing a large new framework.
  • A narrow regression-only test is incomplete when the real failure was about sorting, filtering, pagination, loading state, retries, optimistic updates, cache invalidation, or keyboard and focus state.

Test Placement Heuristic

  • Extend existing shared suites first, especially list/table, form, state, routing, store, or query-behavior suites that already cover nearby interactions.
  • For React or Vite UI logic, prefer existing Vitest, Testing Library, or state/render suites first.
  • For browser flow regressions, prefer the repo's existing Playwright suite first.
  • For Streamlit or dev-tool state bugs, prefer pytest or the existing app-level harness first.
  • Avoid tests that only pin a selector, one literal string, or one callback fire when the real risk is sorting, pagination, retries, loading states, cache invalidation, or state transitions.
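As a contrast to selector-pinning, here is a sketch of the kind of behavior worth asserting across transitions. The pagination helper is hypothetical; the point is that its test should cover the clamping states, not one literal page number:

```typescript
// Hypothetical zero-based pagination helper under review.
function nextPage(current: number, pageCount: number): number {
  // Clamp to the last valid page; an empty result set stays on page 0.
  return Math.min(current + 1, Math.max(pageCount - 1, 0));
}
```

A weak test would assert only `nextPage(0, 3) === 1`; the state-transition shape also checks the last-page clamp and the empty-set case.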

Turn-End Verification Gate

For mutating review turns, broader test coverage and verification are part of completion.

  • Before ending the turn, run the narrowest changed-surface tests plus the repo's standard verify, build, or quality-gate command if one exists.
  • If the closest repo AGENTS.md or CLAUDE.md requires a full suite, obey the repo rule instead of this default.
  • Do not treat a fix as done if it only added a hyper-specific regression test where broader category coverage was realistically possible.
  • For review-only turns, still run the narrowest non-mutating validation that proves the finding when feasible.
  • If verification could not be completed, say exactly what was blocked and what remains unverified.
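The gate can be sketched as a lookup over `package.json` scripts. This mirrors the probe script's verify-key list below, but `npm` as the runner is an assumption, not a repo fact:

```typescript
// Derive turn-end commands from a package.json "scripts" map: the changed-surface
// test command first, then any standard verify/build/quality-gate scripts.
function turnEndGate(scripts: Record<string, string>): string[] {
  const gate: string[] = [];
  if (scripts["test"]) gate.push("npm test");
  for (const key of ["verify", "build", "lint", "check"]) {
    if (scripts[key]) gate.push(`npm run ${key}`);
  }
  return gate;
}
```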
#!/usr/bin/env python3
from __future__ import annotations

import argparse
import json
import subprocess
import sys
from pathlib import Path

TEST_HINTS = (
    ".test.",
    ".spec.",
    "_test.py",
    "test_",
)
VERIFY_SCRIPT_KEYS = ("verify", "build", "lint", "check")
TEST_SCRIPT_KEYS = ("test", "test:e2e", "test:unit", "test:ui", "test:fullstack", "test:mocked")


def run(cmd: list[str], cwd: Path) -> str:
    try:
        return subprocess.check_output(cmd, cwd=str(cwd), stderr=subprocess.DEVNULL, text=True).strip()
    except Exception:
        return ""


def git_root(path: Path) -> Path:
    root = run(["git", "rev-parse", "--show-toplevel"], path)
    return Path(root) if root else path


def load_package_json(path: Path) -> dict:
    try:
        return json.loads(path.read_text())
    except Exception:
        return {}


def tracked_tests(repo_root: Path) -> list[str]:
    output = run(["git", "ls-files"], repo_root)
    if not output:
        return []
    files = output.splitlines()
    return [f for f in files if any(hint in f for hint in TEST_HINTS) or "/tests/" in f or f.startswith("tests/")]


def gather_commands(pkg_path: Path, label: str) -> tuple[list[str], list[str]]:
    data = load_package_json(pkg_path)
    scripts = data.get("scripts", {})
    test_cmds = []
    verify_cmds = []
    for key, value in scripts.items():
        rendered = f"{label}: {key} -> {value}"
        if key in TEST_SCRIPT_KEYS or "test" in key:
            test_cmds.append(rendered)
        elif any(token in key for token in VERIFY_SCRIPT_KEYS):
            verify_cmds.append(rendered)
    return test_cmds, verify_cmds


def pyproject_markers(path: Path) -> list[str]:
    if not path.exists():
        return []
    text = path.read_text(errors="ignore")
    markers = []
    for token in ("pytest", "streamlit", "playwright", "vitest"):
        if token in text:
            markers.append(token)
    return markers


def main() -> int:
    parser = argparse.ArgumentParser(description="Inspect repo review/test surfaces quickly.")
    parser.add_argument("repo", nargs="?", default=".", help="Repo path to inspect")
    parser.add_argument("--json", action="store_true", help="Emit JSON instead of markdown")
    args = parser.parse_args()
    start = Path(args.repo).expanduser().resolve()
    repo_root = git_root(start)
    package_candidates = [
        repo_root / "package.json",
        repo_root / "frontend" / "package.json",
        repo_root / "web" / "package.json",
        repo_root / "apps" / "web" / "package.json",
    ]
    package_candidates = [p for p in package_candidates if p.exists()]
    test_commands: list[str] = []
    verify_commands: list[str] = []
    deps: set[str] = set()
    for pkg in package_candidates:
        label = str(pkg.relative_to(repo_root))
        data = load_package_json(pkg)
        deps.update(data.get("dependencies", {}).keys())
        deps.update(data.get("devDependencies", {}).keys())
        tests, verify = gather_commands(pkg, label)
        test_commands.extend(tests)
        verify_commands.extend(verify)
    pyproject = repo_root / "pyproject.toml"
    pytest_ini = repo_root / "pytest.ini"
    py_markers = pyproject_markers(pyproject)
    if pytest_ini.exists() and "pytest" not in py_markers:
        py_markers.append("pytest")
    has_netlify = (repo_root / "netlify.toml").exists() or (repo_root / "netlify" / "functions").exists()
    has_playwright = (
        any((repo_root / name).exists() for name in ("playwright.config.ts", "playwright.config.js", "playwright.config.mjs"))
        or "@playwright/test" in deps
        or "playwright" in py_markers
    )
    has_vitest = any((repo_root / name).exists() for name in ("vitest.config.ts", "vitest.config.js", "vitest.config.mjs")) or "vitest" in deps
    has_jest = any((repo_root / name).exists() for name in ("jest.config.ts", "jest.config.js", "jest.config.mjs")) or "jest" in deps
    has_pytest = "pytest" in py_markers
    has_streamlit = "streamlit" in py_markers
    tracked = tracked_tests(repo_root)
    detected_surfaces = []
    if has_playwright:
        detected_surfaces.append("playwright")
    if has_vitest:
        detected_surfaces.append("vitest")
    if has_jest:
        detected_surfaces.append("jest")
    if has_pytest:
        detected_surfaces.append("pytest")
    if has_streamlit:
        detected_surfaces.append("streamlit")
    notes = []
    if has_netlify:
        notes.append("Netlify markers detected; deployed preview/prod proof may be required for functions, redirects, auth, or cookie behavior.")
    if has_playwright:
        notes.append("Repo Playwright CLI is available for durable browser regression coverage.")
    if not tracked:
        notes.append("Little or no tracked test coverage detected; prefer the thinnest honest addition in the repo's existing stack.")
    if not detected_surfaces:
        notes.append("No obvious automated test harness detected from common manifests/configs.")
    result = {
        "repo_root": str(repo_root),
        "has_netlify": has_netlify,
        "detected_test_surfaces": detected_surfaces,
        "test_commands": test_commands,
        "verify_commands": verify_commands,
        "tracked_test_file_count": len(tracked),
        "tracked_test_examples": tracked[:12],
        "notes": notes,
    }
    if args.json:
        print(json.dumps(result, indent=2))
        return 0
    print(f"Repo root: {result['repo_root']}")
    print(f"Netlify: {'yes' if has_netlify else 'no'}")
    print(f"Detected test surfaces: {', '.join(detected_surfaces) if detected_surfaces else 'none-obvious'}")
    print(f"Tracked test files: {len(tracked)}")
    if tracked:
        print("Tracked test examples:")
        for item in tracked[:12]:
            print(f"- {item}")
    if test_commands:
        print("Test commands:")
        for item in test_commands:
            print(f"- {item}")
    if verify_commands:
        print("Verify/build commands:")
        for item in verify_commands:
            print(f"- {item}")
    if notes:
        print("Notes:")
        for item in notes:
            print(f"- {item}")
    return 0


if __name__ == "__main__":
    sys.exit(main())
#!/usr/bin/env bash
set -euo pipefail

SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
SKILL_DIR="$(cd "$SCRIPT_DIR/.." && pwd)"
SKILL_NAME="$(basename "$SKILL_DIR")"
CODEX_DIR="${CODEX_SKILL_DIR:-$HOME/.codex/skills/$SKILL_NAME}"
CLAUDE_DIR="${CLAUDE_SKILL_DIR:-$HOME/.claude/skills/$SKILL_NAME}"

usage() {
  cat <<EOF
Sync shared review skill files between Codex and Claude skill folders.

Usage:
  sync_codex_claude.sh [auto|codex-to-claude|claude-to-codex|status]

Environment overrides:
  CODEX_SKILL_DIR   default: ~/.codex/skills/$SKILL_NAME
  CLAUDE_SKILL_DIR  default: ~/.claude/skills/$SKILL_NAME
EOF
}

mtime() {
  local path="$1"
  if stat -f "%m" "$path" >/dev/null 2>&1; then
    stat -f "%m" "$path"  # BSD/macOS stat
  else
    stat -c "%Y" "$path"  # GNU stat
  fi
}

latest_mtime() {
  local dir="$1"
  local latest=0
  while IFS= read -r -d '' file; do
    local file_mtime
    file_mtime="$(mtime "$file")"
    if (( file_mtime > latest )); then
      latest="$file_mtime"
    fi
  done < <(find "$dir" -type f ! -path "*/agents/*" ! -name ".DS_Store" -print0)
  echo "$latest"
}

sync_one_way() {
  local src="$1"
  local dst="$2"
  mkdir -p "$dst"
  rsync -a --exclude "agents/" --exclude ".DS_Store" "$src/" "$dst/"
}

mode="${1:-auto}"
case "$mode" in
  auto|codex-to-claude|claude-to-codex|status) ;;
  -h|--help|help)
    usage
    exit 0
    ;;
  *)
    usage >&2
    exit 1
    ;;
esac

mkdir -p "$CODEX_DIR" "$CLAUDE_DIR"
codex_latest="$(latest_mtime "$CODEX_DIR")"
claude_latest="$(latest_mtime "$CLAUDE_DIR")"

if [[ "$mode" == "status" ]]; then
  echo "Codex latest mtime: $codex_latest"
  echo "Claude latest mtime: $claude_latest"
  if (( codex_latest == claude_latest )); then
    echo "Status: likely in sync"
  elif (( codex_latest > claude_latest )); then
    echo "Status: Codex appears newer"
  else
    echo "Status: Claude appears newer"
  fi
  exit 0
fi

if [[ "$mode" == "auto" ]]; then
  if (( codex_latest == claude_latest )); then
    echo "No sync needed: mtimes match."
    exit 0
  elif (( codex_latest > claude_latest )); then
    mode="codex-to-claude"
  else
    mode="claude-to-codex"
  fi
fi

if [[ "$mode" == "codex-to-claude" ]]; then
  sync_one_way "$CODEX_DIR" "$CLAUDE_DIR"
  echo "Synced Codex -> Claude for $SKILL_NAME"
else
  sync_one_way "$CLAUDE_DIR" "$CODEX_DIR"
  echo "Synced Claude -> Codex for $SKILL_NAME"
fi
---
name: frontend-code-review
description: Use when the user wants a frontend-focused, bug-first review of UI code or tests, especially in TypeScript or JavaScript React and Vite apps, with light Streamlit coverage. Focus on concrete correctness issues in rendering, state, interactions, async flows, accessibility, visual behavior, and fake or weak frontend tests.
---

Frontend Code Review

When To Use

Use this skill when the user asks for any of these:

  • a frontend-specific or UI-focused code review
  • a deep review of React, TypeScript, JavaScript, or Vite client code
  • a bug hunt in components, hooks, stores, routes, forms, tables, or interaction flows
  • a review of frontend tests for fake coverage, broken async assertions, or weak browser realism
  • a review of Streamlit UI code used for internal tools or dev dashboards

Do not use this for visual polish alone, backend-only review, or generic architecture discussion.

Outcome

Produce a findings-first review of user-visible correctness, state flow, and interaction behavior.

If you fix findings in-turn, leave behind honest verification and broader coverage for the bug family rather than a single selector-level regression.

Review Target

Default to the most relevant bounded change surface first:

  1. active PR diff, if one exists
  2. current branch diff against the remote default branch
  3. current worktree changes
  4. repo-wide review only when the user explicitly asks for that, or when the changed diff is too small to explain the real UI risk

If the diff is tiny but touches a shared component, hook, store, query layer, router boundary, or table/form primitive used widely across the app, widen the review to that subsystem and its adjacent tests. State that assumption.

Workflow

  1. Inspect the review surface before reading code. Check branch, base branch, diff stats, changed files, route boundaries, shared components, hooks, stores, tests, and browser automation coverage. If the repo shape is unclear, run scripts/review_surface_probe.py <repo-root> first.

  2. Read the highest-risk user flows first. Prioritize tables and lists, sorting and filtering, forms, buttons, dialogs, navigation, auth-gated UI, optimistic updates, file uploads, drag-and-drop, and loading or error states.

  3. Trace the data lifecycle end to end. Follow data from input or URL state to fetch, transform, cache, render, user action, mutation, invalidation, refetch, and visible UI outcome.

  4. Keep going past 3 findings. Continue reviewing until additional passes stop producing concrete new bugs or regressions.

  5. Confirm findings from code and honest runtime behavior. Prefer bugs you can trace through the implementation and realistic UI proof.

  6. Review the tests as part of the implementation. Missing coverage is a finding when a risky interaction, async boundary, or error state is untested.

  7. If you fix a finding in-turn, leave broader behavior coverage behind. The default is bug repro test + category test, extending the nearest shared suite first.

  8. Run turn-end verification before you stop. Use the narrowest changed-surface tests plus the repo's standard verify/build gate when one exists.

What To Look For

  • lists and tables that sort, filter, search, paginate, or populate incorrectly
  • buttons and forms that double-submit, stay disabled, lose state, or report false success
  • stale closures, missing cleanup, or out-of-order async updates
  • optimistic UI or cache invalidation paths that leave visible state wrong
  • loading, error, empty, permission, or offline states that the UI handles badly
  • keyboard, focus, or accessibility regressions that break the main flow
  • browser or layout regressions that only show up in real interaction
  • tests that pass while the real UI is broken because they assert implementation trivia
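The out-of-order async item above is worth a concrete sketch. This is an illustrative pattern, not code from any reviewed app: without the request-id guard, a slow earlier response overwrites a newer one and the list shows stale results.

```typescript
// Search store guarding against out-of-order responses. "makeSearch" and its
// shape are hypothetical; only the guard pattern itself is the point.
function makeSearch(fetcher: (q: string) => Promise<string[]>) {
  let latest = 0;
  let results: string[] = [];
  return {
    async search(q: string): Promise<void> {
      const id = ++latest;
      const rows = await fetcher(q);
      if (id === latest) {
        results = rows; // only the newest in-flight request may land
      }
    },
    get results(): string[] {
      return results;
    },
  };
}
```

Deleting the `id === latest` check is exactly the bug class a weak selector-pinning test will miss, because any single-request test still passes.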

Output Format

Findings first, ordered by severity.

For each finding:

  • start with a severity tag like [P1]
  • name the concrete user-visible problem
  • explain the trigger, impact, and why it happens
  • cite exact file and line references

After findings, include:

  • any fake or weak frontend tests you fixed immediately, with a short note on what changed
  • brief open questions or assumptions
  • a short note on residual risks or UI paths you were unable to verify

If there are no findings, say "No findings." explicitly, then mention any residual risk areas you were unable to exercise.

Severity

  • P1: likely to break a core user workflow, hide or corrupt visible state, create a serious accessibility or security issue, or invalidate a critical frontend feature
  • P2: important correctness, async, interaction, layout, or browser-behavior problem with a realistic failure mode
  • P3: lower-severity correctness issue, sharp edge, or meaningful test gap

References

Load these when needed:

  • references/test-expansion-and-verification.md: use for generalized frontend coverage, suite placement, and turn-end verification expectations.
  • references/browser-and-runtime-proof.md: use when Netlify-backed behavior, deployed truth, browser selection, or macOS automation choices matter.

Operating Rules

  • Prefer concrete UI bugs, regressions, and missing behavior coverage over style commentary.
  • Do not pad the review with weak or duplicate findings just to make it longer.
  • Do not hide additional findings because an internal tool would have stopped at 3.
  • For fake or checkbox tests, prefer fixing them in the same turn over merely filing them as findings.
  • Repo-local instructions override this skill when they are stricter.