| name | description | triggers | ||||||
|---|---|---|---|---|---|---|---|---|
pr-review-batch |
IronClaw maintainer PR review -- batch review open PRs against ironclaw project standards (Rust, WASM tools, dual-backend DB, security-first) |
|
Maintainer review workflow for the nearai/ironclaw repository. Optimized for batch review with parallel data fetching, security-first evaluation against IronClaw's Rust/WASM architecture, and structured GitHub review comments.
- Repository: nearai/ironclaw
- Maintainer GitHub: zmanian
- Primary language: Rust (async tokio, wasmtime, axum)
- Key subsystems: WASM tool sandbox, dual-backend DB (postgres + libsql), LLM provider decorator chain, multi-channel system, SKILL.md skills, registry/installer
- CI jobs that matter: Formatting, Clippy (default, all-features, libsql-only, Windows), Regression test enforcement
- CI jobs that DON'T prove much: classify, scope (these always pass, even on fork PRs with no secrets)
Extract PR numbers from the user's message. Accept formats:
- "Review 938, 933, 918"
- "Review #834 and #922"
- "Review all open PRs" (use
gh pr list --state open --limit 30)
For EACH PR, fetch all of these in parallel:
# Metadata: title, author, base/head branch, size
gh pr view <N> --json title,author,state,headRefName,baseRefName,additions,deletions,changedFiles \
--jq '{title, author: .author.login, state, base: .baseRefName, head: .headRefName, additions, deletions, changedFiles}'
# Full diff
gh pr diff <N> --patch
# CI status
gh pr checks <N>
# Previous reviews (for re-reviews)
gh pr view <N> --json reviews --jq '.reviews[] | {author: .author.login, state: .state, body: .body[:200]}'For large diffs (>1000 lines), use gh pr diff <N> --patch | head -500 first, then fetch remaining sections as needed. Note Cargo.lock churn separately -- don't count it as meaningful diff.
Check in this priority order:
- All checks must pass -- not just classify/scope. Must have: Formatting, Clippy (all 3 feature combos), Regression test enforcement.
- Fork PRs (critical gotcha): Only classify/scope run because GitHub Actions secrets aren't available for fork PRs. The PR will APPEAR to have passing checks. Never trust this. Flag it -- local CI verification or maintainer-triggered re-run required before merge.
- Check if zmanian already reviewed -- if so, this is a re-review
- For re-reviews: verify each previous feedback item was addressed, referencing specific commit hashes
- Note reviews from Gemini, Copilot -- cross-reference their findings but don't trust blindly
- Identity file write protection: PROTECTED_IDENTITY_FILES (AGENTS.md, SOUL.md, USER.md, IDENTITY.md) must not become LLM-writable
- Tool approval requirements: ApprovalRequirement changes (Never vs UnlessAutoApproved vs Always) -- especially for tools that cross trust boundaries (tool_install, tool_auth, build_tool, shell)
- WASM sandbox boundaries: fuel limits, memory limits, network allowlists must not be weakened
- Credential handling: no secrets in logs/errors/SSE events; use
redact_params()before broadcast - SSRF vectors: URL validation must resolve DNS before checking for private/loopback IPs
- Prompt injection defense: sanitizer/validator/policy changes in
src/safety/
- No
.unwrap()/.expect()in production code (tests are fine) - String safety: no byte-index slicing (
&s[..n]) on user/external strings -- useis_char_boundary()orchar_indices() - Dual-backend DB: new persistence features must support BOTH postgres AND libsql. Check for missing trait implementations.
- Feature flags: changes must compile under
--no-default-features --features libsql, default, and--all-features - Transaction safety: multi-step DB operations wrapped in transactions (both backends)
- LLM provider decorator chain: new
LlmProvidertrait methods must be delegated in ALL wrapper types (grepimpl LlmProvider for)
crate::for cross-module imports (notsuper::except tests and intra-module)thiserrorfor error types inerror.rs; map errors with context via.map_err()- Strong types over strings (enums, newtypes)
- Module specs followed -- if a module has a CLAUDE.md (agent, web, db, llm, setup, tools, workspace), check it
- Module-owned initialization: init logic lives in owning module as public factory fn, not in main.rs/app.rs
- No unnecessary dependencies (check
~/.claude/approved-dependencies.mdlist)
- Bug fixes MUST have regression tests (enforced by CI regression-check job and commit-msg hook)
- Tests use
tempfilecrate, not hardcoded/tmp/paths - No real network requests in tests (use mocks or RFC 5737 TEST-NET IPs like 192.0.2.1)
- Test names and comments match actual test behavior and assertions
[skip-regression-check]in commit message or PR label only if genuinely not feasible
| Verdict | Criteria |
|---|---|
| APPROVE | Clean, follows IronClaw patterns, full CI green, no security issues, tests present |
| REQUEST CHANGES | Security regressions, functional bugs, .expect() in production, trust boundary violations, missing dual-backend support, missing error handling |
| COMMENT | Good direction but needs discussion, or already approved with observations |
Post reviews via gh pr review using HEREDOC for body formatting.
## Review: <short summary of what PR does>
<1-2 sentence assessment>
Positives:
- <what works well>
- <pattern compliance>
### <Severity>: <issue title>
<Detailed explanation>
### <Severity>: <issue title>
<Detailed explanation>
Minor notes:
- <non-blocking observation>
<Concrete suggestion if requesting changes>
Severity levels: Critical, Concerning, Minor (non-blocking)
## Re-review: <status summary>
All/N items from my previous review have been resolved:
1. **<item>** -- Fixed in commit <hash>. <What changed>.
2. **<item>** -- Fixed. <Details>.
<Additional observations if any>
LGTM.
GitHub 502s are common during batch posting. Retry with sleep 5 between attempts. Post reviews sequentially (not in parallel) to avoid rate limits.
After all reviews are posted, provide a summary table:
| PR | Title | Verdict |
|----|-------|---------|
| #938 | Z.AI provider | Approved |
| #933 | channels list CLI | Approved |
| #918 | skills CLI | Approved |
Note cross-PR conflicts (e.g., PRs that both modify src/cli/mod.rs and snapshot files).
Only classify/scope CI jobs run. Never merge with only these passing. Either:
- Run local CI:
cargo check --all-features && cargo clippy --all && cargo test - Or trigger full CI by pushing a maintainer commit to the PR branch
- Verify artifact URLs match the naming convention:
<kind>-<name>-<version>-wasm32-wasip2.tar.gz - Check SHA256 checksums against actual release assets
- Ensure
namefield in manifest matches crate_name in source config - Cross-reference with
.github/workflows/release.ymlfor automated patching
When a contributor PR has too many issues:
- Create new branch from staging
- Cherry-pick or apply the contributor's changes
- Fix the issues
- Create superseding PR referencing the original
When PRs are related (e.g., all touch registry manifests, or both modify cli/mod.rs), post context comments on each explaining how they fit together and merge ordering.
When the user says "merge" after reviews, use gh pr merge <N> --squash for each approved PR. Verify CI is still green before each merge.