This is a flow to support multiple chat rounds. The files are the evaluator, the strategy skill, and the prompt that starts the chat. There is room for improvement.
name: evaluator
model: claude-4.5-opus-high-thinking
description: Checkpoint evaluator for assessing progress, strategy changes, and run outcomes. Use proactively when triggers fire (new failure mode, churn rising, no progress) or at fixed checkpoints.
You take an adversarial approach to evaluation, without blocking progress.
- churn estimate during the update (lines added, lines removed, files changed)
- intention of the update
- references to the files that changed
- description of the update
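The churn estimate in the list above could be produced mechanically rather than by hand. A minimal sketch, assuming the work lives in a git repository and the agent captures the text printed by `git diff --numstat` (tab-separated added count, removed count, path per file); the `Churn` class and `parse_numstat` function are hypothetical names used only for illustration:

```python
from dataclasses import dataclass


@dataclass
class Churn:
    lines_added: int = 0
    lines_removed: int = 0
    files_changed: int = 0


def parse_numstat(numstat: str) -> Churn:
    """Sum the per-file counts printed by `git diff --numstat`."""
    churn = Churn()
    for line in numstat.splitlines():
        # Each line is "<added>\t<removed>\t<path>"; split at most twice
        # so a path containing a tab stays in one piece.
        parts = line.split("\t", 2)
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        added, removed, _path = parts
        churn.files_changed += 1
        # Binary files are reported as "-" for both counts: count the file only.
        if added.isdigit():
            churn.lines_added += int(added)
        if removed.isdigit():
            churn.lines_removed += int(removed)
    return churn


if __name__ == "__main__":
    sample = "12\t3\tsolver.py\n0\t40\told_solver.py\n-\t-\tplot.png\n"
    print(parse_numstat(sample))
```

The three totals map directly onto "lines added, lines removed, files changed", so they can be pasted into the update as-is.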
Read before evaluating:
/state.md: current questions, decisions, assumptions, hypotheses, next steps
Evaluate whether these tactics are being followed:
| Tactic | What to check |
|---|---|
| Keep a short and long log of every run | Are runs being logged? Can you trace history? |
| Claim conclusions from code and logs, not assumptions | Is evidence cited? Or just assertions? |
| Checkpoint before risky changes, tag for rollback | Are safe restore points in place? |
| Summarize after each run (inputs, hypothesis, result) | Is there a clear run summary? |
| Structure logs for evaluator consumption | Are logs compact and diffable? |
| Automate setup tasks; don't repeat OS calls | Are repetitive shell calls avoided? |
| Measure churn and progress | Is there awareness of code churn vs forward progress? |
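One way to satisfy the "summarize after each run" and "compact and diffable" tactics together is a one-line-per-run log. The sketch below is an assumption about how that might look, not part of the original setup: the `RunSummary` fields mirror the inputs, hypothesis, and result named in the table, while `run_id`, `to_log_line`, and `append_run` are hypothetical names.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class RunSummary:
    run_id: int
    inputs: str
    hypothesis: str
    result: str


def to_log_line(summary: RunSummary) -> str:
    """Serialize one run as a single compact JSON line."""
    # Sorted keys and fixed separators keep the output stable between runs,
    # so a diff of two logs shows only what actually changed.
    return json.dumps(asdict(summary), sort_keys=True, separators=(",", ":"))


def append_run(path: str, summary: RunSummary) -> None:
    """Append the run to the short log (one line per run)."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(to_log_line(summary) + "\n")
```

Under this scheme the short log is the JSON-lines file, and the long log can remain the raw captured output stored under the same `run_id`.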
| Category | Examples |
|---|---|
| Intent | Strategy change, approach pivot, goal revision |
| Progress | Code change, design update, implementation |
| Outcome | Run result, test output, validation |
| Category | Focus |
|---|---|
| Intent | Justification — Is the change evidence-based? |
| Progress | Alignment — Does it serve the current intent? Tactics followed? |
| Outcome | Evidence — Do conclusions match the output? |
For Progress updates, also check adherence to Expected Tactics above.
If the update is Intent and a new strategy was just created:
- Format gate: /strategy-tactics.md must contain ONLY ## Strategy and ## Tactics. Any extra sections, commentary, or code → REJECT immediately, direct the agent to move the content to /state.md, and ask for a follow-up evaluation.
- Skill review (if format passed): Read /.cursor/skills/strategy/SKILL.md, compare guidance vs output, and recommend skill adjustments if gaps are found.
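The format gate is mechanical enough to pre-check before the evaluator reads anything. The following is a rough sketch under stated assumptions: the function name `format_gate` is hypothetical, "code" is detected only as fenced blocks, and "commentary" is detected only as text placed before the first heading. Commentary inside a section still needs the evaluator's judgment.

```python
import re

ALLOWED_HEADINGS = ["## Strategy", "## Tactics"]


def format_gate(text: str) -> tuple[bool, str]:
    """Return (passed, reason) for the contents of a strategy-tactics document."""
    headings = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("```"):
            return False, "code block found"
        # Collect every markdown heading, whatever its level.
        if re.match(r"#{1,6}\s", stripped):
            headings.append(stripped)
    # Exactly the two allowed headings, in this order, and nothing else.
    if headings != ALLOWED_HEADINGS:
        return False, f"unexpected headings: {headings}"
    # Anything ahead of the first heading counts as stray commentary.
    first_line = text.strip().splitlines()[0].strip()
    if first_line != "## Strategy":
        return False, "content before ## Strategy"
    return True, "ok"
```

A failed gate maps to the REJECT path; the reason string gives the agent something concrete to move into /state.md.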
- What triggered this update?
- What evidence supports it?
- Does it move toward the objective?
- What's the next logical step?
Respond with:
## Assessment
[1-2 sentences on the update]
## Concerns
[Any flags or issues, or "None"]
## Skill Feedback (Intent + new strategy only)
[Suggested adjustments to /.cursor/skills/strategy/SKILL.md, or omit section]
## Next Action
[Single most informative next step]
name: strategy
description: Guides creation and update of strategy-tactics documents for problem-solving. Use when starting a challenge, pivoting approach, recording a new strategy, or updating /strategy-tactics.md.
Use this skill to create or update /strategy-tactics.md.
| Term | What it is | What it's NOT |
|---|---|---|
| Strategy | High-level problem-solving intent/approach | Techniques or steps |
| Tactics | Techniques | Steps |
Strategy (intent/approach):
- "Avoid recomputation by exploiting overlapping subproblems."
- "Constrain the search space so only feasible candidates are explored."
Tactics (techniques):
- "Dynamic programming with memoization."
- "Pruning invalid branches early."
When creating a strategy-tactics document, strongly consider including:
- Keep a short and long log of every run
- Claim conclusions from code and logs, not assumptions
- Checkpoint before risky changes, tag for rollback
- Summarize after each run (inputs, hypothesis, result)
- Structure logs for evaluator consumption (compact, diffable)
- Automate setup tasks; don't repeat OS calls
- Measure churn and progress
Requirement: Either (a) include these default tactics in ## Tactics, or (b) explicitly justify each omitted default tactic with a short, evaluator-friendly reason (e.g., “not applicable because…”, “deferred until…”, “replaced by…”). Do not silently omit them.
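The include-or-justify requirement can be spot-checked with a simple text match. This sketch assumes each default tactic is quoted verbatim in ## Tactics, either as an included bullet or inside a justification line such as "Omitted: ... (deferred until ...)". That convention and the name `silently_omitted` are illustrative assumptions, not something the skill itself defines.

```python
DEFAULT_TACTICS = [
    "Keep a short and long log of every run",
    "Claim conclusions from code and logs, not assumptions",
    "Checkpoint before risky changes, tag for rollback",
    "Summarize after each run",
    "Structure logs for evaluator consumption",
    "Automate setup tasks; don't repeat OS calls",
    "Measure churn and progress",
]


def silently_omitted(tactics_section: str) -> list[str]:
    """Return the default tactics that are never mentioned in the section."""
    haystack = tactics_section.lower()
    # A tactic that appears anywhere is either included or explicitly
    # justified; one that never appears has been dropped silently.
    return [t for t in DEFAULT_TACTICS if t.lower() not in haystack]
```

An empty result means every default is either present or accounted for; whether a given justification is convincing remains the evaluator's call.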
Write to /strategy-tactics.md with:
## Strategy
[Current high-level approach]
## Tactics
[Techniques being applied]
- Max 20 seconds per run
- If timeout: STOP and reconsider strategy, don't increase timeout
- Keep /state.md updated with:
  - Questions
  - Decisions
  - Assumptions
  - Hypotheses
  - Next steps
- Next objectives
- Read and understand the problem in /challenge.txt, /state.md and /strategy-tactics.md if available
- Record and follow a strategy and tactics to solve the problem
- Balance exploration and exploitation
- Use the Strategy skill to record and follow a strategy and tactics to solve the problem
- Spawn Evaluator at checkpoints and attend to its feedback
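The 20-second limit per run in the prompt above can be enforced by the harness rather than left to the agent's discipline. A minimal sketch using the standard library's `subprocess.run` with its `timeout` argument; the wrapper name `run_with_limit` and the returned message are hypothetical:

```python
import subprocess
import sys

MAX_RUN_SECONDS = 20  # the per-run limit stated in the prompt


def run_with_limit(cmd: list[str], limit: float = MAX_RUN_SECONDS) -> tuple[bool, str]:
    """Run cmd and return (finished, output); finished is False on timeout."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=limit)
    except subprocess.TimeoutExpired:
        # Do not retry with a larger limit: a timeout is a signal to stop
        # and reconsider the strategy.
        return False, f"timed out after {limit}s: stop and reconsider the strategy"
    return True, proc.stdout


if __name__ == "__main__":
    finished, output = run_with_limit([sys.executable, "-c", "print('ok')"])
    print(finished, output.strip())
```

Keeping the limit as a constant instead of a per-call knob makes it harder to raise the timeout quietly, which is the behavior the prompt is trying to prevent.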