Turning microscopic heuristic checks into reusable tactics for Agentic Context Engineering.
- Domain heuristics (e.g., ±0.4 % drift, missing `%`) act as precision sensors for the Generator → Reflector → Curator loop.
- The Reflector distills the failure into a lesson, the Curator stores a typed delta, and the Runtime Adapter replays it immediately.
- Result: the agent cannot repeat the mistake—mirroring the +8.6 % gains ACE reports on FiNER/XBRL benchmarks.
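
To make the idea of a "precision sensor" concrete, here is a minimal Python sketch of two such checks. The names `FailureSignal`, `check_drift`, and `check_percent_suffix` are hypothetical, and the 0.4 % figure is interpreted here as a relative tolerance; none of this is taken from ACE's actual code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FailureSignal:
    """Hypothetical record that a heuristic hands to the Reflector."""
    kind: str    # e.g. "drift" or "missing_percent"
    detail: str  # human-readable description of what went wrong


def check_drift(predicted: float, expected: float,
                tolerance: float = 0.004) -> Optional[FailureSignal]:
    """Flag answers whose relative error exceeds the tolerance (assumed 0.4 %)."""
    if expected == 0:
        drift = abs(predicted)  # fall back to absolute error when expected is zero
    else:
        drift = abs(predicted - expected) / abs(expected)
    if drift > tolerance:
        return FailureSignal("drift", f"relative drift {drift:.2%} exceeds {tolerance:.2%}")
    return None


def check_percent_suffix(answer: str) -> Optional[FailureSignal]:
    """Flag answers that are expected to be percentages but lack a trailing '%'."""
    if not answer.strip().endswith("%"):
        return FailureSignal("missing_percent", f"answer {answer!r} lacks a trailing '%'")
    return None
```

Each check returns `None` on success and a small, specific signal on failure, which is the kind of input the Reflector could plausibly turn into a lesson.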
sequenceDiagram
autonumber
participant G as Generator
participant H as Heuristic Check
participant R as Reflector
participant C as Curator
participant RA as Runtime Adapter
participant MC as Merge Coordinator
G->>H: Produce answer & metadata
H-->>G: Flag harmful move (e.g., 0.4% drift)
H->>R: Emit precise failure signal
R->>C: Distill lesson (rounding rule, format fix)
C->>C: Create typed delta (helpful/harmful counters)
C->>MC: Queue merge event
MC->>RA: Surface hot insight immediately
MC->>C: Persist playbook update
RA-->>G: Enriched context for next task
- Detect — The heuristic labels the trajectory as a harmful move.
- Quantify — ACE now has a high-quality failure signal (no noisy feedback).
- Track — Outcomes “track feedback quality,” so precision signals drive precise improvements.
flowchart LR
A[Heuristic Trigger] --> B[Reflector Insight]
B --> C[Curator Delta]
C --> D[Runtime Adapter Cache]
D --> E[Merge Coordinator Flush]
E --> F[Playbook Updated]
F -->|prevents recurrence| G[Future Generator Runs]
- Reflector Insight: “Round to whole percent and append `%`.”
- Curator Delta: Typed update with helpful/harmful counters.
- Runtime Adapter: Surfaces the fix in the immediate context before the database flush.
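
The three roles above can be sketched roughly as follows. This is an illustration under assumptions, not ACE's implementation: `Delta`, `RuntimeAdapter`, and their methods are hypothetical names, and a plain dict stands in for the playbook database.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Delta:
    """Hypothetical typed playbook update produced by the Curator."""
    delta_id: str
    lesson: str
    helpful: int = 0
    harmful: int = 0


@dataclass
class RuntimeAdapter:
    """Holds deltas that have not been flushed to the playbook store yet."""
    pending: List[Delta] = field(default_factory=list)

    def surface(self, delta: Delta) -> None:
        """Make a fresh delta visible to the next run right away."""
        self.pending.append(delta)

    def enrich(self, prompt: str) -> str:
        """Prepend pending lessons so the next Generator run sees them."""
        if not self.pending:
            return prompt
        lessons = "\n".join(f"- {d.lesson}" for d in self.pending)
        return f"Lessons from recent failures:\n{lessons}\n\n{prompt}"

    def flush(self, playbook: Dict[str, Delta]) -> None:
        """Persist pending deltas (the dict stands in for the database)."""
        for d in self.pending:
            playbook[d.delta_id] = d
        self.pending.clear()


adapter = RuntimeAdapter()
adapter.surface(Delta("fmt-001", "Round to whole percent and append %."))
prompt = adapter.enrich("What was the year-over-year revenue change?")
```

The point of the design is ordering: `enrich` works from `pending`, so the lesson reaches the Generator even if `flush` has not run yet.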
- Finance benchmarks demand exact formatting and rounding.
- Guardrails prevent regressions while preserving helpful history (no context collapse).
- Each delta adds density to the playbook, inching performance toward ACE’s reported uplift.
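
One way to read "preserving helpful history" is that deltas are merged incrementally instead of regenerating the whole playbook. The sketch below assumes that reading; `Entry` and `merge_delta` are hypothetical names, and the entry shape is invented for illustration.

```python
from typing import Dict, TypedDict


class Entry(TypedDict):
    lesson: str
    helpful: int
    harmful: int


def merge_delta(playbook: Dict[str, Entry], delta_id: str, lesson: str,
                helpful: int = 0, harmful: int = 0) -> None:
    """Merge one delta in place: add a new entry or bump counters on an existing one."""
    entry = playbook.get(delta_id)
    if entry is None:
        playbook[delta_id] = {"lesson": lesson, "helpful": helpful, "harmful": harmful}
    else:
        # The existing lesson text is preserved; only the counters move.
        entry["helpful"] += helpful
        entry["harmful"] += harmful


playbook: Dict[str, Entry] = {}
merge_delta(playbook, "fmt-001", "Round to whole percent and append %.")
merge_delta(playbook, "fmt-001", "Round to whole percent and append %.", helpful=1)
merge_delta(playbook, "num-002", "Keep reported values within the drift tolerance.")
```

Because a merge only adds entries or increments counters, earlier lessons are never overwritten by a later summary.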
- Implement the heuristic (numeric tolerance, format clamp, etc.).
- Let the benchmark harness capture `auto_corrections/format_corrections`.
- Inspect the JSON diff to confirm the guardrail fired.
- Commit the refreshed results so regressions surface in CI.
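
The steps above might look like this in miniature. Everything here is a sketch under assumptions: `clamp_percent`, `run_harness`, and `diff_counts` are hypothetical helpers, and the results layout is invented; only the `format_corrections` counter name comes from the text.

```python
import json
from typing import Dict, Tuple


def clamp_percent(raw: str) -> Tuple[str, bool]:
    """Round to a whole percent and append '%'; report whether the text changed."""
    value = float(raw.strip().rstrip("%"))
    # Note: Python's round() rounds exact halves to the nearest even integer.
    fixed = f"{round(value)}%"
    return fixed, fixed != raw


def run_harness(answers: Dict[str, str]) -> dict:
    """Hypothetical harness: apply the guardrail and count corrections."""
    results = {"answers": {}, "format_corrections": 0}
    for task_id, raw in answers.items():
        fixed, changed = clamp_percent(raw)
        results["answers"][task_id] = fixed
        results["format_corrections"] += int(changed)
    return results


def diff_counts(before: dict, after: dict, key: str = "format_corrections") -> int:
    """Return how much a correction counter moved between two result files."""
    return after.get(key, 0) - before.get(key, 0)


baseline = {"answers": {}, "format_corrections": 0}
current = run_harness({"q1": "12.4", "q2": "8%"})
print(json.dumps(current, indent=2))
print("guardrail fired:", diff_counts(baseline, current) > 0)
```

In a real setup, `baseline` and `current` would be loaded from the committed and freshly generated results files, so a change in the counter shows up in the diff that CI reviews.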
A small heuristic check isn’t just a patch—it’s a bootstrapped guardrail. ACE’s generator–reflector–curator loop turns that single precision signal into living context that keeps future runs safe without manual babysitting.