Date: 2026-02-25
Agents analyzed: 10 (all claude-haiku-4-5-20251001)
Tasks: timelock-refund, hash-lock, escrow-timeout, vault-clawback, batch-payments, multisig-2of3, oracle-payout, impossible-recurring, prediction-market, atomic-swap
Task outcomes: 9 of 10 agents completed their task (at least Tier 2). 1 agent (hash-lock) did not complete. Only 1 agent (atomic-swap) reached Tier 3. The remaining 8 completed agents reached Tier 2, with 3 of those explicitly blocked from Tier 3 by environmental or library constraints rather than language issues.
Tier 1 (SimplicityHL compilation) was achievable by all 10 agents, though compile attempt counts varied significantly: escrow-timeout compiled on the first try (1 attempt), while oracle-payout required 7 attempts. The median was 3 attempts.
Aggregate behavioral metrics: All behavioral metrics (compile_success_rate, doc_reliance_ratio, reads_by_category) are null/empty across all 10 agents. The metrics extraction pipeline produced no usable objective data, so this report relies entirely on self-reported feedback, cross-correlated across agents for reliability.
Overall DX assessment: SimplicityHL is learnable from the existing documentation and examples, but has significant friction points around (1) a sandbox environment issue that blocked 7/10 agents from full Tier 2/3 completion, (2) undocumented or poorly documented type system behaviors that Rust developers expect to "just work," and (3) a thin rust-simplicity integration layer that lacks end-to-end examples. The language's deviation from Rust conventions without clear documentation of those deviations is the primary source of developer confusion.
| Metric | Value |
|---|---|
| Tasks attempted | 10 |
| Tier 1 reached | 10/10 (100%) |
| Tier 2 reached | 9/10 (90%) |
| Tier 3 reached | 1/10 (10%) |
| Median compile attempts to Tier 1 | 3 |
| Agents blocked by cargo permissions | 7/10 (70%) |
| Mean self-reported feedback items per agent | 3.5 |
Description: Agents could not download crates from crates.io due to permission errors on /usr/local/cargo/registry/cache/. This completely blocked Tier 2/3 progress for most agents, forcing them to write dependency-free Rust scaffolding instead of using the simplicity-lang crate.
Agents affected:
- batch-payments: "Permission denied accessing /usr/local/cargo/registry/cache/"
- hash-lock: "failed to open .../hex-0.4.3.crate: Permission denied (os error 13)"
- timelock-refund: "Permission denied (os error 13) on /usr/local/cargo/registry/cache"
- impossible-recurring: "failed to open .../base64-0.21.7.crate - Permission denied"
- vault-clawback: "Permission denied when accessing cargo registry cache"
- escrow-timeout: "failed to open .../simplicity-lang-0.6.0.crate - Permission denied (os error 13)"
- multisig-2of3: Cargo dependency resolution issues
Impact: This is an environment issue, not a language issue, but it was the single largest blocker across the evaluation. Only timelock-refund found a workaround (CARGO_HOME=/tmp/.cargo). The other 6 agents abandoned the simplicity-lang dependency entirely.
Remediation:
- Fix sandbox permissions: ensure `/usr/local/cargo/registry/cache/` is writable, OR
- Pre-vendor dependencies: include `simplicity-lang` and its transitive deps in the sandbox, OR
- Document the `CARGO_HOME=/tmp/.cargo` workaround prominently
Description: Even agents that could theoretically use rust-simplicity found no documentation or examples showing the full pipeline: decode a compiled program, attach witness data, execute via BitMachine, and verify. This blocked Tier 3 completion for every agent except atomic-swap.
Agents affected:
- hash-lock (severity 4): "lacks higher-level convenience functions for common workflows"
- impossible-recurring (severity 4): "no public API for constructing Simplicity transaction environments or providing witness data"
- escrow-timeout (severity 3): "No clear example of how to use rust-simplicity to construct witness data from compiled programs"
- hash-lock (severity 2): "The distinction between ConstructNode, CommitNode, and RedeemNode is clear conceptually, but examples don't show the conversion pipeline"
Remediation:
- Add an end-to-end example to rust-simplicity showing: (1) decode base64 program to CommitNode, (2) attach witness Values to create RedeemNode, (3) execute on BitMachine, (4) verify output
- Consider adding a high-level convenience function like `execute_with_witness(program_b64, witness_map, env) -> Result<Value>`, as suggested by the hash-lock agent
Description: Jets like current_amount() and output_amount() return (Asset1, Amount1) where Asset1 = Either<(u1, u256), u256> and Amount1 = Either<(u1, u256), u64>. Agents expected simple u64 returns and had to source-dive into compiler code (src/types.rs) to understand the actual types.
Agents affected:
- impossible-recurring (severity 3): "Expected expression of type `u64`, found type `(Either<(u1, u256),u256>, Either<(u1, u256),u64>)`"; had to read compiler source
- batch-payments (severity 1): "docs mention output_amount() but don't clearly explain the Asset1/Amount1 Either type structure"
- batch-payments (severity 2): "Transaction output introspection is limited"
Evidence: impossible-recurring explicitly states: "Read compiler source code (src/types.rs) to understand that Amount1 = Either<(u1, u256), u64>". This is a clear documentation gap forcing source-diving.
Remediation:
- Add a dedicated documentation page for "Working with Transaction Introspection Types"
- Include a complete example contract showing how to extract explicit amounts from `output_amount()` returns with proper match expressions
- Add type signatures to the jets.md documentation for every jet that returns complex types
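For readers coming from Rust, the type structure described above can be modeled as a minimal sketch in plain Rust. This is an analogy, not SimplicityHL code: the `Either` enum, the `U256` byte array, the `bool` stand-in for `u1`, the `Confidential` alias, and the `explicit_amount` helper are all assumptions introduced for illustration. It shows why a jet result cannot be assigned to a `u64` directly and how a match recovers the explicit value.

```rust
// Rust model of the introspection types (illustration only, not SimplicityHL).
#[derive(Debug, Clone, PartialEq)]
enum Either<L, R> {
    Left(L),
    Right(R),
}

type U256 = [u8; 32];             // stand-in for SimplicityHL's u256
type Confidential = (bool, U256); // stand-in for (u1, u256)
type Asset1 = Either<Confidential, U256>;
type Amount1 = Either<Confidential, u64>;

/// Hypothetical helper: returns the amount only when it is explicit.
fn explicit_amount(pair: (Asset1, Amount1)) -> Option<u64> {
    // Destructure the whole tuple with one type annotation,
    // mirroring the pattern the report recommends.
    let (_asset, amount): (Asset1, Amount1) = pair;
    match amount {
        // Left carries the (u1, u256) form: no plain u64 is available.
        Either::Left(_commitment) => None,
        // Right carries the explicit u64 amount.
        Either::Right(value) => Some(value),
    }
}

fn main() {
    let explicit: (Asset1, Amount1) =
        (Either::Right([0u8; 32]), Either::Right(50_000));
    let blinded: (Asset1, Amount1) =
        (Either::Right([0u8; 32]), Either::Left((true, [1u8; 32])));
    println!("{:?}", explicit_amount(explicit));
    println!("{:?}", explicit_amount(blinded));
}
```

The same two-step shape (destructure the pair, then match on the `Either`) is what a documentation example for `output_amount()` would need to demonstrate in SimplicityHL syntax.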
Description: SimplicityHL does not support Rust-style .0/.1 tuple field access or inline type annotations in destructuring patterns. Agents familiar with Rust expected these to work.
Agents affected:
- impossible-recurring (severity 3): "neither '.1' indexing nor inline type annotations in patterns work"; tried `pair.1` and `let (asset: Asset1, amount: Amount1) = pair`
- batch-payments (severity 2): Tried using blocks within match arms, causing grammar errors
Remediation:
- Document tuple handling explicitly in the language reference with a "Differences from Rust" callout
- Show the correct pattern: `let (asset, amount): (Asset1, Amount1) = pair` inside match arms
- Consider adding `.0`/`.1` tuple access as a language feature if feasible
Description: Agents expected match to work on integer types (u1, u8) like Rust, but SimplicityHL only supports matching on bool, Option, and Either.
Agents affected:
- oracle-payout (severity 2): "Attempted to match on u1 literals like '0u1 =>' and '1u1 =>'" — exact error: "Expected 'something else', found '0'"
- prediction-market (severity 1): "Cannot directly match on u8 values using pattern matching"
- impossible-recurring (severity 2): Boolean negation with `!` not supported
Remediation:
- Add a "Supported Match Types" section to match_expression.md explicitly listing what can be matched (bool, Option, Either) and what cannot (integers)
- Provide idiomatic workaround patterns: casting `u1` to `bool` via `<u1>::into()` or using `jet::eq_8()` for `u8`
Error: Expected 'something else', found '0'
Context: oracle-payout trying match outcome { 0u1 => ... }
Problem: "Expected 'something else'" gives no indication that integer matching is unsupported or what to do instead.
Suggested improvement: Pattern matching on integer types is not supported. Convert to bool first using <u1>::into(value), or use jet::eq_N() for comparison.
Error: Expected ':', found ')'
Context: batch-payments trying Left(_) pattern
Problem: The error suggests a syntax issue when the real problem is that _ wildcards are unsupported.
Suggested improvement: Wildcard patterns '_' are not supported in SimplicityHL. Use a named binding instead, e.g., Left(value: Type).
Error: Witness expressions are not allowed outside the 'main' function
Context: vault-clawback accessing witness::HOT_KEY in a helper function
Problem: This error message is actually good and clear. No change needed — included here as a positive example.
Error: Expected expression of type 'u64', found type '(Either<(u1, u256),u256>, Either<(u1, u256),u64>)'
Context: impossible-recurring assigning jet::current_amount() to u64
Problem: The error is technically accurate but doesn't help the developer understand what Asset1/Amount1 are or how to extract the value they need.
Suggested improvement: Append a hint: Tip: current_amount() returns (Asset1, Amount1). Use match to extract explicit values. See: [link to introspection docs]
Error: Cannot parse: found '.' expected ...
Context: impossible-recurring trying pair.1
Problem: Doesn't indicate that tuple field access is unsupported.
Suggested improvement: Tuple field access (e.g., pair.1) is not supported. Use let (a, b): (TypeA, TypeB) = pair to destructure.
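The suggested improvements above amount to a lookup from "unsupported construct" to "actionable hint". The sketch below shows that idea in Rust. It is hypothetical and does not reflect how the simc compiler is structured: the `Unsupported` enum and the `hint` function are names invented here, and only the message texts come from the suggestions in this report.

```rust
/// Constructs that, per this report, currently produce generic parse errors.
/// Hypothetical enum, not part of any real compiler API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Unsupported {
    IntegerPattern,
    WildcardPattern,
    TupleFieldAccess,
}

/// Maps each unsupported construct to the hint proposed in this report.
fn hint(construct: Unsupported) -> &'static str {
    match construct {
        Unsupported::IntegerPattern => {
            "Pattern matching on integer types is not supported. \
             Convert to bool first using <u1>::into(value), \
             or use jet::eq_N() for comparison."
        }
        Unsupported::WildcardPattern => {
            "Wildcard patterns '_' are not supported in SimplicityHL. \
             Use a named binding instead, e.g., Left(value: Type)."
        }
        Unsupported::TupleFieldAccess => {
            "Tuple field access (e.g., pair.1) is not supported. \
             Use let (a, b): (TypeA, TypeB) = pair to destructure."
        }
    }
}

fn main() {
    let all = [
        Unsupported::IntegerPattern,
        Unsupported::WildcardPattern,
        Unsupported::TupleFieldAccess,
    ];
    for construct in all {
        println!("{:?}: {}", construct, hint(construct));
    }
}
```

Keeping the hints in one exhaustive match means adding a new unsupported construct forces a corresponding message to be written, which is one way to avoid falling back to a generic parse error.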
| Agent | Task | Compile Attempts | Key Error Category |
|---|---|---|---|
| escrow-timeout | escrow-timeout | 1 | None (first-try success) |
| hash-lock | hash-lock | 1 | None |
| timelock-refund | timelock-refund | 2 | Minor syntax |
| batch-payments | batch-payments | 3 | Match expression syntax |
| multisig-2of3 | multisig-2of3 | 3 | Cargo/dependency naming |
| prediction-market | prediction-market | 3 | Witness file format |
| atomic-swap | atomic-swap | 3 | Args file format |
| vault-clawback | vault-clawback | 3 | Witness scope restriction |
| impossible-recurring | impossible-recurring | 6 | Type system + tuple handling |
| oracle-payout | oracle-payout | 7 | Pattern matching + type casting |
Agents with the most compile failures (oracle-payout: 7, impossible-recurring: 6) hit the worst combination of undocumented type system behaviors: integer pattern matching, type casting syntax, tuple destructuring, and boolean operations. These are all areas where SimplicityHL diverges from Rust without clear documentation.
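The median of 3 compile attempts quoted in this report can be checked directly against the table. The sketch below is a small Rust program for that arithmetic; the `median` helper is written here for illustration, and the input is the Compile Attempts column copied from the table above.

```rust
/// Median of a list of counts; averages the two middle values
/// when the list has an even number of entries.
fn median(values: &[u32]) -> f64 {
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    let n = sorted.len();
    if n % 2 == 1 {
        sorted[n / 2] as f64
    } else {
        (sorted[n / 2 - 1] + sorted[n / 2]) as f64 / 2.0
    }
}

fn main() {
    // Compile Attempts column, in table order.
    let attempts = [1, 1, 2, 3, 3, 3, 3, 3, 6, 7];
    println!("median compile attempts = {}", median(&attempts));
}
```

With ten agents the median is the average of the fifth and sixth sorted values, both of which are 3.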
Impact: 2 agents (impossible-recurring, and batch-payments in two separate feedback items) struggled with Asset1/Amount1 types.
Evidence: impossible-recurring had to read compiler source code (src/types.rs) to understand the types. batch-payments couldn't find examples of output_amount() usage.
What's needed: A dedicated page "Working with Transaction Amounts and Assets" showing:
- Type definitions for Asset1, Amount1
- Complete match expression example extracting explicit amounts
- Handling confidential vs explicit values
Impact: 4 agents (hash-lock, impossible-recurring, escrow-timeout, batch-payments) could not figure out the CommitNode -> RedeemNode -> BitMachine pipeline.
Evidence: hash-lock searched node/commit.rs and node/redeem.rs source code. escrow-timeout found no examples. impossible-recurring reported no public API for witness construction.
What's needed: Step-by-step guide: decode base64 -> CommitNode -> attach witness -> RedeemNode -> execute on BitMachine. Minimum viable example with real code.
Impact: 3 agents (hash-lock, prediction-market, vault-clawback) were confused by witness handling.
Evidence:
- hash-lock: "Documentation mentioned witness(name) but examples use witness::NAME"
- prediction-market: "Initially unclear about witness file format for simc compiler"
- vault-clawback: "witness access restricted to main() function" (resolved but caused a compile failure)
What's needed:
- Standardize documentation to use `witness::NAME` syntax consistently
- Add a "Witness Data" reference page showing: (1) in-contract syntax, (2) .wit file format with examples of complex nested types, (3) the main()-only restriction with the workaround pattern of passing values as parameters
Impact: 2 agents (oracle-payout, atomic-swap) had to discover args file JSON format through trial and error.
Evidence:
- oracle-payout: "Had to discover that --args parameter expects JSON format, not SimplicityHL syntax"
- atomic-swap: "simc compiler requires explicit parameter values (via --args file) even when they are not directly used as witness data"
What's needed: Document the `--args` file JSON format in cli.md with a complete example. Show the relationship between `param::NAME` in contract code and `{"NAME": {"value": "...", "type": "..."}}` in the args file.
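To illustrate the shape described above, the following Rust sketch renders an args file from a list of parameters using only the standard library. It is a sketch under stated assumptions: the parameter name `TIMEOUT`, its value `1000`, and its type `u32` are hypothetical, the accepted value syntax has not been checked against simc, and the helper does no JSON escaping, so it is only suitable for simple identifiers and literals.

```rust
/// Renders one entry in the shape this report describes:
/// "NAME": {"value": "...", "type": "..."}
/// Assumes the inputs contain no characters that need JSON escaping.
fn args_entry(name: &str, value: &str, ty: &str) -> String {
    format!(
        "\"{}\": {{\"value\": \"{}\", \"type\": \"{}\"}}",
        name, value, ty
    )
}

/// Joins all entries into a single JSON object.
fn args_file(entries: &[(&str, &str, &str)]) -> String {
    let body: Vec<String> = entries
        .iter()
        .map(|&(name, value, ty)| args_entry(name, value, ty))
        .collect();
    format!("{{{}}}", body.join(", "))
}

fn main() {
    // Hypothetical parameter that a contract would read as param::TIMEOUT.
    let json = args_file(&[("TIMEOUT", "1000", "u32")]);
    println!("{}", json);
}
```

The key in the JSON object is the same identifier that follows `param::` in the contract, which is the relationship the documentation example would need to spell out.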
Impact: 3 agents (oracle-payout, prediction-market, impossible-recurring) tried matching on integers.
What's needed: Explicit list of matchable types in match_expression.md with workaround patterns for integers.
Current state: context.md mentions witness(name) syntax; examples use witness::NAME.
Agents confused: hash-lock
Suggested fix: Audit all documentation for witness(name) references and update to witness::NAME. Add a canonical example in the witness section of the language reference.
Current state: Type casting documentation exists but the syntax <InputType>::into(value) (not <OutputType>::into()) is counterintuitive for Rust developers.
Agents confused: oracle-payout — tried <bool>::into(outcome) instead of <u1>::into(outcome)
Suggested fix: Add a prominent callout: "Note: the type annotation specifies the input type, not the output type. <u1>::into(x) casts a u1 to its target type." Include a quick-reference casting table with common conversions.
Current state: Both types (Height and Distance) exist, but the documentation doesn't clearly explain when to use which.
Agents confused: timelock-refund, atomic-swap (both severity 1)
Suggested fix: Add a comparison table to the timelocks section:
| Type | Jet | Use Case |
|---|---|---|
| Height | `check_lock_height()` | Absolute block height (e.g., "after block 800000") |
| Distance | `check_lock_distance()` | Relative blocks since UTXO confirmation (e.g., "after 144 blocks") |
Current state: No ! operator; must use jet::complement_1 or a helper function.
Agents confused: impossible-recurring — tried assert!(!borrow)
Suggested fix: Add a "Boolean Operations" section to the operators reference showing that ! is not supported and the idiomatic alternative is jet::complement_1(value) or a not() helper function.
| Expected (from Rust experience) | Reality in SimplicityHL | Documented? | Agents Affected |
|---|---|---|---|
| `match x { 0 => ..., 1 => ... }` on integers | Only bool/Option/Either can be matched | No | oracle-payout, prediction-market, impossible-recurring |
| `_` wildcard in patterns | Not supported; must use named bindings with types | No | batch-payments |
| `!value` boolean negation | Not supported; use `jet::complement_1` | No | impossible-recurring |
| `tuple.0`, `tuple.1` field access | Not supported; must destructure with `let` | No | impossible-recurring |
| `witness::X` usable anywhere | Restricted to `main()` function only | Partially (error message is clear, but docs don't state this upfront) | vault-clawback |
| `jet::current_amount()` returns `u64` | Returns `(Asset1, Amount1)`, which are complex Either types | No (type aliases exist in type_alias.md but not linked from jets.md) | impossible-recurring, batch-payments |
| `--args` file uses SimplicityHL syntax | Uses JSON format | No | oracle-payout, atomic-swap |
| Blocks `{ ... }` in match arms | Match arms must be single expressions | Partially (examples show this but it's not stated explicitly) | batch-payments |
Key finding: 6 of 8 expectation gaps are completely undocumented. SimplicityHL borrows Rust-like syntax, which creates implicit expectations. Every deviation from Rust behavior needs explicit documentation.
Ranked by impact, considering cross-agent corroboration.
- Effort: Small
- Impact: High
- Evidence: 7/10 agents blocked. This single fix would have unblocked Tier 2/3 for most agents.
- Action: Make `/usr/local/cargo/registry/cache/` writable, or pre-vendor `simplicity-lang` and its dependencies.
- Effort: Medium
- Impact: High
- Evidence: 6/10 agents hit issues where SimplicityHL diverges from Rust (pattern matching, `!` operator, tuple access, wildcards, match arm blocks). Agents with the highest compile attempt counts (oracle-payout: 7, impossible-recurring: 6) were hit hardest by these undocumented differences.
- Action: Create a dedicated page listing every deviation from Rust syntax/semantics with the correct SimplicityHL alternative. This single page would address the majority of confusion points.
- Effort: Medium
- Impact: High
- Evidence: 4/10 agents could not figure out the CommitNode -> RedeemNode -> BitMachine pipeline. This was the primary Tier 3 blocker after the cargo permissions issue.
- Action: Add a complete, runnable example showing: load base64 program, construct witness, create RedeemNode, execute, verify.
- Effort: Small
- Impact: High
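- Evidence: 2/10 agents (impossible-recurring and batch-payments, 3 feedback items in total) struggled with Asset1/Amount1. One had to read compiler source code. Self-reported severity reached 3, matching the highest rating given to any language-level issue in this report.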
- Action: Add a "Transaction Introspection" page with Asset1/Amount1 type definitions, match expression examples, and explicit vs. confidential value handling.
- Effort: Medium
- Impact: High
- Evidence: Agents that hit unsupported features (integer matching, wildcards, `!` operator, tuple access) got generic parse errors that gave no guidance. oracle-payout (7 attempts) and impossible-recurring (6 attempts) show the cost.
- Action: Add specific error messages for: (a) integer patterns in match, (b) `_` wildcards, (c) `!` operator, (d) `.N` tuple access. Each should suggest the correct alternative.
- Effort: Small
- Impact: Medium
- Evidence: 3/10 agents confused by witness syntax, file format, or scope restrictions.
- Action: Create a "Witness Data" reference page covering: `witness::NAME` syntax (standardize across all docs), `.wit` file format with complex type examples, and the main()-only restriction with the parameter-passing workaround pattern.
- Effort: Small
- Impact: Medium
- Evidence: 2/10 agents had to discover args file format through trial and error. oracle-payout initially created a SimplicityHL-syntax file.
- Action: Add `--args` file format documentation to cli.md with a complete example showing the JSON structure and its relationship to `param::NAME` declarations.
- Effort: Small
- Impact: Medium
- Evidence: 3/10 agents tried integer matching; 1 tried wildcards; 1 tried blocks in arms.
- Action: Add a "Limitations" section to match_expression.md listing unsupported patterns and idiomatic workarounds (cast to bool, use `jet::eq_N`, named bindings with types, helper functions).
- Effort: Medium
- Impact: Medium
- Evidence: 2/10 agents noted lack of examples using `param::` syntax. atomic-swap: "No example showing a contract with parameterized values for production use."
- Action: Add at least one example contract that uses `param::` with an `.args` file, ideally a parameterized version of htlc.simf.
- Effort: Large
- Impact: Medium
- Evidence: 2/10 agents (both severity 4) called for convenience functions. hash-lock suggested `execute_with_witness(program_b64, witness_map, env)`.
- Action: Add a `simplicity::easy` or `simplicity::quickstart` module with high-level functions that abstract the ConstructNode -> CommitNode -> RedeemNode -> BitMachine pipeline for common use cases.
All 10 agents returned null/zero values for every behavioral metric (compile_invocations, compile_successes, compile_failures, compile_success_rate, reads_by_category, total_resource_reads, doc_reliance_ratio, input_tokens, output_tokens, model, cost_usd). The metrics extraction pipeline did not capture any data. Consequently:
- compile_success_rate comparisons are unavailable. Self-reported compile attempt counts were used as a proxy.
- doc_reliance_ratio cannot be calculated. The recommendation to flag agents below 0.5 cannot be executed. Source-diving behavior was inferred from self-reports (e.g., impossible-recurring reading `src/types.rs`).
- reads_by_category is empty for all agents, preventing objective identification of documentation vs. source code reading patterns.
Recommendation: Fix the behavioral metrics extraction pipeline before the next evaluation round. These metrics are essential for validating self-reported feedback and identifying silent struggles that agents don't report.
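Once the pipeline is fixed, the 0.5 threshold check mentioned above needs to distinguish "metric missing" from "metric low". A minimal Rust sketch of that distinction follows. The `Flag` enum and `classify` function are hypothetical names introduced here, and the sketch does not assume any particular definition of doc_reliance_ratio beyond it being a number that may be null.

```rust
/// Outcome of the threshold check for one agent (hypothetical type).
#[derive(Debug, PartialEq)]
enum Flag {
    Unavailable,    // metric was null, as for all agents in this round
    BelowThreshold, // ratio present and under 0.5
    Ok,             // ratio present and at least 0.5
}

/// Classifies a possibly missing doc_reliance_ratio against the 0.5 threshold.
fn classify(doc_reliance_ratio: Option<f64>) -> Flag {
    match doc_reliance_ratio {
        None => Flag::Unavailable,
        Some(ratio) if ratio < 0.5 => Flag::BelowThreshold,
        Some(_) => Flag::Ok,
    }
}

fn main() {
    // This round: every agent reported a null ratio.
    let this_round: Vec<Option<f64>> = vec![None; 10];
    let unavailable = this_round
        .iter()
        .filter(|ratio| classify(**ratio) == Flag::Unavailable)
        .count();
    println!("agents with no usable ratio: {}", unavailable);
}
```

Treating a null as its own outcome, rather than as 0.0, avoids flagging every agent as a low documentation user when the real problem is missing data.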