| Here's my complete analysis: | |
| Part 1: Why blocknumber - 1 Exists | |
| The BLS verification chain requires this: | |
| Gateway Contract (createTask) Contract (validateAttestationDirect) | |
| │ │ │ | |
| │ current_block = 10433577 │ │ | |
| │ task_created_block = 10433576 │ │ | |
| │ (-1 offset for Sepolia) │ │ | |
| │ │ │ | |
| │──── createTask(taskCreatedBlock │ │ | |
| │ = 10433576) ──────────────────▶│ │ | |
| │ │ require(10433576 < block.number) │ | |
| │ │ require(block.number - 10433576 │ | |
| │ │ <= bufferWindow) │ | |
| │ │ ✓ stores task with block 10433576 │ | |
| │ │ │ | |
| │ │ Client calls ───────▶│ | |
| │ │ validateAttestationDirect | |
| │ │ │ | |
| │ │ checkSignatures( │ | |
| │ │ message, │ | |
| │ │ quorumNumbers, │ | |
| │ │ 10433576, │ ← task.taskCreatedBlock | |
| │ │ nonSignerStakes) │ | |
| │ │ │ | |
| │ │ require(10433576 │ | |
| │ │ < block.number) │ ← STRICT LESS-THAN | |
| │ │ ✓ passes │ | |
| Without the offset: If the gateway used current_block (10433577) as taskCreatedBlock, and validateAttestationDirect mined in | |
| block 10433577, then checkSignatures would see 10433577 < 10433577 → false → BlockNumberTooRecent revert. | |
| The offset creates a time buffer. For Sepolia (12s blocks), -1 gives ~12s of safety. For Base (2s blocks), -5 gives ~10s. | |
| ★ Insight ───────────────────────────────────── | |
| This is architecturally necessary because: | |
| 1. The gateway must tell operators what block to use for BLS key lookups BEFORE createTask mines | |
| 2. The contract validates the block is in the past but doesn't modify it | |
| 3. Any subsequent verification (respondToTask or validateAttestationDirect) also needs referenceBlockNumber < block.number | |
| 4. The offset ensures this invariant holds even if transactions mine quickly | |
| The fundamental tension: the BLS aggregation block must be decided before any transaction mines, but verification requires | |
| it to be strictly in the past at verification time. | |
| ───────────────────────────────────────────────── | |
| Part 2: What's Actually Still Failing? | |
| After PR #435, the Sepolia JSON shows aggregation_response.task_created_block: 10433576 == task.taskCreatedBlock: 10433576. | |
| The blocks match. The signature_data appears to be present. | |
| If validateAttestationDirect is still reverting, there are only these remaining possibilities: | |
| 1. The deployment hasn't fully propagated — the gateway serving these requests might still be on old code | |
| 2. The NonSignerStakesAndSignature indices are stale — the AvsRegistryServiceChainCaller cache might serve wrong data | |
| 3. The BLS aggregated signature itself is invalid — the operator set at the aggregation block differs from what the contract | |
| sees | |
| 4. The respondToTask background task is what's failing (not validateAttestationDirect) — the "unexpected block mismatch" | |
| warning fires and update_response_indices can't fix the immutable BLS signature | |
| To diagnose which of these is the actual issue, I need to know: is the error from validateAttestationDirect (client-side) or | |
| from the background respondToTask? Both paths use BLS verification. The Datadog logs should show either: | |
| - "unexpected block mismatch detected - updating indices" → background respondToTask path | |
| - A client-side revert → validateAttestationDirect path | |
| Part 3: The Fundamental Fix | |
| You asked for a "much more fundamental fix." The root architectural problem is: | |
| The gateway estimates a block, uses it for aggregation, THEN submits it to the chain. If the chain assigns a different view | |
| of the block, or if the operator registry cache serves stale data, things break. | |
| Here's the fundamental fix — make the offchain block the single, uncontested source of truth throughout the entire pipeline, | |
| and add a debug assertion to catch any drift: | |
| ⏺ Now I have all the context. Let me provide the comprehensive answer and fundamental fix proposal. | |
| --- | |
| Part 1: Why blocknumber - 1 When Creating Tasks Offchain | |
| ★ Insight ───────────────────────────────────── | |
| The BLS verification constraint chain: | |
| EigenLayer's BLSSignatureChecker.checkSignatures() enforces: | |
| require(referenceBlockNumber < block.number); // STRICT less-than | |
| The referenceBlockNumber = task.taskCreatedBlock. This block is used to look up the operator set state (which BLS keys were | |
| registered, what their stakes were) at that specific point in time. | |
| Why the gateway must estimate this block BEFORE mining: | |
| 1. Gateway decides task_created_block = current_block - offset at the start of newt_createTask | |
| 2. Passes it to BLS aggregation (operators sign using this block's operator set) | |
| 3. Passes it to createTask on-chain (contract validates: > 0, < block.number, within buffer window) | |
| 4. Client later calls validateAttestationDirect which runs checkSignatures(msg, quorums, task.taskCreatedBlock, | |
| signatureData) — the < block.number check must hold | |
| Without the offset, if the gateway used current_block and validateAttestationDirect mined in that same block: N < N = false | |
| → BlockNumberTooRecent revert. | |
| Chain-specific offsets: | |
| - Sepolia/Ethereum (12s blocks): -1 = ~12s buffer | |
| - Base (2s blocks): -5 = ~10s buffer (the block cache can be 2-3 blocks stale on fast chains) | |
| The contract's taskCreationBufferWindow (default: 2 blocks prod, 10 local) validates the block isn't too old. | |
| ───────────────────────────────────────────────── | |
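The offset rule and the two on-chain checks can be sketched in a few lines. This is a minimal model, not the gateway or contract code: the function names and the chain keys are assumptions made for illustration, and only the offsets (-1 and -5), the strict less-than check, and the buffer-window check come from the analysis above.

```python
# Offsets quoted in the text; the chain names used as keys are assumptions.
BLOCK_OFFSETS = {"sepolia": 1, "base": 5}


def pick_task_created_block(current_block: int, chain: str) -> int:
    """Reference block the gateway would hand to operators and to createTask."""
    return current_block - BLOCK_OFFSETS[chain]


def create_task_accepts(task_created_block: int, mined_block: int, buffer_window: int) -> bool:
    """Model of the createTask checks: > 0, strictly in the past, within the window."""
    in_past = 0 < task_created_block < mined_block
    return in_past and mined_block - task_created_block <= buffer_window


def check_signatures_accepts(reference_block: int, mined_block: int) -> bool:
    """Model of the strict less-than check in checkSignatures."""
    return reference_block < mined_block


current = 10433577
ref = pick_task_created_block(current, "sepolia")

# With the offset, verification mined in the block the gateway observed passes.
print(ref, check_signatures_accepts(ref, current))
# Without the offset, the same-block case is N < N and fails.
print(current, check_signatures_accepts(current, current))
```

The second call is the BlockNumberTooRecent case: the reference block equals the block in which verification runs.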
| Part 2: Diagnosing What's Still Failing | |
| After PR #435, your Sepolia JSON shows blocks matching: aggregation_response.task_created_block: 10433576 == | |
| task.taskCreatedBlock: 10433576. The reference_block: 10433577 in the response is a separate informational field — it's just | |
| the current block at response time, used only for computing the expiration field. It does NOT affect BLS verification. | |
| For Base Sepolia: aggregation_response.task_created_block: 10433707 (Sepolia source block) vs task.taskCreatedBlock: | |
| 38781018 (Base block) — this is expected multichain behavior. Base uses BN254CertificateVerifier, not checkSignatures. | |
| If validateAttestationDirect is still reverting with matching blocks, the possible remaining causes are: | |
| 1. Stale operator registry cache — the AvsRegistryServiceChainCaller cache uses entry.block_num >= block_num (serves | |
| newer-block data when asked for a historical block). If an operator registered AFTER task_created_block, their key would be | |
| included in the cached state but absent from the contract's view at that block → InvalidQuorumApkHash. | |
| 2. The respondToTask background path is what's failing (not validateAttestationDirect). If the "unexpected block mismatch | |
| detected - updating indices" warning fires in DD logs, that means onchain_task.taskCreatedBlock differs from | |
| service_response.task_created_block somehow. But since the contract preserves the block we send (TaskLib.createTask returns | |
| the task unmodified), this should not happen. | |
| 3. Deployment hasn't fully propagated — the gateway instance might still be running pre-PR #435 code. | |
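To make cause 1 concrete, here is a small sketch of how a lookup that accepts `entry.block_num >= block_num` can return an operator set newer than the one the contract sees. The cache shape and function names are hypothetical and are not the real AvsRegistryServiceChainCaller types; only the `>=` comparison is taken from the description above.

```python
from typing import List, Optional, Tuple, FrozenSet

# Hypothetical cache entry: (block_num, operators registered as of that block).
Entry = Tuple[int, FrozenSet[str]]


def lookup_at_or_after(cache: List[Entry], block_num: int) -> Optional[FrozenSet[str]]:
    """Lookup as described in the text: any entry with entry.block_num >= block_num hits."""
    for entry_block, operators in cache:
        if entry_block >= block_num:
            return operators
    return None


def lookup_exact(cache: List[Entry], block_num: int) -> Optional[FrozenSet[str]]:
    """Stricter alternative: only an entry for exactly the requested block hits."""
    for entry_block, operators in cache:
        if entry_block == block_num:
            return operators
    return None


# op2 registered after the task's reference block (10433576).
cache = [(10433580, frozenset({"op1", "op2"}))]

stale = lookup_at_or_after(cache, 10433576)
strict = lookup_exact(cache, 10433576)
print(sorted(stale), strict)
```

With the `>=` lookup the historical query is answered with a set that already contains `op2`; the exact lookup misses instead, which would force a fresh read at the requested block.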
| To distinguish these, I need one piece of data: What error are you seeing? Is it: | |
| - (a) validateAttestationDirect reverts when the SDK/client calls it with the returned task + task_response + | |
| signature_data? | |
| - (b) The background respondToTask fails (check DD logs for "unexpected block mismatch")? | |
| - (c) signature_data: null in the response? | |
| - (d) Something else in the JSON that looks wrong? | |
| Part 3: The Fundamental Fix | |
| The root issue is architectural: five independent variables (task_created_block_u64, aggregator_block, source_chain_block, | |
| current_block_u64, and the block used for indices) must all agree, but they're computed separately with no enforcement. |
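One way to add the missing enforcement is a single assertion that runs before anything is submitted. The sketch below is illustrative only: `BlockContext` and `assert_blocks_agree` are hypothetical names, and it assumes that on a single chain the task block, the aggregation block, and the block used for indices are the three values that must be equal (the current block is informational and the source-chain block differs by design in the multichain case).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockContext:
    """Hypothetical container; field names echo the variables listed above."""
    task_created_block: int
    aggregator_block: int
    indices_block: int


def assert_blocks_agree(ctx: BlockContext) -> None:
    """Raise if any stage used a different reference block than the task."""
    candidates = (
        ("aggregator_block", ctx.aggregator_block),
        ("indices_block", ctx.indices_block),
    )
    mismatched = {name: value for name, value in candidates if value != ctx.task_created_block}
    if mismatched:
        raise ValueError(
            f"reference block drift: expected {ctx.task_created_block}, got {mismatched}"
        )


# Passes silently when every stage used the same block.
assert_blocks_agree(BlockContext(10433576, 10433576, 10433576))

# Drift in any stage is reported with the offending field name.
try:
    assert_blocks_agree(BlockContext(10433576, 10433577, 10433576))
except ValueError as err:
    print(err)
```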
This is entirely a client-side error — there's no protocol-side code fix needed. Here's why:
- The deployed policy client (0x3212ef47) has the correct current ABI
- The gateway produces correct calldata (and PR #439 now exposes it as validate_calldata)
- The TaskManager, BLS aggregation, and attestation logic all work correctly
The only issue is the coworker's calling code was compiled against an old ABI. The protocol-side mitigation already exists:
PR #439's validate_calldata field — clients can use the gateway's pre-built calldata instead of constructing it themselves,
eliminating ABI version skew entirely.
Here's the Slack message:
Investigated the validateAttestationDirect revert on Sepolia (0x3212ef47). The eth_call / eth_estimateGas both fail with an
empty 0x revert and no error message.
Ran cast call --trace against the contract — only 624 gas consumed, zero subcalls. That means the call never reaches any
business logic (no require checks, no BLS verification, no TaskManager delegation). It reverts at the EVM function dispatch
level.
Checked the contract bytecode: the selector in your calldata (0x920d8da9) does not exist in the deployed contract. The
current validateAttestationDirect selector is 0xee762560. This changed when uint256 initializationTimestamp was added to
both Task and TaskResponse structs (commit 303656f6): changing the fields of a struct used as a function parameter changes
the canonical signature, which produces a completely different 4-byte selector.
Confirmed by:
- cast code 0x3212ef47 | grep 920d8da9 → not found
- cast code 0x3212ef47 | grep ee762560 → found
- forge inspect INewtonProverTaskManager methodIdentifiers → validateAttestationDirect: ee762560
- On-chain storage reads confirm the contract is properly configured (correct TaskManager, policyId, policy address)
Fix: recompile your client against the latest INewtonProverTaskManager ABI (the one with initializationTimestamp in both
Task and TaskResponse). Alternatively, if you're using the gateway's newt_createTask response, we now have an opt-in
validate_calldata field (PR #439) that returns pre-built ABI-encoded calldata — avoids this class of issue entirely.
Confirmed: ee762560 is the current selector. The investigation is complete.
★ Insight ─────────────────────────────────────
When Solidity structs used as function parameters change (even adding a single field), the keccak256 of the canonical
function signature changes, producing a completely different 4-byte selector. This is why adding uint256
initializationTimestamp to Task and TaskResponse silently broke all clients compiled against the old ABI — the function
"disappeared" from the contract's perspective.
─────────────────────────────────────────────────
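A short sketch of why one added field is enough. In the Solidity ABI a struct parameter appears in the canonical signature as a tuple of its field types, so appending `uint256` changes the string that gets hashed. The field lists and the single-parameter shape below are hypothetical, chosen only for illustration; the real Task and TaskResponse structs and the real validateAttestationDirect parameters are different.

```python
from typing import List


def struct_type(field_types: List[str]) -> str:
    """A struct parameter is written as a tuple of its field types."""
    return "(" + ",".join(field_types) + ")"


def canonical_signature(name: str, param_types: List[str]) -> str:
    """Build the string whose keccak256 hash yields the 4-byte selector."""
    return name + "(" + ",".join(param_types) + ")"


# Hypothetical field lists, not the real struct definitions.
old_task = struct_type(["bytes32", "uint32"])
new_task = struct_type(["bytes32", "uint32", "uint256"])  # + initializationTimestamp

old_sig = canonical_signature("validateAttestationDirect", [old_task])
new_sig = canonical_signature("validateAttestationDirect", [new_task])

print(old_sig)
print(new_sig)
print(old_sig == new_sig)
```

The hashing step is left out on purpose: keccak256 is not in the Python standard library, and `hashlib.sha3_256` uses different padding, so it would give the wrong selector. Tools such as `forge inspect ... methodIdentifiers`, used in the investigation above, report the real values.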
Investigation Summary: NEWT-676/677 Sepolia Revert
Contract: 0x3212ef47BCDb2C88dA4065352e60ce7C8725fC0F (user-deployed policy client, NOT a Newton deployment)
Root Cause: The coworker's client is calling validateAttestationDirect with selector 0x920d8da9, which corresponds to an
older ABI version of the Task/TaskResponse structs. The deployed policy client was compiled with the current ABI (selector
0xee762560, which includes uint256 initializationTimestamp added in commit 303656f6). Since no function matches 0x920d8da9,
the EVM reverts immediately at the function dispatch level with zero error data — explaining the empty 0x revert, 624 gas
usage, and no subcalls.
What changed: Commit 303656f6 added uint256 initializationTimestamp to both Task and TaskResponse structs in
INewtonProverTaskManager.sol. This changes the canonical type signature of every function that accepts these structs,
producing new selectors.
Fix: The coworker must recompile their calling code (SDK, script, or frontend) against the latest INewtonProverTaskManager
ABI. The correct selector for validateAttestationDirect is 0xee762560.
Not the issue: BLS verification, block number mismatches, quorum, or on-chain state — the call never reaches any of those
checks because it reverts at the selector dispatch level.
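A client-side preflight check along these lines would have caught the mismatch before sending the call. It is a sketch, not part of any SDK: the function names are hypothetical, and the only facts taken from the investigation are the two selectors.

```python
CURRENT_SELECTOR = "ee762560"  # validateAttestationDirect, per the investigation above
STALE_SELECTOR = "920d8da9"    # selector produced by the older ABI


def selector_of(calldata_hex: str) -> str:
    """Return the first 4 bytes of calldata as lowercase hex, without the 0x prefix."""
    data = calldata_hex.lower()
    if data.startswith("0x"):
        data = data[2:]
    if len(data) < 8:
        raise ValueError("calldata shorter than a 4-byte selector")
    return data[:8]


def classify_calldata(calldata_hex: str) -> str:
    """Label calldata by comparing its selector with the known values."""
    sel = selector_of(calldata_hex)
    if sel == CURRENT_SELECTOR:
        return "ok"
    if sel == STALE_SELECTOR:
        return "stale-abi"
    return "unknown-selector"


# Calldata built against the old ABI is flagged without touching the chain.
print(classify_calldata("0x920d8da9" + "00" * 32))
print(classify_calldata("0xee762560" + "00" * 32))
```

Using the gateway's validate_calldata field avoids the need for such a check, since the client no longer encodes the call itself.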