Why pruned nodes can skip witness downloads for assume-valid blocks
Two years ago I asked on Bitcoin Stack Exchange why (segregated) witness data is downloaded for assume-valid blocks in pruned mode. We don't validate these witnesses, and we delete them shortly afterwards. Pieter Wuille explained that skipping witness downloads would require extending the set of consensus checks delegated to assume-valid. But that extension turns out to be relatively straightforward, and because SegWit witness data now makes up a significant share of block bytes, omitting it can cut bandwidth by ~34% over the full chain (and 48-54% in recent years).
Updated: 2025-10-02 (snapshot at height 917,000).
We measured the share of witness bytes (`size` - `strippedsize`) across the full chain and recent windows. At this snapshot the full chain is ~690 GB, of which ~232 GB are witnesses: more than a third of all bytes (33.7%).
| Range | Witness Share |
|---|---|
| Last 100k blocks | 54.17% |
| Last 200k blocks | 53.22% |
| Last 300k blocks | 48.06% |
| Full chain (height 917,000) | 33.70% |
More interestingly, over the last 4 years witnesses account for more than half of all bytes, driven mainly by P2TR adoption and inscription traffic since early 2023. If this trend persists, the cumulative witness share will be ~40% (400 GB) when the chain reaches 1 TB: the remaining ~310 GB at a ~54% marginal witness share adds roughly 167 GB of witnesses, on top of the ~232 GB we already have.
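For reproducibility, here is a rough sketch of how such numbers can be gathered, assuming a local bitcoind with RPC access. It shells out to `bitcoin-cli` and samples blocks instead of scanning every one, so treat its output as an estimate:

```python
#!/usr/bin/env python3
"""Estimate the witness share of block bytes over a height range.

Sketch only: witness bytes per block = size - strippedsize, both
returned by `getblock` at its default verbosity (1).
"""
import json
import subprocess

def cli(*args: str) -> str:
    """Run a bitcoin-cli command and return its stdout as a string."""
    return subprocess.check_output(["bitcoin-cli", *args]).decode().strip()

def witness_share(start: int, end: int, step: int = 1000) -> float:
    """Fraction of sampled block bytes that are witness data."""
    total_bytes = 0
    witness_bytes = 0
    for height in range(start, end + 1, step):
        block_hash = cli("getblockhash", str(height))
        block = json.loads(cli("getblock", block_hash))
        total_bytes += block["size"]
        witness_bytes += block["size"] - block["strippedsize"]
    return witness_bytes / total_bytes

if __name__ == "__main__":
    # e.g. the last 100k blocks at the snapshot height used above
    print(f"{witness_share(817_000, 917_000):.2%}")
```

Sampling introduces some error relative to the exact per-block sums in the table, but it is enough to confirm the trend.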
If I'm not missing anything, the list of checks that assume-valid should now cover is:
- Witness coinbase commitment (`wtxid`)
- Witness size and sigops limits, including block weight
- No witness data allowed in legacy inputs
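To make the first item concrete, here is a minimal sketch of how the BIP 141 witness commitment is computed (the helper names are mine, not Bitcoin Core's; `wtxids` are assumed to be raw 32-byte hashes in internal byte order):

```python
import hashlib

def sha256d(data: bytes) -> bytes:
    """Bitcoin's double-SHA256."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(hashes: list[bytes]) -> bytes:
    """Bitcoin-style merkle root: odd levels duplicate the last hash."""
    hashes = list(hashes)
    while len(hashes) > 1:
        if len(hashes) % 2:
            hashes.append(hashes[-1])
        hashes = [sha256d(hashes[i] + hashes[i + 1])
                  for i in range(0, len(hashes), 2)]
    return hashes[0]

def witness_commitment(wtxids: list[bytes], reserved_value: bytes) -> bytes:
    """BIP 141: the coinbase's wtxid is defined as 32 zero bytes; the
    commitment is SHA256d(witness merkle root || witness reserved value)
    and is stored in an OP_RETURN output of the coinbase."""
    root = merkle_root([b"\x00" * 32] + wtxids)
    return sha256d(root + reserved_value)
```

Note that the inputs are exactly what a witness-skipping node never downloads: wtxids can only be computed from transactions with their witnesses attached, and the reserved value lives in the coinbase input's witness.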
This doesn't seem like a big deal compared to the full script validation we already outsource to assume-valid. If assume-valid covered these checks as well, then we could directly omit witness downloads for blocks within the assumed range.
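Mechanically, omitting the download is cheap at the P2P layer: since BIP 144, a node chooses between the stripped and the witness-carrying serialization of a block by whether it sets the witness flag in its `getdata` inventory type. A sketch of what the download policy could look like (function and parameter names are hypothetical):

```python
# BIP 144 inventory types: OR-ing MSG_WITNESS_FLAG into a getdata
# request asks for the block *with* witnesses; the plain MSG_BLOCK
# type yields the stripped (witnessless) serialization.
MSG_BLOCK = 2
MSG_WITNESS_FLAG = 1 << 30
MSG_WITNESS_BLOCK = MSG_BLOCK | MSG_WITNESS_FLAG  # 0x40000002

def inv_type_for(height: int, assumevalid_height: int) -> int:
    """Hypothetical policy: request stripped blocks inside the
    assume-valid range, full witness blocks beyond it."""
    if height <= assumevalid_height:
        return MSG_BLOCK
    return MSG_WITNESS_BLOCK
```

Requesting MSG_BLOCK simply yields the pre-BIP 144 serialization that every peer can already serve, so no new network messages are needed.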
An important thing to note is that all current consensus checks automatically pass even if we don't have the witnesses, because SegWit was a backwards-compatible soft fork that only tightened the rules. We don't even need to disable the `wtxid` commitment check, as it is optional when a block doesn't contain witnesses.
This idea led to a Bitcoin Core PR, which gathered several conceptual ACKs but also some concerns about a reduction in security and a loss of data availability. Below I will try to reason carefully about this change and address those concerns.
First of all, let's look at the bad scenarios that can, in theory, happen in the current assume-valid mode and in the witness-skipping one.
The bad scenario in the current assume-valid mode (call it scenario A) is that developers, reviewers and maintainers lie to us and make us sync to an invalid chain. Even then, a majority of hashpower would also need to be mining on that invalid chain.
A new bad scenario that can happen if we skip witness downloads for assume-valid blocks (scenario B) is that we sync to a chain where some witnesses are not available, so that no other kind of node can sync except ours. There are two sub-cases:
- B.1: developers, reviewers and maintainers lied to us, because the witnesses were not available when they validated the block. This likely also requires a majority of hashpower mining blocks that lack witnesses and building on top of them.
- B.2: the lack of availability appeared after the assume-valid hash was audited as valid (they didn't lie at the time).
In Bitcoin Core, it is accepted that assume-valid does not change the trust assumptions of the software. In other words, users of Bitcoin Core can, without added risk, assume that scenario A won't happen. This is based on the fact that scripts are widely available for independent audit, and developers are already trusted not to do bad things that reviewers and external auditors would fail to detect.
Now, it is easy to see that attesting to the validity of all scripts implies that all scripts were available to the attester (how else could their validity be known?). In fact, a missing witness makes the script evaluation fail, and script evaluation is exactly what assume-valid already covers.
Hence, scenario B.1 is not a new concern, as it is an implied premise of assume-valid. In other words, we already trust developers, reviewers and maintainers to have checked witness data availability after each block was mined.
Thus, the only new concern is scenario B.2, that is, the case where witness data availability is lost after the block was mined and validated by auditors. Losing data availability means we no longer have any reachable archive node (which by definition retains all witness data). But in that case we won't be able to sync anyway, as we still require all legacy block data and all witnesses after the assume-valid range.
The new behavior only manifests when no archive nodes are available, but there are peers serving all legacy block data and all witnesses beyond the assume-valid block. In this scenario, our node will fully sync with the Bitcoin network, whereas other node types will stall during IBD. This is the only behavior that would change, and only under these circumstances.
One could argue that in this situation our node shouldn't be able to sync, just like the rest of the nodes. But the outcome of syncing is simply reaching the same state as the whole Bitcoin network, which in this scenario consists entirely of pruned nodes and "archive" nodes that are either missing witnesses or pre-SegWit.
The synced pruned nodes will remain fully operational, unless they're told to reindex or switch to archive mode. These pruned nodes only checked data availability once, during their IBD, and they can run indefinitely without re-downloading old data. If one-time availability checks are acceptable for regular pruned nodes, they're equally acceptable for a witness-skipping pruned node: both rely on the fact that those witnesses were available at an earlier point, verified locally or via the open-source process.
Then, if we can skip the script evaluations because they are attested by developers, reviewers and maintainers, we can also skip the one-time availability check for witnesses, as it is implicitly verified by the same group of people.
If you think we must re-check witness availability during IBD because assume-valid only guarantees availability at a past moment, then by that logic pruned nodes would also have to re-download the entire chain periodically. But since regular pruned nodes don't re-check data availability, there's no reason our specific kind of pruned node should either.
And anyway, if this catastrophic availability failure really happens, the root of the problem is the lack of archive nodes, not the proliferation of pruned ones. Moreover, periodically asking for the witness data would only help detect the problem, which would be discovered anyway as soon as anyone tried to sync any other kind of full node.
Witnessless Sync doesn't reduce security any more than a long-running pruned node. By extending the assume-valid shortcut to cover witness commitments, size and sigops limits, and the prohibition of witness data in legacy inputs, we are outsourcing checks no more critical than the full script validation we already trust. Just as pruned nodes perform a one-time data-availability check at IBD, sufficient for the node to run indefinitely, we rely on the fact that this one-time check was already made when each block's scripts were validated by the assume-valid auditors.
In every practical scenario, then, skipping witness retrieval for assume-valid blocks cuts bandwidth without making our node or the network more vulnerable to data availability problems than it already is.
Thanks for the answer @RubenSomsen!
I don't think this is true for assume-valid. With assume-valid you trust that the scripts are valid, and that's the whole point of this post: downloading the witnesses in this case means we are not checking availability once, but twice. assume-valid already covers validity and past availability.