Reth data storage

Comprehensive Reth Storage Architecture Report

   This report describes Reth's storage architecture, based on an exploration of the Reth codebase:

   1. Database Technology: MDBX + Static Files (Hybrid Model)

   Reth implements a hybrid storage approach combining two technologies:

   MDBX (libmdbx, a memory-mapped key-value store derived from LMDB)

   - Location: crates/storage/libmdbx-rs/
   - Technology: MDBX is a high-performance embedded key-value database that uses memory-mapped I/O
   - Wrapper: Reth uses reth_libmdbx crate to provide Rust bindings
   - Configuration:
     - Supports both read-only (RO) and read-write (RW) modes
     - Uses DatabaseEnv as the main environment struct
     - Transactions (Tx<K>) provide ACID guarantees
     - All writes are transactional - changes cannot occur outside transactions

   Static Files (Immutable History)

   - Format: NippyJar (columnar, compressed format)
   - Location: crates/storage/nippy-jar/ and crates/static-file/
   - Purpose: Store immutable chain history data efficiently
   - Compression: Supports LZ4 and Zstd compression
   - Organization: Data organized into segments with a configurable number of blocks per file (the file-name example below uses 128-block files; Reth's production default is 500,000 blocks per file)

   2. Complete Table Schema

   The database defines 27 main tables (from crates/storage/db-api/src/tables/mod.rs):

   Block Data Tables

   - CanonicalHeaders: BlockNumber → HeaderHash (canonical chain mapping)
   - Headers: BlockNumber → Header (full block headers)
   - HeaderNumbers: BlockHash → BlockNumber (reverse lookup)
   - HeaderTerminalDifficulties: BlockNumber → CompactU256 (deprecated, for historical blocks)
   - BlockBodyIndices: BlockNumber → StoredBlockBodyIndices (transaction ranges per block)
   - BlockOmmers: BlockNumber → StoredBlockOmmers (uncle/ommer blocks)
   - BlockWithdrawals: BlockNumber → StoredBlockWithdrawals (EIP-4895 withdrawals)

   Transaction Tables

   - Transactions: TxNumber → TransactionSigned (full transaction bodies)
   - TransactionHashNumbers: TxHash → TxNumber (reverse hash lookup)
   - TransactionBlocks: TxNumber → BlockNumber (keyed by the highest transaction number in each block)
   - TransactionSenders: TxNumber → Address (recovered signer, for execution optimization)
   - Receipts: TxNumber → Receipt (transaction receipts)

   State Tables (Current)

   - PlainAccountState: Address → Account (current account state)
   - PlainStorageState: Address → {SubKey: B256} → StorageEntry (current storage values, DupSort table)
   - Bytecodes: B256 → Bytecode (smart contract bytecodes)

   State Tables (Hashed - Merkle)

   - HashedAccounts: B256 (keccak256 address) → Account (for state root calculation)
   - HashedStorages: B256 (hashed address) → {SubKey: B256 (hashed key)} → StorageEntry (DupSort)

   Trie Tables

   - AccountsTrie: StoredNibbles → BranchNodeCompact (accounts trie nodes)
   - StoragesTrie: B256 (hashed address) → {SubKey: StoredNibblesSubKey} → StorageTrieEntry (DupSort)

   Change/History Tables (Time-travel queries)

   - AccountsHistory: ShardedKey<Address> → BlockNumberList (which blocks touched account)
   - StoragesHistory: StorageShardedKey → BlockNumberList (which blocks touched storage key)
   - AccountChangeSets: BlockNumber → {SubKey: Address} → AccountBeforeTx (DupSort, pre-execution state)
   - StorageChangeSets: BlockNumberAddress → {SubKey: B256} → StorageEntry (DupSort, pre-execution state)
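The "pre-execution state" entries are what make time-travel reads possible: the value of an account as of block N is the pre-state recorded by the first change set after N, or the current plain state if no later change exists. The following is a minimal sketch of that idea using only the standard library. `State`, `Account`, and `account_at` are hypothetical names for illustration, not Reth's API, and the linear scan stands in for the history index that Reth uses to jump straight to the right block.

```rust
use std::collections::BTreeMap;

type Address = [u8; 20];

#[derive(Clone, Copy, Debug, PartialEq)]
struct Account {
    nonce: u64,
    balance: u128,
}

/// Toy model of two tables (hypothetical, not Reth's types).
struct State {
    /// Models PlainAccountState: the latest value per address.
    plain: BTreeMap<Address, Account>,
    /// Models AccountChangeSets: (block, address) -> value before that block ran.
    /// `None` means the account did not exist before the block.
    change_sets: BTreeMap<(u64, Address), Option<Account>>,
}

impl State {
    /// Account as it was after executing `block`.
    fn account_at(&self, address: Address, block: u64) -> Option<Account> {
        let next = match block.checked_add(1) {
            Some(next) => next,
            // No block can follow u64::MAX, so the current state applies.
            None => return self.plain.get(&address).copied(),
        };
        // Scan change sets of later blocks for the first one touching `address`.
        self.change_sets
            .range((next, [0u8; 20])..)
            .find(|((_, a), _)| *a == address)
            .map(|(_, before)| *before)
            // No later change: the current value is also the historical one.
            .unwrap_or_else(|| self.plain.get(&address).copied())
    }
}
```

With change sets recorded at blocks 10 and 20 for one address, a read at block 15 returns the pre-state stored at block 20, and a read at block 25 falls through to the plain state.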

   Trie Change Tables

   - AccountsTrieChangeSets: BlockNumber → {SubKey: StoredNibblesSubKey} → TrieChangeSetsEntry (DupSort)
   - StoragesTrieChangeSets: BlockNumberHashedAddress → {SubKey: StoredNibblesSubKey} → TrieChangeSetsEntry (DupSort)

   Metadata/Checkpoints Tables

   - StageCheckpoints: StageId (String) → StageCheckpoint (pipeline stage progress)
   - StageCheckpointProgresses: StageId → Vec<u8> (arbitrary stage data)
   - PruneCheckpoints: PruneSegment → PruneCheckpoint (pruning progress)
   - VersionHistory: u64 (unix timestamp) → ClientVersion (version tracking)
   - ChainState: ChainStateKey → BlockNumber (last finalized/safe block)
   - Metadata: String → Vec<u8> (generic key-value metadata)

   Key Table Characteristics:
   - DupSort tables (one key, multiple sorted sub-values): the seven tables marked DupSort above (PlainStorageState, HashedStorages, StoragesTrie, AccountChangeSets, StorageChangeSets, AccountsTrieChangeSets, StoragesTrieChangeSets)
   - Regular tables (one key, one value): all remaining tables

   3. Static File Segments

   Four main segments for immutable data (from crates/static-file/types/src/segment.rs):

   | Segment            | Tables Covered                                        | Format         | Columns |
   |--------------------|-------------------------------------------------------|----------------|---------|
   | Headers            | CanonicalHeaders, Headers, HeaderTerminalDifficulties | NippyJar (LZ4) | 3       |
   | Transactions       | Transactions                                          | NippyJar (LZ4) | 1       |
   | Receipts           | Receipts                                              | NippyJar (LZ4) | 1       |
   | TransactionSenders | TransactionSenders                                    | NippyJar (LZ4) | 1       |

   Naming Convention: static_file_{segment}_{start}_{end}_{filter}_{compression}
   Example: static_file_headers_0_127_none_lz4
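The naming convention can be illustrated with a short sketch. Both functions below are hypothetical helpers written for this report, not functions from the Reth codebase, and the fixed-size range calculation is an assumption about how a block is mapped to its file.

```rust
/// Inclusive block range of the file that would hold `block`,
/// assuming fixed-size files of `blocks_per_file` blocks (hypothetical helper).
fn fixed_block_range(block: u64, blocks_per_file: u64) -> (u64, u64) {
    let start = (block / blocks_per_file) * blocks_per_file;
    (start, start + blocks_per_file - 1)
}

/// Builds a name following
/// `static_file_{segment}_{start}_{end}_{filter}_{compression}`.
fn static_file_name(segment: &str, range: (u64, u64), filter: &str, compression: &str) -> String {
    format!(
        "static_file_{}_{}_{}_{}_{}",
        segment, range.0, range.1, filter, compression
    )
}
```

For block 5 with 128 blocks per file, `fixed_block_range` yields (0, 127), and `static_file_name("headers", (0, 127), "none", "lz4")` reproduces the example name above.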

   4. Hybrid Storage Model

   The architecture intelligently divides data responsibility:

   ┌─────────────────────────────────────┐
   │   StaticFileProvider (Read-only)    │
   │  ┌──────────────────────────────┐   │
   │  │ NippyJar Files (Immutable):  │   │
   │  │ - Headers (blocks 0-127)     │   │
   │  │ - Transactions (txs 0-5000)  │   │
   │  │ - Receipts                   │   │
   │  │ - Transaction Senders        │   │
   │  └──────────────────────────────┘   │
   └─────────────────────────────────────┘
                ↓ queries ↓
   ┌─────────────────────────────────────┐
   │   DatabaseProvider (MDBX)           │
   │  ┌──────────────────────────────┐   │
   │  │ MDBX Tables:                 │   │
   │  │ - Current state (hot)        │   │
   │  │ - Change sets (hot)          │   │
   │  │ - Metadata & checkpoints     │   │
   │  │ - Transaction indices        │   │
   │  └──────────────────────────────┘   │
   └─────────────────────────────────────┘

   Data Flow:
   1. Historical data (e.g., blocks 0-127) → Static files (immutable, compressed)
   2. Recent/hot data (e.g., the last 128 blocks) → MDBX (mutable, fast access)
   3. Metadata (checkpoints, version history) → MDBX only
   4. Header, transaction, and receipt queries → Check static files first, then MDBX for recent data; current state is always read from MDBX
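The read order and the migration step in this data flow can be sketched with two in-memory maps. `HybridStore` is a toy stand-in invented for illustration; it is not Reth's `StaticFileProvider` or `DatabaseProvider`, and headers are reduced to strings.

```rust
use std::collections::BTreeMap;

/// Toy model of the two storage tiers (hypothetical, not Reth's provider types).
struct HybridStore {
    /// Immutable, finalized headers: models the static-file tier.
    static_headers: BTreeMap<u64, String>,
    /// Recent, still-mutable headers: models the MDBX tier.
    db_headers: BTreeMap<u64, String>,
}

impl HybridStore {
    /// Check the static tier first, then fall back to the database tier.
    fn header(&self, block: u64) -> Option<&String> {
        self.static_headers
            .get(&block)
            .or_else(|| self.db_headers.get(&block))
    }

    /// Move every database entry up to and including `finalized`
    /// into the static tier.
    fn migrate(&mut self, finalized: u64) {
        // `split_off` keeps keys below the split point in `db_headers`
        // and returns the rest, which are the entries that stay hot.
        let recent = self.db_headers.split_off(&finalized.saturating_add(1));
        let old = std::mem::replace(&mut self.db_headers, recent);
        self.static_headers.extend(old);
    }
}
```

After `migrate(1)` on a store holding blocks 1 and 2 in the database tier, block 1 is served from the static tier and block 2 still comes from the database tier, while callers of `header` see no difference.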

   5. Key Data Organization Patterns

   Sharded Keys (History Optimization)

   For efficient time-travel queries without storing full history:
   ShardedKey<Address>: "Address | BlockNumber_Shard"
   StorageShardedKey: "Address | StorageKey | BlockNumber_Shard"

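   The last shard for each key uses u64::MAX as its block number, so a single cursor seek to (key, target block) lands on the shard covering that block; a binary search inside that shard's block list then finds the relevant change.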
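The shard lookup can be sketched with an ordered map standing in for the history table. This is a simplified model under stated assumptions: `AccountsHistory` here is a plain `BTreeMap` keyed by (address, highest block in shard), and `first_change_at_or_after` is a hypothetical function name, not Reth's API.

```rust
use std::collections::BTreeMap;

type Address = [u8; 20];

/// History index modeled as an ordered map keyed by
/// (address, highest block number in the shard).
/// The newest shard of each address uses `u64::MAX` as its block number.
type AccountsHistory = BTreeMap<(Address, u64), Vec<u64>>;

/// First block at or after `block` in which `address` changed, if any.
fn first_change_at_or_after(index: &AccountsHistory, address: Address, block: u64) -> Option<u64> {
    // Seek: first shard of this address whose key block is >= `block`.
    // The `u64::MAX` shard guarantees the seek stays within the address.
    let (_, blocks) = index
        .range((address, block)..=(address, u64::MAX))
        .next()?;
    // Block numbers inside a shard are sorted, so binary search applies.
    let pos = blocks.partition_point(|&b| b < block);
    blocks.get(pos).copied()
}
```

With shards `(A, 20) -> [5, 10, 20]` and `(A, u64::MAX) -> [30, 40]`, a query for block 15 seeks to the first shard and returns 20, while a query for block 25 skips it and returns 30 from the last shard.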

   DupSort Tables

   Tables with one key and multiple sub-values:
   - PlainStorageState: Address → Multiple (StorageKey, Value) pairs
   - AccountChangeSets: BlockNumber → Multiple (Address, State) pairs
   - StorageChangeSets: BlockNumberAddress → Multiple (Key, Value) pairs
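One way to picture a DupSort table is as an ordered map over a composite (key, subkey) pair, where all entries for one key sit next to each other. The sketch below models PlainStorageState that way. `PlainStorage` is a hypothetical type for illustration, and the `u64` value is a simplification of the real 256-bit storage value.

```rust
use std::collections::BTreeMap;

type Address = [u8; 20];
type Slot = [u8; 32];

/// DupSort-style table modeled as an ordered map over (key, subkey).
/// Hypothetical type; values are shortened to u64 for readability.
#[derive(Default)]
struct PlainStorage {
    entries: BTreeMap<(Address, Slot), u64>,
}

impl PlainStorage {
    /// Insert or overwrite one sub-entry under `address`.
    fn put(&mut self, address: Address, slot: Slot, value: u64) {
        self.entries.insert((address, slot), value);
    }

    /// Exact lookup of one sub-entry, akin to seeking by key and subkey.
    fn get(&self, address: Address, slot: Slot) -> Option<u64> {
        self.entries.get(&(address, slot)).copied()
    }

    /// All (slot, value) pairs stored under one address, in slot order.
    fn slots(&self, address: Address) -> Vec<(Slot, u64)> {
        self.entries
            .range((address, [0u8; 32])..=(address, [0xffu8; 32]))
            .map(|((_, slot), value)| (*slot, *value))
            .collect()
    }
}
```

Because sub-entries are kept sorted by subkey, reading every slot of one contract is a single contiguous range scan rather than many point lookups.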

   Compact Encoding

   - Codec: Custom Compact trait (not RLP)
   - Features:
     - Variable-length integer encoding
     - Bitflags for optional fields
     - Column-specific compression
     - Specialized encoding for fixed-size arrays (B256, Address)
   - Location: crates/storage/codecs/
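The variable-length integer idea can be sketched as follows: write only the significant big-endian bytes of an integer and keep the byte count elsewhere (in a compact codec, that count would live in the bitflag header). `encode_u64` and `decode_u64` are illustrative names assumed for this sketch, not the `Compact` trait's actual methods.

```rust
/// Appends `value` without its leading zero bytes and returns
/// how many bytes were written (0 for the value zero).
fn encode_u64(value: u64, buf: &mut Vec<u8>) -> usize {
    let leading = (value.leading_zeros() / 8) as usize;
    buf.extend_from_slice(&value.to_be_bytes()[leading..]);
    8 - leading
}

/// Reads `len` bytes (at most 8) back into a u64 and returns it
/// together with the unread remainder of the buffer.
/// Panics if `len > 8` or the buffer is shorter than `len`.
fn decode_u64(buf: &[u8], len: usize) -> (u64, &[u8]) {
    let mut bytes = [0u8; 8];
    bytes[8 - len..].copy_from_slice(&buf[..len]);
    (u64::from_be_bytes(bytes), &buf[len..])
}
```

Under this scheme a nonce of 0 occupies no payload bytes at all and a nonce of 258 occupies two, which is where much of the space saving over fixed-width encoding comes from.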

   6. Pruning Segments

   The database supports selective data retention via PruneSegments:

   - SenderRecovery: Prune TransactionSenders table
   - TransactionLookup: Prune TransactionHashNumbers table
   - Receipts: Prune entire Receipts table
   - ContractLogs: Prune Receipts filtered by logs
   - AccountHistory: Prune AccountChangeSets & AccountsHistory
   - StorageHistory: Prune StorageChangeSets & StoragesHistory

   Each segment tracks its own checkpoint, allowing independent pruning strategies.
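Per-segment checkpoints can be pictured as a map from segment to the last pruned block, consulted together with a retention rule to decide what to prune next. The sketch below is a toy model: `Segment`, `Mode`, `prune_target`, and `next_prune_range` are names assumed for illustration, and the exact boundary semantics of Reth's prune modes may differ.

```rust
use std::collections::HashMap;

#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Segment {
    SenderRecovery,
    TransactionLookup,
    Receipts,
    ContractLogs,
    AccountHistory,
    StorageHistory,
}

/// Toy retention rules (hypothetical semantics).
#[derive(Clone, Copy)]
enum Mode {
    /// Prune everything up to the tip.
    Full,
    /// Keep the most recent `n` blocks.
    Distance(u64),
    /// Prune every block below the given block number.
    Before(u64),
}

/// Highest block that may be pruned when the chain tip is `tip`.
fn prune_target(mode: Mode, tip: u64) -> Option<u64> {
    match mode {
        Mode::Full => Some(tip),
        Mode::Distance(n) => tip.checked_sub(n),
        Mode::Before(b) => b.checked_sub(1).map(|t| t.min(tip)),
    }
}

/// Next inclusive block range to prune for `segment`, resuming after
/// its checkpoint; `None` when the segment is already up to date.
fn next_prune_range(
    checkpoints: &HashMap<Segment, u64>,
    segment: Segment,
    mode: Mode,
    tip: u64,
) -> Option<(u64, u64)> {
    let end = prune_target(mode, tip)?;
    let start = match checkpoints.get(&segment) {
        Some(&done) => done.checked_add(1)?,
        None => 0,
    };
    if start <= end { Some((start, end)) } else { None }
}
```

Because each segment has its own checkpoint entry, receipts can be pruned with one retention rule while account history follows another, and each resumes from where it last stopped.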

   7. Core Abstractions

   DatabaseEnv (MDBX)

   pub struct DatabaseEnv {
       inner: Environment,           // libmdbx environment
       dbis: Arc<HashMap<...>>,      // Cached table handles
       metrics: Option<Arc<...>>,    // Performance metrics
       _lock_file: Option<StorageLock>,  // RW lock
   }

   Transactions

   - DbTx: Read-only transaction (RO mode)
   - DbTxMut: Read-write transaction (RW mode)
   - Both provide cursor-based iteration
   - Cursors: Sorted, efficient iteration over table entries
   - Walkers: Abstractions for range-based walks

   DatabaseProvider (Main API)

   pub struct DatabaseProvider<TX, N: NodeTypes> {
       tx: TX,                          // DB transaction
       chain_spec: Arc<N::ChainSpec>,
       static_file_provider: StaticFileProvider<N::Primitives>,
       prune_modes: PruneModes,
       storage: Arc<N::Storage>,
       storage_settings: Arc<RwLock<StorageSettings>>,
       minimum_pruning_distance: u64,
   }

   Implements traits like:
   - HeaderProvider, BlockReader, TransactionsProvider
   - StateProvider, HistoricalStateProvider
   - BlockWriter, StateWriter, HistoryWriter

   8. Key Files and Entry Points

   Core Database:
   - crates/storage/db-api/src/tables/mod.rs - Table definitions
   - crates/storage/db/src/mdbx.rs - MDBX environment setup
   - crates/storage/db/src/implementation/mdbx/tx.rs - Transaction implementation

   Static Files:
   - crates/storage/provider/src/providers/static_file/manager.rs - StaticFileProvider
   - crates/storage/nippy-jar/src/lib.rs - NippyJar format

   Codecs:
   - crates/storage/codecs/src/lib.rs - Compact encoding
   - crates/storage/codecs/README.md - Codec documentation

   Provider API:
   - crates/storage/provider/src/providers/database/provider.rs - Main provider
   - crates/storage/storage-api/src/lib.rs - Trait definitions

   9. Performance Characteristics

   MDBX:
   - Memory-mapped I/O for speed
   - ACID transactions
   - Optimal for hot data (recent blocks, current state)
   - Multiple concurrent readers allowed
   - Single writer (RW lock)

   Static Files:
   - Immutable, columnar format
   - Compressed (LZ4/Zstd)
   - Efficient for historical data queries
   - No write overhead after creation
   - Fast sequential access via mmap

   Hybrid Benefits:
   - Recent data optimized for writes (MDBX)
   - Historical data optimized for space & reads (static files)
   - Automatic data migration from MDBX → static files
   - Supports efficient time-travel queries via change sets

   This architecture makes Reth highly efficient for both fast block production and historical state queries while keeping database size manageable through compression and pruning.


Reth Data Storage Architecture

Reth uses a hybrid storage model combining two technologies:

Database Technologies

MDBX (Memory-Mapped Database)

- High-performance embedded key-value store using memory-mapped I/O
- Used for hot/recent data that changes frequently
- Provides ACID transactions with multiple concurrent readers, single writer
- Located in crates/storage/libmdbx-rs/

Static Files (NippyJar)

- Columnar, compressed format (LZ4/Zstd) for immutable historical data
- Once data is finalized (e.g., older than 128 blocks), it moves to static files
- Much more space-efficient for archival data
- Located in crates/storage/nippy-jar/

Data Shape: 27 Tables

The schema is defined in crates/storage/db-api/src/tables/mod.rs:

Block Data

| Table            | Key → Value                          |
|------------------|--------------------------------------|
| CanonicalHeaders | BlockNumber → HeaderHash             |
| Headers          | BlockNumber → Header                 |
| HeaderNumbers    | BlockHash → BlockNumber              |
| BlockBodyIndices | BlockNumber → StoredBlockBodyIndices |
| BlockWithdrawals | BlockNumber → StoredBlockWithdrawals |

Transactions

| Table                  | Key → Value                  |
|------------------------|------------------------------|
| Transactions           | TxNumber → TransactionSigned |
| TransactionHashNumbers | TxHash → TxNumber            |
| TransactionBlocks      | TxNumber → BlockNumber       |
| TransactionSenders     | TxNumber → Address           |
| Receipts               | TxNumber → Receipt           |

Current State

| Table             | Key → Value                    |
|-------------------|--------------------------------|
| PlainAccountState | Address → Account              |
| PlainStorageState | Address → (StorageKey, Value)* |
| Bytecodes         | B256 → Bytecode                |

Hashed State (for Merkle root computation)

| Table          | Key → Value                                   |
|----------------|-----------------------------------------------|
| HashedAccounts | keccak256(Address) → Account                  |
| HashedStorages | keccak256(Address) → (keccak256(Key), Value)* |

Trie Nodes

| Table        | Key → Value                           |
|--------------|---------------------------------------|
| AccountsTrie | Nibbles → BranchNodeCompact           |
| StoragesTrie | HashedAddress → (Nibbles, TrieEntry)* |

History/Change Sets (for time-travel queries)

| Table             | Key → Value                           |
|-------------------|---------------------------------------|
| AccountsHistory   | ShardedKey<Address> → BlockNumberList |
| StoragesHistory   | StorageShardedKey → BlockNumberList   |
| AccountChangeSets | BlockNumber → (Address, OldState)*    |
| StorageChangeSets | BlockNumberAddress → (Key, OldValue)* |

Metadata

| Table            | Key → Value                    |
|------------------|--------------------------------|
| StageCheckpoints | StageId → StageCheckpoint      |
| PruneCheckpoints | PruneSegment → PruneCheckpoint |
| VersionHistory   | Timestamp → ClientVersion      |

Hybrid Model Flow

┌────────────────────────────────────┐
│ Static Files (Immutable)           │
│ - Headers (blocks 0-N)             │
│ - Transactions                     │
│ - Receipts                         │
│ - Transaction Senders              │
│ (Compressed NippyJar files)        │
└────────────────────────────────────┘
        ↑ Data migrates after finalization
        │
┌────────────────────────────────────┐
│ MDBX (Mutable/Hot Data)            │
│ - Current account state            │
│ - Recent blocks & transactions     │
│ - Change sets for reorgs           │
│ - Trie nodes                       │
│ - Metadata & checkpoints           │
└────────────────────────────────────┘

Key insight: Historical data that's been finalized (won't change) gets moved to compressed static files, while recent/hot data stays in MDBX for fast reads and writes.

Key Design Patterns

- Sharded Keys: History tables use sharded keys (Address | BlockShard) for efficient range queries
- DupSort Tables: Tables like PlainStorageState store multiple values per key (one address → many storage slots)
- Compact Encoding: Custom codec (not RLP) with variable-length integers and bitflags for space efficiency
- Separate Hashed State: Maintains both plain addresses and keccak256-hashed addresses for efficient state root calculation
