Probe Clusters in str0m

This document outlines a plan for implementing probe clusters in str0m, matching libWebRTC's bandwidth probing approach.

Scope

This plan covers probe clusters for existing media RTX/padding only. Pre-media SSRC 0 probing is out of scope and will be addressed in a separate PR that builds on this infrastructure.

Current State

str0m currently has basic padding support:

  • The pacer sets padding_rate based on min(estimate, desired_bitrate)
  • Padding is sent as RTX packets (either retransmissions of cached packets or pure padding)
  • No concept of probe clusters - padding is continuous for as long as the pacer requests it
  • No coordination with TWCC feedback for probe evaluation

libWebRTC's Probe Cluster Approach

Key Concepts

A probe cluster is a short burst of packets sent at a target bitrate to measure available bandwidth. Key properties:

Parameter                Default value   Purpose
min_probe_duration       15 ms           Minimum time to sustain probe rate
min_probe_packets_sent   5               Minimum packets per cluster
min_probe_delta          2 ms            Minimum time between probe packets
max_probe_bitrate        5 Mbit/s        Cap on probe rate
max_waiting_time         1 second        Wait for TWCC feedback before next probe

Probe Cluster Lifecycle

1. Create a probe cluster with a target bitrate
2. Transition to the Active state when a media packet is sent
3. Send packets at the target rate for min_probe_duration
4. Complete the cluster when both conditions are met (see the sketch below):
   - sent_bytes >= min_bytes (rate × duration)
   - sent_packets >= min_probe_packets_sent
5. Wait for TWCC feedback (up to 1 second)
6. Evaluate results and potentially create the next probe cluster
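
As a rough sketch of the completion check in step 4, using a simplified stand-in for the ProbeCluster type proposed later (names and units are illustrative, not final API):

use std::time::Duration;

struct ProbeCluster {
    target_bps: u64,        // target probe bitrate, bits per second
    min_duration: Duration, // e.g. 15 ms
    min_packets: usize,     // e.g. 5
    sent_bytes: usize,
    sent_packets: usize,
}

impl ProbeCluster {
    /// Bytes required to sustain the target rate for min_duration (rate × duration).
    fn min_bytes(&self) -> usize {
        (self.target_bps as f64 * self.min_duration.as_secs_f64() / 8.0).ceil() as usize
    }

    /// Completion check from step 4: enough bytes AND enough packets.
    fn is_complete(&self) -> bool {
        self.sent_bytes >= self.min_bytes() && self.sent_packets >= self.min_packets
    }
}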

Probing States

┌──────────┐    create cluster    ┌──────────┐
│ Disabled │ ─────────────────────► Inactive │
└──────────┘                      └────┬─────┘
                                       │
                             media sent│
                                       ▼
                                  ┌─────────┐
                                  │  Active │
                                  └────┬────┘
                                       │
                         cluster done  │
                                       ▼
                                  ┌──────────┐
                                  │ Inactive │ (wait for feedback)
                                  └──────────┘

When Probes Are Created

libWebRTC creates probe clusters at specific triggers:

  1. Initial probing - On first media packet (exponential: 3x and 6x initial estimate)
  2. Allocation change - When max allocated bitrate increases
  3. ALR exit - When exiting Application Limited Region
  4. Recovery - After significant bitrate drop
  5. Network state - Based on network state estimator hints

Cluster ID and TWCC Correlation

libWebRTC assigns each probe cluster a unique ID. This ID is NOT sent in the RTP packets themselves. Instead:

  1. The PacedPacketInfo struct carries the cluster ID through the pacing/sending chain
  2. When a packet is sent, its TWCC sequence number is associated with the cluster ID internally
  3. When TWCC feedback arrives, the received sequence numbers are matched to their cluster IDs
  4. This allows the ProbeBitrateEstimator to calculate bitrate estimates per cluster

Sending side:
┌─────────────┐    PacedPacketInfo     ┌─────────────┐
│ ProbeCluster│ ──────────────────────► RTP Packet  │
│  id: 42     │    (cluster_id: 42)    │ twcc_seq: 5 │
└─────────────┘                        └─────────────┘
                                              │
Internal tracking:                            │
  twcc_seq 5 → cluster 42                     │
  twcc_seq 6 → cluster 42                     ▼
  twcc_seq 7 → cluster 42              ┌─────────────┐
                                       │TWCC Feedback│
Feedback arrives:                      │ seq 5,6,7   │
  Match seq 5,6,7 to cluster 42        └─────────────┘
  Calculate probe bitrate for cluster 42
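
A minimal sketch of that internal tracking, assuming a hypothetical ProbeTracker with plain integer sequence numbers; str0m's actual TWCC plumbing will differ:

use std::collections::HashMap;

#[derive(Default)]
struct ProbeTracker {
    /// TWCC sequence number → probe cluster id, recorded at send time.
    seq_to_cluster: HashMap<u64, u32>,
    /// Accumulated (bytes, packets) per cluster id, built up from feedback.
    received: HashMap<u32, (usize, usize)>,
}

impl ProbeTracker {
    /// Called when a probe packet is sent with a given TWCC sequence number.
    fn on_sent(&mut self, twcc_seq: u64, cluster_id: u32) {
        self.seq_to_cluster.insert(twcc_seq, cluster_id);
    }

    /// Called when TWCC feedback reports a sequence number as received.
    fn on_feedback(&mut self, twcc_seq: u64, size_bytes: usize) {
        if let Some(cluster_id) = self.seq_to_cluster.remove(&twcc_seq) {
            let entry = self.received.entry(cluster_id).or_insert((0, 0));
            entry.0 += size_bytes;
            entry.1 += 1;
        }
    }
}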

Application Limited Region (ALR)

ALR (Application Limited Region) is when the sending application isn't requesting enough bandwidth to fill the available network capacity. This is NOT about whether we can send padding - it's about application demand.

Key insight: Padding is driven by desired_bitrate:

padding_rate = min(estimate, desired_bitrate) - current_bitrate

Even with a full RTX cache, padding only happens when the application asks for more bandwidth than it's currently using. If desired_bitrate is low, no padding is requested.
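
The same formula as a small helper, assuming plain bits-per-second values rather than str0m's Bitrate type:

/// padding_rate = min(estimate, desired_bitrate) - current_bitrate, floored at zero.
fn padding_rate_bps(estimate: u64, desired: u64, current: u64) -> u64 {
    // Never pad beyond what the application asked for, even if the estimate is
    // higher, and request nothing if current sending already covers the demand.
    estimate.min(desired).saturating_sub(current)
}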

ALR examples:

  • Video paused → no new frames, low desired_bitrate
  • Screen share with static content → very few frames needed
  • Application reduces quality tier → lower desired_bitrate
  • Audio-only period in video call

In ALR state:

  • Send queue is consistently empty
  • Sending well below estimated capacity
  • But application hasn't requested more bandwidth
  • BWE estimates become stale (no congestion signal)

Why probe when exiting ALR?

When exiting ALR (e.g., video resumes), the application suddenly wants high bitrate again, but the BWE estimate is stale from the idle period. Probing rediscovers actual network capacity.

Detection methods:

  • Track if send queue is consistently empty
  • Compare actual send rate to estimated capacity
  • Check if desired_bitrate < estimate for extended period
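
ALR detection itself is future work, but to illustrate the third method above, a hedged sketch that flags ALR once demand has stayed below the estimate for some window (names and threshold are illustrative only):

use std::time::{Duration, Instant};

struct AlrDetector {
    below_since: Option<Instant>,
    window: Duration, // how long demand must stay below the estimate, e.g. 2 s
}

impl AlrDetector {
    /// Returns true while we consider the sender application limited.
    fn update(&mut self, now: Instant, desired_bps: u64, estimate_bps: u64) -> bool {
        if desired_bps < estimate_bps {
            let since = *self.below_since.get_or_insert(now);
            now.duration_since(since) >= self.window
        } else {
            self.below_since = None;
            false
        }
    }
}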

Proposed Design for str0m

New Types

/// Configuration for probe clusters
pub struct ProbeClusterConfig {
    /// Target bitrate for this probe
    pub target_rate: Bitrate,
    /// Minimum duration to sustain probe
    pub min_duration: Duration,
    /// Minimum packets to send
    pub min_packets: usize,
    /// Minimum time between packets
    pub min_delta: Duration,
    /// Unique cluster ID for TWCC correlation
    pub id: u32,
}

/// State of a probe cluster
pub struct ProbeCluster {
    config: ProbeClusterConfig,
    created_at: Instant,
    started_at: Option<Instant>,
    sent_bytes: usize,
    sent_packets: usize,
}

/// Probe cluster manager
pub struct BitrateProber {
    state: ProbingState,
    clusters: VecDeque<ProbeCluster>,
    next_probe_time: Instant,
    next_cluster_id: u32,
    config: BitrateProberConfig,
}

pub struct BitrateProberConfig {
    pub max_probe_bitrate: Bitrate,           // 5 Mbit/s
    pub min_probe_duration: Duration,          // 15ms
    pub min_probe_packets: usize,              // 5
    pub min_probe_delta: Duration,             // 2ms
    pub max_probe_delay: Duration,             // 10ms (discard if delayed)
    pub cluster_timeout: Duration,             // 5 seconds
}
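
BitrateProber references a ProbingState that is not spelled out above; an assumed shape matching the state diagram:

/// States from the "Probing States" diagram (assumed definition).
enum ProbingState {
    /// No probing wanted; new clusters are discarded.
    Disabled,
    /// Clusters are queued but waiting (for a media packet, or for feedback).
    Inactive,
    /// A cluster is currently being sent at its target rate.
    Active,
}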

Integration Points

1. Pacer Changes

The pacer needs to understand probe clusters:

impl Pacer {
    /// Check if we should send a probe packet now
    fn should_send_probe(&self, now: Instant) -> Option<&ProbeCluster> {
        self.prober.current_cluster(now)
    }
    
    /// Register that a probe packet was sent
    fn probe_sent(&mut self, now: Instant, size: DataSize) {
        self.prober.probe_sent(now, size);
    }
    
    /// Get recommended probe packet size
    fn recommended_probe_size(&self) -> DataSize {
        self.prober.recommended_min_probe_size()
    }
}

2. Probe Controller

A new component to decide WHEN to probe:

pub struct ProbeController {
    config: ProbeControllerConfig,
    
    // State
    max_bitrate: Bitrate,
    estimated_bitrate: Option<Bitrate>,
    last_probe_time: Instant,
    
    // Triggers
    in_alr: bool,
    allocation_changed: bool,
}

impl ProbeController {
    /// Called on BWE estimate update
    fn on_estimate(&mut self, estimate: Bitrate) -> Vec<ProbeClusterConfig>;
    
    /// Called when max allocated bitrate changes  
    fn on_max_allocation(&mut self, rate: Bitrate) -> Vec<ProbeClusterConfig>;
    
    /// Called periodically
    fn process(&mut self, now: Instant) -> Vec<ProbeClusterConfig>;
}
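
As an illustration of the initial exponential probing trigger (3x and 6x the first estimate, per libWebRTC's defaults), a sketch with assumed types and an illustrative helper name:

/// Build the (cluster_id, target_bps) pairs for initial probing, capped at the
/// configured maximum probe bitrate.
fn initial_probes(first_estimate_bps: u64, max_probe_bps: u64, next_id: &mut u32) -> Vec<(u32, u64)> {
    let mut probes = Vec::new();
    for factor in [3u64, 6] {
        let target = (first_estimate_bps * factor).min(max_probe_bps);
        let id = *next_id;
        *next_id += 1;
        probes.push((id, target));
    }
    probes
}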

3. Session Integration

impl Session {
    fn handle_timeout(&mut self, now: Instant) {
        // Check for new probe clusters
        if let Some(bwe) = &mut self.bwe {
            let new_clusters = bwe.probe_controller.process(now);
            for cluster in new_clusters {
                self.pacer.create_probe_cluster(cluster);
            }
        }
        
        // Existing pacer/queue logic...
    }
}

Padding Mechanism (RTX-based)

str0m already prefers RTX payload padding over pure padding. In StreamTx::poll_packet_padding():

// From src/streams/send.rs
if self.padding > MIN_SPURIOUS_PADDING_SIZE {  // 50 bytes
    // Try to find a cached packet to retransmit as padding
    if let Some(pkt) = self.rtx_cache.get_cached_packet_smaller_than(max_size) {
        // Use RTX retransmission (larger, more efficient)
        return Some(NextPacket { kind: NextPacketKind::Resend(...) });
    }
}
// Fall back to blank padding (max 240 bytes)
return Some(NextPacket { kind: NextPacketKind::Blank(...) });

This means:

  • For padding requests > 50 bytes: Try RTX first (can be ~1200 bytes)
  • Fall back to pure padding only if no suitable cached packet exists
  • Pure padding limited to 240 bytes per packet

Probe clusters will use this existing mechanism - they just control WHEN and HOW MUCH padding is requested.
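
A sketch of the "how much" part: the padding an active cluster still owes at a given instant, i.e. what the target rate says should have been sent minus what actually has been. Names follow the earlier ProbeCluster sketch and remain illustrative:

use std::time::Instant;

/// Padding bytes still needed to keep the cluster on its target rate.
fn outstanding_probe_bytes(
    target_bps: u64,
    started_at: Instant,
    sent_bytes: usize,
    now: Instant,
) -> usize {
    let elapsed = now.duration_since(started_at).as_secs_f64();
    let budget = (target_bps as f64 * elapsed / 8.0) as usize;
    budget.saturating_sub(sent_bytes)
}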

Implementation Plan

Phase 1: BitrateProber (Core)

  1. Create src/packet/prober.rs with:
    • ProbeCluster struct
    • BitrateProber struct
    • Probe state machine (Disabled → Inactive → Active)
    • create_cluster(), current_cluster(), probe_sent()
  2. Unit tests for:
    • Cluster lifecycle
    • State transitions
    • Timing constraints

Phase 2: Pacer Integration

  1. Add BitrateProber to Pacer
  2. Modify padding generation to respect probe cluster timing
  3. Track probe packet sends via probe_sent()
  4. Handle recommended_probe_size()

Phase 3: ProbeController

  1. Create src/packet/probe_controller.rs
  2. Implement probe triggers:
    • Initial exponential probing (on first media)
    • Allocation-based probing
  3. Wire into BWE feedback loop

Phase 4: Configuration

  1. Add probe cluster settings to RtcConfig
  2. Expose key tunables:
    • max_probe_bitrate
    • min_probe_duration
    • Enable/disable probing

Future Work (Out of Scope)

  • Pre-media SSRC 0 probing: Will build on this infrastructure in a separate PR
  • ALR detection: Application Limited Region detection for smarter probing
  • Network state estimator: Additional probing hints

Testing Strategy

Unit Tests

  • Probe cluster state machine
  • Timing constraints (min_delta, min_duration)
  • Cluster completion conditions
  • Multiple pending clusters
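
As an example of the first items, a test sketch against the simplified ProbeCluster from the lifecycle section (the API is assumed, not str0m's final one):

#[cfg(test)]
mod tests {
    use super::*;
    use std::time::Duration;

    #[test]
    fn cluster_completes_on_bytes_and_packets() {
        let mut cluster = ProbeCluster {
            target_bps: 1_000_000, // 1 Mbit/s
            min_duration: Duration::from_millis(15),
            min_packets: 5,
            sent_bytes: 0,
            sent_packets: 0,
        };
        assert!(!cluster.is_complete());

        // 1 Mbit/s over 15 ms is 1875 bytes; five packets of ~400 bytes covers it.
        cluster.sent_bytes = 2000;
        cluster.sent_packets = 5;
        assert!(cluster.is_complete());
    }
}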

Integration Tests

  • Probe clusters generate expected packet counts
  • TWCC feedback correlates with probe cluster IDs
  • Probing respects bitrate caps
  • Probing triggers on allocation changes

Performance Tests

  • CPU usage at max probe rate
  • Memory usage with multiple pending clusters
  • Latency impact during probing
