@nmoinvaz
Created March 10, 2026 19:08
zlib-ng: inflate_fast safe mode benchmark — small output buffer performance

zlib-ng: inflate_fast safe mode benchmark results

Summary

Adding a safe_mode parameter to inflate_fast() allows the fast path to run with as few as 3 bytes of avail_out (down from 260). This eliminates the performance cliff where PNG-style row-by-row decompression falls back to the slow inflate() state-machine path for the last 260 bytes of each row.
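The effect of the threshold can be illustrated with a rough back-of-the-envelope model. This is a sketch under a stated assumption, not a description of the actual zlib-ng control flow: it assumes the fast path only runs while at least `min_avail_out` bytes of output space remain, so up to `min_avail_out - 1` trailing bytes of each row are left to the slow path. The function names are hypothetical.

```python
def slow_path_bytes(row_bytes: int, min_avail_out: int) -> int:
    """Upper bound on bytes per row left to the slow path.

    Simplified model (an assumption): the fast path stops as soon as fewer
    than `min_avail_out` bytes of output space remain in the row.
    """
    return min(row_bytes, min_avail_out - 1)


def slow_path_fraction(row_bytes: int, min_avail_out: int) -> float:
    """Fraction of each row that the model assigns to the slow path."""
    return slow_path_bytes(row_bytes, min_avail_out) / row_bytes


if __name__ == "__main__":
    print("row_bytes  threshold=260  threshold=3")
    for row in (64, 256, 512, 4096):
        before = slow_path_fraction(row, 260)  # threshold before the change
        after = slow_path_fraction(row, 3)     # threshold with safe_mode
        print(f"{row:9d}  {before:13.1%}  {after:11.1%}")
```

Under this model, rows shorter than 260 bytes never reach the fast path at the old threshold, while at the new threshold only a couple of bytes per row are left to the slow path. Real behavior also depends on available input and on where symbol boundaries fall, which the model ignores.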

Related: zlib-ng/zlib-ng#2062

Machine

| Spec     | Value                 |
|----------|-----------------------|
| CPU      | Apple M3              |
| RAM      | 24 GB                 |
| Arch     | arm64                 |
| OS       | macOS 15.7.4          |
| Compiler | Apple Clang (default) |
| Build    | Release, static       |

inflate_small_bench (small output buffers)

Simulates PNG-style row-by-row decompression with constrained avail_out. 256 KB of compressible data decompressed in fixed-size chunks.
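The access pattern being simulated can be sketched with Python's standard `zlib` module, which caps the output of each call much like a small `avail_out`. This is an illustration only, not the benchmark source: the helper name `inflate_in_chunks` and the test data are hypothetical, and Python's `zlib` may be linked against stock zlib rather than zlib-ng.

```python
import zlib


def inflate_in_chunks(compressed: bytes, avail_out: int) -> bytes:
    """Decompress a zlib stream, requesting at most `avail_out` bytes per call."""
    d = zlib.decompressobj()
    out = bytearray()
    pending = compressed
    while not d.eof:
        # max_length caps the output of this call, like a small avail_out.
        chunk = d.decompress(pending, avail_out)
        out += chunk
        # Input that was not consumed because the output cap was hit.
        pending = d.unconsumed_tail
        if not chunk and not pending:
            break  # no progress possible: input is truncated
    return bytes(out)


if __name__ == "__main__":
    # 256 KB of compressible data, decompressed in fixed-size chunks.
    data = bytes(range(256)) * 1024
    compressed = zlib.compress(data)
    for avail_out in (64, 128, 256, 512):
        assert inflate_in_chunks(compressed, avail_out) == data
```

Each call returns after producing at most one chunk of output, so the decompressor is re-entered once per chunk. That is the pattern a row-by-row PNG decoder produces.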

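| avail_out (bytes) | Baseline (ns) | Contender (ns) | Change |
|-------------------|---------------|----------------|--------|
| 64                | 143,288       | 118,668        | -17.2% |
| 128               | 100,689       | 79,391         | -21.2% |
| 256               | 80,936        | 55,975         | -30.8% |
| 512               | 58,234        | 47,555         | -18.3% |
| 1024              | 45,580        | 40,797         | -10.5% |
| 2048              | 39,171        | 36,858         | -5.9%  |
| 4096              | 36,570        | 35,171         | -3.8%  |
| 16384             | 34,097        | 33,515         | -1.7%  |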

CPU mean times, 5 repetitions each.
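The Change column is consistent with the relative difference of the contender time against the baseline time, so negative values mean the contender is faster. A minimal sketch that recomputes the column from the timings listed above (the function name is hypothetical):

```python
def percent_change(baseline_ns: float, contender_ns: float) -> float:
    """Relative change of the contender versus the baseline, in percent."""
    return (contender_ns - baseline_ns) / baseline_ns * 100.0


# (avail_out, baseline ns, contender ns), copied from the results above.
RESULTS = [
    (64, 143288, 118668),
    (128, 100689, 79391),
    (256, 80936, 55975),
    (512, 58234, 47555),
    (1024, 45580, 40797),
    (2048, 39171, 36858),
    (4096, 36570, 35171),
    (16384, 34097, 33515),
]

if __name__ == "__main__":
    for avail_out, baseline, contender in RESULTS:
        print(f"{avail_out:6d}  {percent_change(baseline, contender):+.1f}%")
```

The same formula applies to the regression check, although recomputing from rounded timings in the tens or hundreds of nanoseconds can differ slightly from the reported percentages.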

inflate_nocrc (regression check)

Standard inflate benchmark with large output buffers to verify no regression on the normal (non-safe) code path.

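| Input size | Baseline (ns) | Contender (ns) | Change |
|------------|---------------|----------------|--------|
| 1          | 19.2          | 19.2           | +0.1%  |
| 64         | 134           | 136            | +1.2%  |
| 1,024      | 294           | 291            | -0.9%  |
| 16,384     | 3,813         | 3,827          | +0.4%  |
| 131,072    | 15,036        | 15,077         | +0.3%  |
| 1,048,576  | 105,320       | 106,299        | +0.9%  |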

CPU mean times, 5 repetitions each. Every difference is 1.2% or smaller in magnitude, which is within noise, so there is no regression.

Conclusion

Small output buffers (64–512 bytes, typical PNG row sizes) take 17% to 31% less CPU time with the change. The gain shrinks as avail_out grows, because larger buffers already spend most of their time in the fast path. No regression was observed on standard large-buffer inflate.
