@cheeseonamonkey
Last active December 7, 2025 03:40
transparent compression benchmarker (btrfs)
#!/usr/bin/env zsh
set -euo pipefail
SRC="/tmp/testfile.dat"
DEV="/dev/disk/by-id/scsi-0Linode_Volume_blk_vol"
MNT="/blk"
TMP="$MNT/.bench_tmp"
OPTS=("none" "lzo" "zlib:1" "zlib:5" "zlib:9" "zstd:1" "zstd:3" "zstd:6" "zstd:9" "zstd:15")
TOTAL="1500M"
PAT='\.(txt|json|log|xml|py|sh|md|csv|html|css|js|c|h|cpp|go|rs|java|rb|pl|yml|yaml|toml|ini|conf|cfg)$'
cleanup() {
  # Remove scratch files while the volume is still mounted, then unmount.
  rm -rf "$MNT/test.dat" "$TMP" 2>/dev/null || true
  umount -R "$MNT" 2>/dev/null || true
}
trap cleanup EXIT
# ------------------------------------------------------------
# SUDO HOME FIX
# ------------------------------------------------------------
BASE_HOME="$HOME"
if [[ -n "${SUDO_USER:-}" && -d "/home/$SUDO_USER" ]]; then
  BASE_HOME="/home/$SUDO_USER"
fi
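collect() {
  local dir=$1 count=$2
  # Skip missing directories without tripping `set -e`.
  [[ -d "$dir" ]] || return 0
  # With pipefail, an empty grep match or an early-exiting head would fail the pipeline.
  find "$dir" -type f 2>/dev/null | grep -iE "$PAT" | shuf | head -n "$count" || true
}
drop() {
  # Flush dirty pages, then drop the page cache, dentries, and inodes.
  sync
  sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches' 2>/dev/null || true
}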
# ------------------------------------------------------------
# BUILD DATASET
# ------------------------------------------------------------
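if [[ ! -f "$SRC" || $(stat -c %s "$SRC") -lt 10000000 ]]; then
  # tar may exit non-zero here (unreadable files, or SIGPIPE once head has
  # read $TOTAL bytes); under pipefail that would abort the script.
  {
    collect "$BASE_HOME/Data" 5000
    collect "$BASE_HOME/Proj" 5000
    collect "$BASE_HOME/Downloads" 4000
    collect "/etc" 2000
  } | tar -cf - -T - 2>/dev/null | head -c "$TOTAL" > "$SRC" || true
fi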
orig=$(stat -c %s "$SRC")
orig_mib=$(awk "BEGIN{printf \"%.2f\",$orig/1024/1024}")
printf "%-10s %7s %8s %8s %10s %10s %10s %10s\n" \
  "algo" "ratio" "c_time" "d_time" "wr_raw" "rd_raw" "wr_eff" "rd_eff"
# ------------------------------------------------------------
# MAIN LOOP
# ------------------------------------------------------------
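for opt in "${OPTS[@]}"; do
  ct=0
  dt=0
  comp="$orig"
  umount -R "$MNT" 2>/dev/null || true
  mkdir -p "$MNT"
  # The volume is mounted with the same options on every pass;
  # the codec cost is timed in userspace below.
  mount -t btrfs -o noatime,ssd,discard=async "$DEV" "$MNT"
  mkdir -p "$TMP"
  cf="$TMP/compressed.bin"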
  case "$opt" in
    none)
      ;;
    lzo)
      t0=$(date +%s.%N)
      lzop -c < "$SRC" > "$cf"
      ct=$(awk "BEGIN{print $(date +%s.%N)-$t0}")
      comp=$(stat -c %s "$cf")
      t0=$(date +%s.%N)
      lzop -dc "$cf" > /dev/null
      dt=$(awk "BEGIN{print $(date +%s.%N)-$t0}")
      ;;
    zlib:*)
      lvl=${opt#zlib:}
      t0=$(date +%s.%N)
      pigz -"${lvl}" -c < "$SRC" > "$cf"
      ct=$(awk "BEGIN{print $(date +%s.%N)-$t0}")
      comp=$(stat -c %s "$cf")
      t0=$(date +%s.%N)
      pigz -dc "$cf" > /dev/null
      dt=$(awk "BEGIN{print $(date +%s.%N)-$t0}")
      ;;
    zstd:*)
      lvl=${opt#zstd:}
      t0=$(date +%s.%N)
      zstd -"${lvl}" -c < "$SRC" > "$cf"
      ct=$(awk "BEGIN{print $(date +%s.%N)-$t0}")
      comp=$(stat -c %s "$cf")
      t0=$(date +%s.%N)
      zstd -dc "$cf" > /dev/null
      dt=$(awk "BEGIN{print $(date +%s.%N)-$t0}")
      ;;
  esac
  ratio=$(awk "BEGIN{printf \"%.2fx\",$orig/$comp}")
  drop
  t0=$(date +%s.%N)
  dd if="$SRC" of="$MNT/test.dat" bs=1M oflag=direct 2>/dev/null
  wr_time=$(awk "BEGIN{print $(date +%s.%N)-$t0}")
  drop
  t0=$(date +%s.%N)
  dd if="$MNT/test.dat" of=/dev/null bs=1M iflag=direct 2>/dev/null
  rd_time=$(awk "BEGIN{print $(date +%s.%N)-$t0}")
  # GNU dd reports decimal units (MB/s, GB/s), so matching "MiB/s" in its
  # output never hits; derive MiB/s from the measured times instead.
  wr_raw=$(awk "BEGIN{printf \"%.0f\",$orig_mib/$wr_time}")
  rd_raw=$(awk "BEGIN{printf \"%.0f\",$orig_mib/$rd_time}")
  # Effective speed = dataset size / (codec time + disk time).
  wr_eff=$(awk "BEGIN{printf \"%.0f\",$orig_mib/($ct+$wr_time)}")
  rd_eff=$(awk "BEGIN{printf \"%.0f\",$orig_mib/($dt+$rd_time)}")
  rm -f "$MNT/test.dat" "$cf"
  umount -R "$MNT"
  printf "%-10s %7s %7.1fs %7.1fs %7sMiB %7sMiB %7sMiB %7sMiB\n" \
    "$opt" "$ratio" "$ct" "$dt" "$wr_raw" "$rd_raw" "$wr_eff" "$rd_eff"
done

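This script benchmarks the compression algorithms Btrfs supports (lzo, zlib, zstd) against a sample of real files capped at 1500 MiB (about 1.5 GiB). For each algorithm and level it times the matching userspace tool (lzop, pigz, zstd), then measures direct-I/O writes and reads on a freshly mounted Btrfs volume with dd.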
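The script as posted mounts the volume with the same options on every pass and times compression in userspace. If you wanted the kernel to do the compressing instead, one option is to turn each entry of OPTS into a Btrfs `compress-force=` mount option. A minimal sketch, assuming a hypothetical helper named `mount_opts` that is not part of the gist:

```shell
# Hypothetical helper: build a Btrfs mount option string for one OPTS entry.
# "none" keeps the base options; anything else is passed to compress-force=,
# which takes an algorithm with an optional level (e.g. zstd:3, zlib:9, lzo).
mount_opts() {
    base="noatime,ssd,discard=async"
    case "$1" in
        none) printf '%s\n' "$base" ;;
        lzo|zlib:*|zstd:*) printf '%s,compress-force=%s\n' "$base" "$1" ;;
        *) echo "unknown algorithm: $1" >&2; return 1 ;;
    esac
}

mount_opts none      # prints noatime,ssd,discard=async
mount_opts zstd:3    # prints noatime,ssd,discard=async,compress-force=zstd:3
```

The output would stand in for the fixed option list in the mount call, for example `mount -t btrfs -o "$(mount_opts "$opt")" "$DEV" "$MNT"`. Using `compress=` instead of `compress-force=` would let Btrfs skip data it judges incompressible.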

Each pass remounts the filesystem and drops the page cache before every timed write and read, so results stay comparable.

It randomly samples text-like files (source code, configs, logs, and so on) from my home directory and /etc to build a realistic test dataset.
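The sampling step boils down to: list files, keep those with text-like extensions, shuffle, take the first N. A standalone sketch of that pipeline, assuming a hypothetical function name `sample_files` and GNU `shuf`; it returns success even when the directory is missing or nothing matches, so it is safe to call under `set -e`:

```shell
# Hypothetical helper: print up to COUNT randomly chosen files under DIR whose
# names match the extended regex PATTERN (case-insensitive).
# The body runs in a subshell, so its variables do not leak into the caller.
sample_files() (
    dir=$1; count=$2; pattern=$3
    [ -d "$dir" ] || exit 0
    find "$dir" -type f 2>/dev/null | grep -iE "$pattern" | shuf | head -n "$count"
    # An empty match or an early-exiting head is not an error here.
    exit 0
)
```

For example, `sample_files /etc 5 '\.(conf|cfg)$'` would print up to five randomly chosen config files from /etc.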

The numbers that matter most are wr_eff and rd_eff: they combine codec time and disk time, so they reflect what you’ll feel in practice. They show where compression helps and where your CPU becomes the bottleneck.
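The arithmetic behind the effective figures is small enough to show on its own: dataset size divided by the sum of codec time and I/O time. A sketch, assuming a hypothetical function name `eff_mibs` and made-up timings rather than values from the run:

```shell
# Effective throughput in MiB/s: data size / (codec time + I/O time).
# Arguments: size in bytes, codec seconds, I/O seconds.
eff_mibs() {
    awk -v bytes="$1" -v codec="$2" -v io="$3" \
        'BEGIN { printf "%.0f\n", (bytes / 1048576) / (codec + io) }'
}

# Illustrative numbers only (not taken from the results):
eff_mibs 1572864000 0.0 2.5   # 1500 MiB, disk only          -> 600
eff_mibs 1572864000 5.0 2.5   # same write plus 5 s of codec -> 200
```

A codec only pays off in this model when the time it adds is smaller than the disk time it saves, which is why slow levels drag the effective write speed down so sharply.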


Results:

❯ sudo ./a.zsh
algo         ratio   c_time   d_time     wr_raw     rd_raw     wr_eff     rd_eff
none         1.00x     0.0s     0.0s        MiB        MiB     580MiB     315MiB
lzo          4.80x     1.0s     0.7s        MiB        MiB     258MiB     219MiB
zlib:1       6.49x     3.1s     0.9s        MiB        MiB     116MiB     186MiB
zlib:5       8.00x     6.2s     0.8s        MiB        MiB      65MiB     207MiB
zlib:9       8.43x    29.8s     0.8s        MiB        MiB      15MiB     204MiB
zstd:1       7.94x     1.5s     0.4s        MiB        MiB     195MiB     241MiB
zstd:3       8.94x     1.9s     0.4s        MiB        MiB     171MiB     237MiB
zstd:6      10.30x     5.0s     0.4s        MiB        MiB      78MiB     249MiB
zstd:9      11.07x     7.0s     0.3s        MiB        MiB      58MiB     245MiB
zstd:15     11.75x    50.4s     0.3s        MiB        MiB       9MiB     254MiB
~                                                                                                                                   2m 15s root@localhost
❯

Columns:

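  • ratio = original size / compressed size (higher means less space on disk)
  • c_time = compress time, in seconds
  • d_time = decompress time, in seconds
  • wr_raw, rd_raw = disk-only write and read speed in MiB/s, without codec time (blank in the run above)
  • wr_eff = effective write speed in MiB/s: dataset size / (compress time + disk write time)
  • rd_eff = effective read speed in MiB/s: dataset size / (decompress time + disk read time)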
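
Since the ratio column is original size over compressed size, it can be turned into a "percent of space saved" figure with 1 - 1/ratio. A small sketch, assuming a hypothetical function name `saved_pct`:

```shell
# Convert a compression ratio (original / compressed) to percent of space saved.
saved_pct() {
    awk -v r="$1" 'BEGIN { printf "%.1f%%\n", (1 - 1 / r) * 100 }'
}

saved_pct 4.80    # lzo row above     -> 79.2%
saved_pct 11.75   # zstd:15 row above -> 91.5%
```

Seen this way, the jump from lzo to zstd:15 more than doubles the ratio but adds only about 12 percentage points of saved space.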