Date: 2026-02-23
Platform: Apple Silicon (ARM64), 8 cores, L1D 64 KiB, L2 4096 KiB
Build: CMake Release, static libs
Repetitions: 5 (median CPU time reported)
crc32/armv8_pmull_eor3 (CRC32 only)
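| Size (bytes) | develop (ns) | feature (ns) | Change |
|---:|---:|---:|---:|
| 1 | 2.15 | 2.15 | 0% |
| 8 | 5.67 | 4.31 | -24.0% |
| 12 | 5.66 | 5.34 | -5.7% |
| 16 | 5.91 | 5.83 | -1.4% |
| 32 | 6.35 | 6.28 | -1.1% |
| 64 | 8.70 | 8.69 | 0% |
| 512 | 35.5 | 35.9 | +1.1% |
| 4096 | 100 | 102 | +2.0% |
| 32768 | 399 | 398 | -0.3% |
| 262144 | 2664 | 2717 | +2.0% |
| 4194304 | 41708 | 42192 | +1.2% |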
crc32_copy/armv8_pmull_eor3 (CRC32 + memcpy)
| Size (bytes) | develop (ns) | feature (ns) | Change |
|---:|---:|---:|---:|
| 32 | 10.3 | 6.48 | -37.1% |
| 512 | 39.4 | 38.4 | -2.5% |
| 8192 | 251 | 199 | -20.7% |
| 32768 | 700 | 630 | -10.0% |
| 65536 | 1480 | 1179 | -20.3% |
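For anyone re-deriving the numbers: the Change column is consistent with the relative difference of the feature time against the develop time, each being the median of the 5 repetitions. A minimal sketch of that arithmetic follows. The helper names `median_ns` and `pct_change` are hypothetical and are not part of the benchmark harness.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for doubles (ascending). */
static int cmp_double(const void *a, const void *b) {
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of up to 16 timing samples. Works on a copy, so the
 * caller's array keeps its original order. Hypothetical helper. */
static double median_ns(const double *samples, size_t n) {
    double tmp[16];
    assert(n > 0 && n <= 16);
    memcpy(tmp, samples, n * sizeof tmp[0]);
    qsort(tmp, n, sizeof tmp[0], cmp_double);
    return (n % 2) ? tmp[n / 2] : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
}

/* Relative change of feature vs. develop, in percent.
 * Negative means the feature branch is faster. */
static double pct_change(double develop_ns, double feature_ns) {
    assert(develop_ns > 0.0);
    return (feature_ns - develop_ns) / develop_ns * 100.0;
}
```

With the 8-byte crc32 row, `pct_change(5.67, 4.31)` gives roughly -24.0, matching the table.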
Summary
crc32 (no copy): No significant regression. Small sizes improve (24% at 8 bytes, 5.7% at 12 bytes); all other sizes are within about ±2%, which is within noise.
crc32_copy (interleaved copy): Faster at every measured size. Gains are 20-37% at 32, 8192, and 65536 bytes, 10% at 32768 bytes, and 2.5% at 512 bytes. The interleaved CRC32+copy implementation avoids a separate memcpy pass.
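To illustrate what "avoids a separate memcpy pass" means, here is a portable sketch that contrasts a two-pass CRC-then-copy with a fused loop that stores each byte as it is checksummed. This is not the code from this branch: the real implementation uses PMULL+EOR3 folding, while the sketch uses a plain bitwise CRC-32, and the function names `crc32_then_copy` and `crc32_copy_fused` are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise update of the reflected CRC-32 (polynomial 0xEDB88320)
 * for a single input byte. Slow, but portable and easy to verify. */
static uint32_t crc32_byte(uint32_t crc, uint8_t b) {
    crc ^= b;
    for (int k = 0; k < 8; k++)
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    return crc;
}

/* Two-pass baseline: checksum the source, then memcpy it.
 * The source buffer is read twice. */
static uint32_t crc32_then_copy(uint32_t crc, uint8_t *dst,
                                const uint8_t *src, size_t len) {
    assert(len == 0 || (dst != NULL && src != NULL));
    crc = ~crc;
    for (size_t i = 0; i < len; i++)
        crc = crc32_byte(crc, src[i]);
    if (len)
        memcpy(dst, src, len);
    return ~crc;
}

/* Fused variant: each byte is loaded once, stored to dst, and
 * folded into the CRC in the same loop iteration. */
static uint32_t crc32_copy_fused(uint32_t crc, uint8_t *dst,
                                 const uint8_t *src, size_t len) {
    assert(len == 0 || (dst != NULL && src != NULL));
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = src[i];
        dst[i] = b;
        crc = crc32_byte(crc, b);
    }
    return ~crc;
}
```

Both functions return the same checksum and leave the same bytes in `dst`. The fused form only changes how many times the source is traversed, which is the effect the crc32_copy numbers above are measuring.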
Commits on improvements/crc32-arm-copy
b4043c6f Implement crc32 interleaved copy for ARM PMULL+EOR3
babbd9f1 Add ARM CRC32 private header with shared align/tail helpers
Raw benchmark output
develop (54352daf)
improvements/crc32-arm-copy (b4043c6f)