Nathan Moinvaziri nmoinvaz

  • Phoenix, United States
@nmoinvaz
nmoinvaz / quick-bench-count-matches.cc
Last active January 26, 2026 04:06
Benchmark count matching bytes
#include <benchmark/benchmark.h>
#include <cstdint>
static inline uint32_t count_matching_bytes_ctzll(uint64_t mask) {
return __builtin_ctzll(mask);
}
static inline uint32_t count_matching_bytes_ctz32(uint64_t mask) {
uint32_t lo = (uint32_t)mask;
if (lo)
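The preview is cut off after `if (lo)`. As a minimal sketch only, and not the gist's actual code, a 64-bit trailing-zero count can be built from two 32-bit counts as shown here. The name `ctz64_via_ctz32` is hypothetical, the GCC/Clang builtin `__builtin_ctz` is assumed to be available, and the mask is assumed to be non-zero.

```cpp
#include <cstdint>

// Hypothetical helper (not from the gist): count trailing zero bits of a
// non-zero 64-bit mask using only the 32-bit builtin.
// Precondition: mask != 0, because __builtin_ctz(0) is undefined.
static inline uint32_t ctz64_via_ctz32(uint64_t mask) {
    uint32_t lo = static_cast<uint32_t>(mask);
    if (lo)
        return static_cast<uint32_t>(__builtin_ctz(lo));
    // The low half is all zeros, so the first set bit is in the high half.
    uint32_t hi = static_cast<uint32_t>(mask >> 32);
    return 32u + static_cast<uint32_t>(__builtin_ctz(hi));
}
```

A split like this is mostly of interest on targets where a 64-bit count is not a single instruction; whether it is faster is the question the benchmark itself would answer.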
nmoinvaz / benchmark_crc32_tail_copy.cc
Last active January 26, 2026 03:26
Benchmark zlib crc32 tail copy
/* benchmark_crc32_tail_copy.cc -- benchmark different copy strategies for CRC32 tail handling
* Copyright (C) 2022 Nathan Moinvaziri
* For conditions of distribution and use, see copyright notice in zlib.h
*/
#include <benchmark/benchmark.h>
#include <cstring>
#include <cstdint>
extern "C" {
nmoinvaz / benchmark_tailcopy.cc
Last active January 26, 2026 02:14
Benchmark memory tail copying
/* benchmark_tailcopy.cc -- benchmark different copy strategies for tail handling
* Copyright (C) 2022 Nathan Moinvaziri
* For conditions of distribution and use, see copyright notice in zlib.h
*/
#include <benchmark/benchmark.h>
#include <cstring>
#include <cstdint>
extern "C" {
nmoinvaz / zlib-ng-2106-pr.md
Created January 17, 2026 01:50
Zlib-ng PR #2106 Benchmarks
OS: Darwin 24.6.0 Darwin Kernel Version 24.6.0: Wed Nov  5 21:28:03 PST 2025; root:xnu-11417.140.69.705.2~1/RELEASE_ARM64_T8122 arm64
CPU: arm
Timing: Python perf_counter
Levels: 0-9
Runs: 70
Trim worst: 40

Test 1

Develop

nmoinvaz / launch.json
Last active January 8, 2026 23:16
zlib-ng launch.json
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "fuzzer_example_small",
"type": "lldb",
"request": "launch",
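// Polynomial for CRC-32 (IEEE 802.3): 0x04C11DB7 (reflected form: 0xEDB88320)
// Note: 0x1EDC6F41 and its reflected form 0x82F63B78 belong to CRC-32C (Castagnoli), not IEEE 802.3
// zlib-ng's crc32 is the IEEE 802.3 CRC, so that is the polynomial to use for compatibility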
Z_INTERNAL Z_TARGET_CRC uint32_t crc32_armv8_pmull(uint32_t crc, const uint8_t *buf, size_t len) {
uint32_t c = ~crc;
// Constants for PMULL folding (from Intel whitepaper)
const uint64x2_t k1 = {0x0154442bd4ULL, 0x00000001ULL};
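When checking a folded implementation like the truncated one above, a slow bitwise reference is a convenient oracle. The following is a minimal sketch, assuming the standard reflected CRC-32 (IEEE 802.3) polynomial 0xEDB88320 that zlib's `crc32` computes. The function name `crc32_bitwise` is hypothetical and is not part of zlib-ng.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical reference implementation (not zlib-ng code): bit-at-a-time
// CRC-32 with the reflected IEEE 802.3 polynomial 0xEDB88320.
// Uses the same pre/post inversion convention as zlib's crc32().
static uint32_t crc32_bitwise(uint32_t crc, const uint8_t *buf, size_t len) {
    uint32_t c = ~crc;
    for (size_t i = 0; i < len; ++i) {
        c ^= buf[i];
        for (int k = 0; k < 8; ++k) {
            // Shift right one bit; XOR in the polynomial if the bit shifted out was 1.
            c = (c >> 1) ^ (0xEDB88320u & (0u - (c & 1u)));
        }
    }
    return ~c;
}
```

The standard check value for this CRC is 0xCBF43926 for the ASCII input "123456789", which makes a quick sanity test before comparing against an accelerated variant on random buffers.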
nmoinvaz / crc32_armv8_pmull_single_lane.c
Created December 9, 2025 20:34
crc32_armv8_pmull_single_lane
Z_INTERNAL Z_TARGET_PMULL uint32_t crc32_armv8_pmull_single_lane(uint32_t crc, const uint8_t *buf, size_t len) {
uint32_t crc0 = ~crc;
/* 1. Alignment (Scalar) */
for (; len && ((uintptr_t)buf & 7); --len) {
crc0 = __crc32b(crc0, *buf++);
}
/* 2. Alignment to 16-byte boundary (8-byte scalar CRC) */
if (((uintptr_t)buf & 8) && len >= 8) {
nmoinvaz / arm_cpu_info.c
Created December 8, 2025 01:55
ARM fast pmull detection
/* arm_cpu_id.c -- ARM CPU identification for microarchitecture detection
* Copyright (C) 2025 Nathan Moinvaziri
* For conditions of distribution and use, see copyright notice in zlib.h
*/
#include "zbuild.h"
#include "arm_cpu_id.h"
#if defined(__linux__)
# include <stdio.h>
nmoinvaz / dougallj-benchmarks.md
Last active December 1, 2025 22:20
Benchmarks for zlib-ng issue #1998
nmoinvaz / insert_string_distributions.md
Last active August 23, 2025 00:11
insert_string distributions

silesia.tar level 6

Total calls: 16533400
| Count | Frequency | Percentage |
|-------|-----------|------------|
| 3 | 4314962 | 26.10% |
| 4 | 2598912 | 15.72% |
| 5 | 2051025 | 12.41% |
| 6 | 1613384 | 9.76% |
| 7 | 1142336 | 6.91% |
| 8 | 571554 | 3.46% |