Created July 24, 2025 11:07
Gist: AaronBeier/bb803d399b42177cd59bf1c40782fa8c
Comparing llama.cpp vs llama.cpp + AMD's BLIS fork
llama.cpp b5970
blis 837d3974d43eaa84bb8758e4b80385b4150306b2
gcc 15.1.1+r7+gf36ec88aa85a
linux 6.15.7.arch1-1

llama-bench --model Qwen3-Embedding-8B-Q5_K_M.gguf --embeddings 1 --prio 2 --threads 12

Default build:

| model                          |       size |     params | backend    | threads |       embd |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | ---------: | --------------: | -------------------: |
| qwen3 8B Q5_K - Medium         |   5.04 GiB |     7.57 B | CPU        |      12 |          1 |           pp512 |         68.82 ± 0.22 |
| qwen3 8B Q5_K - Medium         |   5.04 GiB |     7.57 B | CPU        |      12 |          1 |           tg128 |         12.86 ± 0.00 |

AMD's BLIS fork:

| model                          |       size |     params | backend    | threads |       embd |            test |                  t/s |
| ------------------------------ | ---------: | ---------: | ---------- | ------: | ---------: | --------------: | -------------------: |
| qwen3 8B Q5_K - Medium         |   5.04 GiB |     7.57 B | BLAS       |      12 |          1 |           pp512 |         87.01 ± 0.38 |
| qwen3 8B Q5_K - Medium         |   5.04 GiB |     7.57 B | BLAS       |      12 |          1 |           tg128 |         12.87 ± 0.00 |
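
To put the two tables side by side, the relative change can be worked out from the reported mean t/s values. The snippet below is only a small sketch of that arithmetic: the dictionary and function names are made up for illustration, the numbers are copied from the tables above, and the ± spreads are ignored.

```python
# Mean throughput (tokens/s) copied from the two llama-bench tables above.
# The names default_build / blis_build / speedup are illustrative only.
default_build = {"pp512": 68.82, "tg128": 12.86}
blis_build = {"pp512": 87.01, "tg128": 12.87}


def speedup(baseline, candidate):
    """Return the candidate/baseline throughput ratio for each test."""
    return {test: candidate[test] / baseline[test] for test in baseline}


ratios = speedup(default_build, blis_build)
for test, ratio in ratios.items():
    # Print the ratio and the same value as a signed percentage change.
    print(f"{test}: {ratio:.3f}x ({(ratio - 1) * 100:+.1f}%)")
```

With these figures, prompt processing (pp512) comes out at roughly 1.26x (about +26%) with the BLIS build, while token generation (tg128) is essentially unchanged at about 1.00x.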