Created
June 1, 2025 12:11
llama.cpp benchmarks, AMD Ryzen 9 9900X, Intel Arc A380
| llama.cpp b5466 | |
| intel-compute-runtime 25.18.33578.6 | |
| intel-media-driver 25.2.3 | |
| intel-oneapi-basekit 2025.0.1.46 | |
| vulkan-intel 1:25.1.1 | |
| openblas 0.3.29 | |
| gcc 15.1.1+r7+gf36ec88aa85a | |
| linux 6.14.9.arch1-1 | |
| model Unsloth Phi-4-Mini-Reasoning Q5_K_M | |
| common options --ctx-size 4096 --flash-attn --mlock --jinja | |
| first prompt How to solve 3*x^2+4*x+5=1? | |
| second prompt solve {\left( {z - 2} \right)^2} - 36 = 0 | |
| cpu only (--cache-type-k q8_0 --cache-type-v q8_0) | |
| prompt eval time = 231.52 ms / 34 tokens ( 6.81 ms per token, 146.85 tokens per second) | |
| eval time = 91585.48 ms / 1874 tokens ( 48.87 ms per token, 20.46 tokens per second) | |
| total time = 91817.00 ms / 1908 tokens | |
| prompt eval time = 3476.85 ms / 455 tokens ( 7.64 ms per token, 130.87 tokens per second) | |
| eval time = 61639.12 ms / 1256 tokens ( 49.08 ms per token, 20.38 tokens per second) | |
| total time = 65115.96 ms / 1711 tokens | |
| vulkan 10 layers | |
| prompt eval time = 1490.77 ms / 34 tokens ( 43.85 ms per token, 22.81 tokens per second) | |
| eval time = 81832.84 ms / 1134 tokens ( 72.16 ms per token, 13.86 tokens per second) | |
| total time = 83323.61 ms / 1168 tokens | |
| vulkan 20 layers | |
| prompt eval time = 1410.32 ms / 34 tokens ( 41.48 ms per token, 24.11 tokens per second) | |
| eval time = 136721.74 ms / 1454 tokens ( 94.03 ms per token, 10.63 tokens per second) | |
| total time = 138132.06 ms / 1488 tokens | |
| cpu openblas | |
| prompt eval time = 1146.41 ms / 34 tokens ( 33.72 ms per token, 29.66 tokens per second) | |
| eval time = 83908.54 ms / 1785 tokens ( 47.01 ms per token, 21.27 tokens per second) | |
| total time = 85054.95 ms / 1819 tokens | |
| prompt eval time = 2545.14 ms / 372 tokens ( 6.84 ms per token, 146.16 tokens per second) | |
| eval time = 81429.75 ms / 1599 tokens ( 50.93 ms per token, 19.64 tokens per second) | |
| total time = 83974.90 ms / 1971 tokens | |
| sycl 10 layers | |
| prompt eval time = 1934.28 ms / 34 tokens ( 56.89 ms per token, 17.58 tokens per second) | |
| eval time = 116342.05 ms / 1731 tokens ( 67.21 ms per token, 14.88 tokens per second) | |
| total time = 118276.33 ms / 1765 tokens | |
| prompt eval time = 65.88 ms / 1 tokens ( 65.88 ms per token, 15.18 tokens per second) | |
| eval time = 96242.63 ms / 1359 tokens ( 70.82 ms per token, 14.12 tokens per second) | |
| total time = 96308.51 ms / 1360 tokens |
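
The per-token and per-second figures in each timing line follow directly from the elapsed milliseconds and the token count, so runs can be compared or re-checked mechanically. The snippet below is a minimal sketch of that arithmetic, not part of llama.cpp: `parse_timing` and `TIMING_RE` are hypothetical names, and the regular expression assumes the line layout shown in the table above. Recomputed values can differ from the printed ones in the last digit, because the log rounds the milliseconds before printing.

```python
import re

# Assumed layout, copied from the timing lines above, e.g.
# "eval time = 91585.48 ms / 1874 tokens ( 48.87 ms per token, 20.46 tokens per second)"
TIMING_RE = re.compile(
    r"(?P<label>prompt eval time|eval time|total time)\s*=\s*"
    r"(?P<ms>[\d.]+)\s*ms\s*/\s*(?P<tokens>\d+)\s*tokens"
)


def parse_timing(line):
    """Return label, ms, tokens and derived rates for one timing line, or None."""
    m = TIMING_RE.search(line)
    if m is None:
        return None  # section labels such as "vulkan 10 layers" carry no timing
    ms = float(m.group("ms"))
    tokens = int(m.group("tokens"))
    return {
        "label": m.group("label"),
        "ms": ms,
        "tokens": tokens,
        "ms_per_token": ms / tokens,
        "tokens_per_second": tokens / (ms / 1000.0),
    }


if __name__ == "__main__":
    # First-prompt generation lines for the CPU-only and Vulkan 10-layer runs.
    cpu = parse_timing("eval time = 91585.48 ms / 1874 tokens")
    vulkan10 = parse_timing("eval time = 81832.84 ms / 1134 tokens")
    ratio = cpu["tokens_per_second"] / vulkan10["tokens_per_second"]
    print(f"cpu only:  {cpu['tokens_per_second']:.2f} tokens/s")
    print(f"vulkan 10: {vulkan10['tokens_per_second']:.2f} tokens/s")
    print(f"cpu only / vulkan 10 generation speed ratio: {ratio:.2f}")
```

Comparing the `eval time` lines rather than `total time` keeps prompt processing out of the generation-speed figure, which matters here because the prompt token counts differ between runs.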