@miminashi
Created August 9, 2025 21:17
ubuntu@mi25:~/llama.cpp (master) $ ./build/bin/llama-bench -p 0 -n 128,256,512 -m ~/.cache/llama.cpp/unsloth_gpt-oss-20b-GGUF_gpt-oss-20b-F16.gguf -m ~/.cache/llama.cpp/unsloth_Qwen3-30B-A3B-Instruct-2507-GGUF_Qwen3-30B-A3B-Instruct-2507-UD-Q8_K_XL.gguf -m ~/.cache/llama.cpp/unsloth_gemma-3-27b-it-GGUF_gemma-3-27b-it-UD-Q8_K_XL.gguf
ggml_cuda_init: GGML_CUDA_FORCE_MMQ: no
ggml_cuda_init: GGML_CUDA_FORCE_CUBLAS: no
ggml_cuda_init: found 4 ROCm devices:
Device 0: Radeon Instinct MI25, gfx900:xnack- (0x900), VMM: no, Wave Size: 64
Device 1: Radeon Instinct MI25, gfx900:xnack- (0x900), VMM: no, Wave Size: 64
Device 2: Radeon Instinct MI25, gfx900:xnack- (0x900), VMM: no, Wave Size: 64
Device 3: Radeon Instinct MI25, gfx900:xnack- (0x900), VMM: no, Wave Size: 64
| model | size | params | backend | ngl | test | t/s |
| ------------------------------ | ---------: | ---------: | ---------- | --: | --------------: | -------------------: |
| gpt-oss ?B F16 | 12.83 GiB | 20.91 B | ROCm | 99 | tg128 | 26.50 ± 0.03 |
| gpt-oss ?B F16 | 12.83 GiB | 20.91 B | ROCm | 99 | tg256 | 26.12 ± 0.03 |
| gpt-oss ?B F16 | 12.83 GiB | 20.91 B | ROCm | 99 | tg512 | 25.42 ± 0.03 |
| qwen3moe 30B.A3B Q8_0 | 33.51 GiB | 30.53 B | ROCm | 99 | tg128 | 28.81 ± 0.16 |
| qwen3moe 30B.A3B Q8_0 | 33.51 GiB | 30.53 B | ROCm | 99 | tg256 | 28.55 ± 0.05 |
| qwen3moe 30B.A3B Q8_0 | 33.51 GiB | 30.53 B | ROCm | 99 | tg512 | 27.67 ± 0.05 |
| gemma3 27B Q8_0 | 29.62 GiB | 27.01 B | ROCm | 99 | tg128 | 8.43 ± 0.01 |
| gemma3 27B Q8_0 | 29.62 GiB | 27.01 B | ROCm | 99 | tg256 | 8.40 ± 0.00 |
| gemma3 27B Q8_0 | 29.62 GiB | 27.01 B | ROCm | 99 | tg512 | 8.30 ± 0.01 |
build: 99acbc99 (6112)