The biggest, baddest GLM-4.6 running locally on Metal GPU
This guide covers everything from zero to serving GLM-4.6 through an OpenAI-compatible API on your Mac Studio, plus uploading your merged model to Hugging Face.
| Familiarize yourself with this repository. | |
| We are creating a new album. You are acting as Rick Rubin. | |
| My two favorite albums and the inspirations for this album we are creating are: | |
| - Black Country, New Road - Ants from Up There | |
| - Songs: Ohia - The Magnolia Electric Co. | |
| The concept is knowing people deeply, the people who have been off and on throughout our lives and/or are currently close to us. The lyrics are poetic and cryptic. |
| codestral:22b | |
| 0.20 seconds | |
| qwen3-coder:30b | |
| 0.21 seconds | |
| gemma2:27b | |
| 0.29 seconds | |
| command-r-plus:104b |
| Note: Click the "Raw" button for a wider view. | |
| Sorted by Tok/sec (descending). | |
| 🚀 Ollama LLM Benchmark Results | |
| ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━━━━━━┳━━━━━━┓ | |
| ┃ Model ┃ TTFT (s) ┃ TPOT (s) ┃ Tok/sec ┃ Params (B) ┃ Size (GB) ┃ VRAM (GB) ┃ Runs ┃ | |
| ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━━━━━━╇━━━━━━┩ | |
| │ tinyllama:1.1b │ 0.052 ±0.008 │ 0.0038 ±0.0001 │ 266.2 │ 1.1 │ 0.59 │ 0.71 │ 3 │ | |
| ├──────────────────────────────────────────────────┼───────────────┼────────────────┼─────────┼────────────┼───────────┼───────────┼──────┤ |