Last active
March 10, 2026 04:54
# use brew on the mac; this installs an ollama build compiled with the Metal library
# Metal is Apple's GPU framework, so inference runs on the native GPU of the M3 processor
brew install ollama

# check that Metal is in use
# if Metal is detected correctly, you will see a line like this in the startup log:
# "inference compute" ... library=Metal ...
OLLAMA_DEBUG=1 ollama serve
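# a minimal sketch of scripting that check: pipe the server log through grep, e.g.
#   OLLAMA_DEBUG=1 ollama serve 2>&1 | grep -i 'library=Metal'
# the same pattern run against a sample of the log line quoted above
# (the sample is illustrative; the real log line carries more fields):
echo 'msg="inference compute" ... library=Metal ...' | grep -q 'library=Metal' && echo "Metal backend detected"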
# next step: download the model
# stop ollama serve and run it again with the following option.
# the environment variable GODEBUG=x509negativeserial=1 tells Go's x509 library to accept
# certificates with negative serial numbers (these can occur when a proxy on the way
# re-signs traffic, and would otherwise make the model download fail)
# you can also add OLLAMA_DEBUG=1 if you want to see lots of debugging output:
#   OLLAMA_DEBUG=1 GODEBUG=x509negativeserial=1 ollama serve
GODEBUG=x509negativeserial=1 ollama serve
# in a separate shell window, start pulling the glm-4.7-flash model - this one takes quite a lot of time.
# here the ollama client talks to the locally running ollama server, which in turn
# downloads the model from the ollama registry
ollama pull glm-4.7-flash
# after the download: run the model now (the client talks to the instance started by ollama serve)
# this one asks for a prompt in the terminal, runs reasonably fast on the GPU of an M3 processor!
ollama run glm-4.7-flash
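# instead of the interactive prompt, you can also talk to the server over ollama's
# HTTP API (it listens on port 11434 by default); a sketch of the request body,
# with the model name from the pull above and "stream": false to get one JSON reply:
BODY='{"model": "glm-4.7-flash", "prompt": "why is the sky blue?", "stream": false}'
# then send it with curl (the ollama serve instance from above must still be running):
#   curl http://localhost:11434/api/generate -d "$BODY"
echo "$BODY"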