@AmgadHasan
Created August 29, 2025 12:09
This shell script starts a Docker container that runs the llama.cpp server with a web UI on the local machine, using a CUDA GPU.
docker run -p 8000:8000 --gpus all ghcr.io/ggml-org/llama.cpp:server-cuda -hf ggml-org/gemma-3-4b-it-GGUF --port 8000 --host 0.0.0.0 -n 512 --n-gpu-layers 99
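The port, model, token limit, and GPU layer count are hard-coded in the command above. A minimal sketch of a parameterized variant follows, assuming you want to override them through environment variables. The function name `llama_server_cmd` and the variables `PORT`, `MODEL`, `MAX_TOKENS`, and `GPU_LAYERS` are hypothetical names introduced for illustration; the image tag and flags are copied unchanged from the command. The function only prints the command, so nothing is started until you choose to run it.

```shell
#!/bin/sh
# Sketch only: llama_server_cmd and the PORT / MODEL / MAX_TOKENS / GPU_LAYERS
# variables are hypothetical names; the docker flags come from the gist command.

# The body runs in a subshell ( ... ) so the helper variables do not leak.
llama_server_cmd() (
  port="${PORT:-8000}"                          # host and container port
  model="${MODEL:-ggml-org/gemma-3-4b-it-GGUF}" # Hugging Face repo passed to -hf
  max_tokens="${MAX_TOKENS:-512}"               # value passed to -n
  gpu_layers="${GPU_LAYERS:-99}"                # value passed to --n-gpu-layers

  # Print the full command on one line so it can be inspected before running.
  echo docker run -p "${port}:${port}" --gpus all \
    ghcr.io/ggml-org/llama.cpp:server-cuda \
    -hf "${model}" --port "${port}" --host 0.0.0.0 \
    -n "${max_tokens}" --n-gpu-layers "${gpu_layers}"
)

# Show the command with the default settings (this does not start a container).
llama_server_cmd
```

With the defaults, the printed line is the same command as above. To start the server on another port, for example, something like `PORT=9000 llama_server_cmd` prints the adjusted command, and `eval "$(PORT=9000 llama_server_cmd)"` would run it; the web UI should then be reachable in a browser at `http://localhost:9000`.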