This is my previous Local LLM Stack environment, which I wanted to share. It's built for machines running Windows with an NVIDIA GPU and uses Docker Compose for containerization.
The stack provides a fully portable, GPU-accelerated local AI environment using:
- Ollama: The runtime for pulling, serving, and managing local large language models (LLMs) using your NVIDIA GPU.
- Open WebUI: A feature-rich, self-hosted web interface to interact with the models served by Ollama.
- Caddy: A powerful reverse proxy that manages HTTPS for the entire stack.
- Watchtower: Configured for automatic updates of all services.
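To make the moving parts concrete, here is a minimal sketch of how these four services might be wired together in a docker-compose.yml. The image tags, port mapping, and environment values below are illustrative assumptions, not the exact original file; the data folders and GPU reservation are sketched further down.

```yaml
# Illustrative topology only -- images, ports, and environment values
# are assumptions, not the original configuration.
services:
  ollama:
    image: ollama/ollama:latest              # serves models on its default port 11434 inside the compose network

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434  # point the UI at the Ollama service
    depends_on:
      - ollama

  caddy:
    image: caddy:latest
    ports:
      - "3000:3000"                          # HTTPS endpoint reachable at https://localhost:3000
    depends_on:
      - open-webui

  watchtower:
    image: containrrr/watchtower:latest
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock  # required for Watchtower to update the other containers
```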
- Drop & Go: Simply place the docker-compose.yml file and your Caddyfile into an empty directory.
- Start: Run the following command in that directory:

  docker compose up -d

  Docker Compose will automatically create the necessary data folders (e.g., ollama-data, openwebui-data, etc.) on your host machine. The environment will start and be accessible at https://localhost:3000.
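The automatic folder creation works because the services use relative bind mounts: Docker creates missing host directories for bind mounts on the first `docker compose up`. A hedged sketch of what those volume entries might look like follows; the container-side paths are the upstream defaults and are assumed here, not taken from the original file.

```yaml
# Fragment: volume mappings to merge into the service definitions above.
# Relative host paths such as ./ollama-data are created on first start.
services:
  ollama:
    volumes:
      - ./ollama-data:/root/.ollama          # downloaded models and manifests
  open-webui:
    volumes:
      - ./openwebui-data:/app/backend/data   # chats, users, settings
  caddy:
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile:ro  # your reverse-proxy configuration
      - ./caddy-data:/data                   # certificates and Caddy state
```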
- GPU Acceleration: Configured to automatically utilize your NVIDIA GPU for all model inference (see the sketch after this list).
- Portability: Uses local bind mounts (e.g., ./ollama-data) for all data, making the configuration independent of the Docker project name and easily transferable between machines.
- Automatic HTTPS: Caddy is set up to provide a basic HTTPS endpoint. You will need to modify the included Caddyfile to configure your desired hostname or domain and manage the certificate trust.
- Auto-Updates: Watchtower is enabled on all core services to keep them up-to-date automatically.
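For the GPU piece specifically, Docker Compose can hand the NVIDIA GPU to the Ollama container through a device reservation. The snippet below is a sketch of that mechanism, assuming GPU support is enabled in Docker (on Windows, Docker Desktop with the WSL2 backend and current NVIDIA drivers); it is not necessarily the exact block from the original file.

```yaml
# Assumed GPU reservation for the Ollama service -- requires NVIDIA
# drivers plus GPU support in Docker Desktop / WSL2 on Windows.
services:
  ollama:
    image: ollama/ollama:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all              # expose every available GPU to the container
              capabilities: [gpu]
```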
Disclaimer: This is an old personal stack shared as-is. Additional hardening, security, and network configuration are required for production or public use. I accept no responsibility whatsoever for its use, misuse, or any consequences resulting from running this configuration.