
@qingy1337
Created July 26, 2025 02:19
Server setup
# Install uv and create a Python 3.10 virtual environment
curl -LsSf https://astral.sh/uv/install.sh | sh && source $HOME/.local/bin/env
mkdir blackwell && cd blackwell
uv venv .venv --python=3.10 --seed
source .venv/bin/activate
# Python dependencies: vLLM nightly (CUDA 12.8 wheels), Unsloth, and a prebuilt FlashAttention wheel
uv pip install -U vllm --torch-backend=cu128 --extra-index-url https://wheels.vllm.ai/nightly
uv pip install unsloth unsloth_zoo bitsandbytes
uv pip install -U "triton>=3.3.1" transformers
uv pip install huggingface_hub datasets deepspeed
uv pip install https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/download/v0.3.13/flash_attn-2.8.1+cu128torch2.7-cp310-cp310-linux_x86_64.whl
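The FlashAttention wheel above is prebuilt for a specific combination of CPython, CUDA, and torch, which is why the venv is created with `--python=3.10`. A small sketch (pure stdlib; the filename string is copied from the command above) of how those constraints are encoded in the wheel name:

```python
# Parse the compatibility tags out of the prebuilt wheel's filename.
wheel = "flash_attn-2.8.1+cu128torch2.7-cp310-cp310-linux_x86_64.whl"
name, version, python_tag, abi_tag, platform_tag = wheel[:-4].split("-")
print(python_tag)    # cp310 -> needs CPython 3.10 (hence uv venv --python=3.10)
print(platform_tag)  # linux_x86_64
print("cu128" in version, "torch2.7" in version)  # True True: CUDA 12.8 + torch 2.7
```

If any of those tags disagree with the environment (a different Python minor version, a different torch build), pip refuses the wheel or the import fails at runtime, so check them before swapping in a different release.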
# Install ngrok from its apt repo and register the auth token
curl -sSL https://ngrok-agent.s3.amazonaws.com/ngrok.asc \
| sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null \
&& echo "deb https://ngrok-agent.s3.amazonaws.com bookworm main" \
| sudo tee /etc/apt/sources.list.d/ngrok.list \
&& sudo apt update \
&& sudo apt install ngrok
ngrok config add-authtoken $NGROK_TOKEN
# Build dependencies for llama.cpp
sudo apt update
sudo apt install -y build-essential cmake git curl libcurl4-openssl-dev libomp-dev ccache
cd ~/
# Build llama.cpp with CUDA support
git clone https://github.com/ggml-org/llama.cpp.git
cd llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j 32
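The build above hard-codes `-j 32`; on a smaller instance that can oversubscribe or exhaust memory. A portable alternative (a sketch using coreutils `nproc`, not part of the original setup) derives the job count from the machine:

```shell
# Pick the parallel job count from the machine instead of hard-coding 32.
JOBS="$(nproc)"
echo "$JOBS"
# then: cmake --build build --config Release -j "$JOBS"
```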
cd build/bin
sudo chmod +x ./llama-cli
sudo chmod +x ./llama-server
# Fetch the llama-swap binary stored in the qingy2024/Extrapolis-v1-4B-SFT repo
huggingface-cli download qingy2024/Extrapolis-v1-4B-SFT llama-swap --local-dir .
sudo chmod +x ./llama-swap
cd ~/
# Put the llama.cpp binaries on PATH and auto-activate the venv in new shells
echo 'export PATH="$PATH:$HOME/llama.cpp/build/bin"' >> ~/blackwell/.venv/bin/activate
echo 'source /home/ubuntu/blackwell/.venv/bin/activate' >> ~/.bashrc
source ~/.bashrc
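Note that the bare `echo ... >> activate` append above adds a duplicate line every time the setup is re-run. A guarded append (a sketch using `grep -qxF`; it writes to a temp file here so it can be tried anywhere, but the same guard works on the real activate script) is idempotent:

```shell
# Idempotent append: only add the PATH line if it is not already present.
ACTIVATE="$(mktemp)"   # stand-in for ~/blackwell/.venv/bin/activate
LINE='export PATH="$PATH:$HOME/llama.cpp/build/bin"'
grep -qxF "$LINE" "$ACTIVATE" || echo "$LINE" >> "$ACTIVATE"
grep -qxF "$LINE" "$ACTIVATE" || echo "$LINE" >> "$ACTIVATE"  # re-running adds nothing
grep -cxF "$LINE" "$ACTIVATE"  # prints 1: the line appears exactly once
```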