Llama-Cpp-Python

Installing llama-cpp-python for CPU

pip install llama-cpp-python
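
Once installed, a minimal sketch of loading a GGUF model and generating a completion (the model path is a placeholder; point it at any local GGUF file):

from llama_cpp import Llama

# Load a local GGUF model; model_path is a placeholder.
llm = Llama(model_path="models/7B/llama-model.gguf")

# Generate a short completion; the result is an OpenAI-style dict.
output = llm("Q: Name the planets in the solar system. A: ", max_tokens=64)
print(output["choices"][0]["text"])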

Installing llama-cpp-python with cuBLAS for GPU (older releases; newer releases use the GGML_CUDA flag shown below)

CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python

Installing llama-cpp-python with CUDA (GGML backend)

CMAKE_ARGS="-DGGML_CUDA=on" FORCE_CMAKE=1 pip install --upgrade --force-reinstall llama-cpp-python --no-cache-dir

Installing the llama-cpp-python server (GGML)

CMAKE_ARGS="-DGGML_CUDA=on" FORCE_CMAKE=1 pip install 'llama-cpp-python[server]'
python3 -m llama_cpp.server --model models/7B/llama-model.gguf --n_gpu_layers 35
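
The server exposes an OpenAI-compatible HTTP API, on port 8000 by default. A minimal sketch of querying it, assuming the server started above is running locally:

import requests

# POST to the OpenAI-compatible completions endpoint (default port 8000).
resp = requests.post(
    "http://localhost:8000/v1/completions",
    json={"prompt": "Q: What is the capital of France? A: ", "max_tokens": 32},
)
print(resp.json()["choices"][0]["text"])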
Installing llama-cpp-python from source

git clone --recurse-submodules https://github.com/abetlen/llama-cpp-python.git
cd llama-cpp-python

# Upgrade pip (required for editable mode)
pip install --upgrade pip

# Install with pip
pip install -e .

# if you want to use the fastapi / openapi server
pip install -e '.[server]'

# to install all optional dependencies
pip install -e '.[all]'

# to clear the local build cache
make clean
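
After an editable install, a quick smoke test that the bindings import and report their version:

import llama_cpp

# Verify the editable install is importable and print its version.
print(llama_cpp.__version__)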

CTransformers

Installing ctransformers with GPU (CUDA) support

pip install 'ctransformers[cuda]'
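
A minimal sketch of GPU inference with ctransformers, reusing the placeholder model path from above (gpu_layers requires the CUDA build):

from ctransformers import AutoModelForCausalLM

# Load a local GGUF model; the path is a placeholder.
# gpu_layers offloads that many layers to the GPU (needs the [cuda] extra).
llm = AutoModelForCausalLM.from_pretrained(
    "models/7B/llama-model.gguf",
    model_type="llama",
    gpu_layers=35,
)

# Calling the model returns the generated text as a plain string.
print(llm("AI is going to"))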

Installing ctransformers with GPTQ support (experimental)

pip install 'ctransformers[gptq]'
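
Loading a GPTQ model follows the same pattern; a sketch assuming a local GPTQ model directory (the path is a placeholder, and GPTQ support is experimental):

from ctransformers import AutoModelForCausalLM

# Experimental: load a GPTQ-quantized model; the path is a placeholder.
llm = AutoModelForCausalLM.from_pretrained("models/7B/llama-gptq")
print(llm("AI is going to"))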