@initcron
Created November 18, 2025 07:58
Dockerfile for vLLM with CPU-only serving
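FROM openeuler/vllm-cpu:0.9.1-oe2403lts

# Patch cpu_worker.py so the per-NUMA-node CPU count falls back to the total
# CPU count instead of dividing by zero when no NUMA nodes are detected.
RUN sed -i 's/cpu_count_per_numa = cpu_count \/\/ numa_size/cpu_count_per_numa = cpu_count \/\/ numa_size if numa_size > 0 else cpu_count/g' \
    /workspace/vllm/vllm/worker/cpu_worker.py

# Target the CPU backend, reserve 1 GiB for the KV cache, and cap the
# OpenMP, OpenBLAS, and MKL thread pools.
ENV VLLM_TARGET_DEVICE=cpu \
    VLLM_CPU_KVCACHE_SPACE=1 \
    OMP_NUM_THREADS=2 \
    OPENBLAS_NUM_THREADS=1 \
    MKL_NUM_THREADS=1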
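
The expression that the sed patch writes into cpu_worker.py can be checked in isolation. The following is a minimal sketch, not the vLLM source: the function name `cpu_count_per_numa_node` is hypothetical and only wraps the patched expression, on the assumption that `cpu_count` and `numa_size` are plain non-negative integers.

```python
def cpu_count_per_numa_node(cpu_count: int, numa_size: int) -> int:
    """Hypothetical wrapper around the expression the sed patch produces.

    The conditional expression binds more loosely than `//`, so this reads as
    `(cpu_count // numa_size) if numa_size > 0 else cpu_count`.
    """
    return cpu_count // numa_size if numa_size > 0 else cpu_count


def unpatched(cpu_count: int, numa_size: int) -> int:
    """The original expression, which fails when numa_size is 0."""
    return cpu_count // numa_size


if __name__ == "__main__":
    # Two NUMA nodes: the CPUs are split evenly between them.
    print(cpu_count_per_numa_node(8, 2))
    # Zero NUMA nodes reported: fall back to the full CPU count.
    print(cpu_count_per_numa_node(8, 0))
    # The unpatched expression raises on the same input.
    try:
        unpatched(8, 0)
    except ZeroDivisionError as exc:
        print(f"unpatched: {exc!r}")
```

Because the guard only changes the `numa_size == 0` case, hosts that do report NUMA nodes should see the same value as before the patch.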