@yiliu30
Created November 5, 2025 07:16
#!/bin/bash
# Propagate lm_eval's exit status through the `| tee` pipeline at the end.
set -o pipefail
# Check if a model name is passed as an argument, otherwise use the default model path
if [ -z "$1" ]; then
model_path="Meta-Llama-3-8B-Instruct-W4A16-G128-AutoRound"
else
model_path="$1"
fi
tp_size=1
model_name=$(basename "${model_path}")
output_dir="${model_name}-tp${tp_size}-gsm8k-acc"
task_name="gsm8k"
echo "Evaluating model: ${model_path} on task: ${task_name}, output dir: ${output_dir}"
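# Create the output directory first so that tee can write log.txt into it.
mkdir -p "${output_dir}"
# Alternative attention backend (replace the FLASHINFER line below to use it):
# VLLM_ATTENTION_BACKEND=TRITON_ATTN \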
VLLM_USE_DEEP_GEMM=0 \
VLLM_ATTENTION_BACKEND=FLASHINFER \
VLLM_LOGGING_LEVEL=DEBUG \
VLLM_ENABLE_V1_MULTIPROCESSING=1 \
lm_eval --model vllm \
--model_args "pretrained=${model_path},tensor_parallel_size=${tp_size},max_model_len=8192,max_num_batched_tokens=32768,max_num_seqs=128,add_bos_token=True,gpu_memory_utilization=0.8,dtype=bfloat16,max_gen_toks=2048,enable_prefix_caching=False" \
--tasks "${task_name}" \
--batch_size 128 \
--log_samples \
--limit 1000 \
--seed 42 \
--output_path "${output_dir}" \
--show_config 2>&1 | tee "${output_dir}/log.txt"
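
The default-argument check and the output directory naming above could also be factored into a small function. This is only a sketch: `derive_output_dir` is a hypothetical helper that is not part of the original script, and unlike the script it builds the directory name from the task argument rather than a hardcoded `gsm8k`.

```shell
#!/bin/bash
# Hypothetical helper (not in the original script): build the output
# directory name from a model path, a tensor-parallel size, and a task name.
# Each argument falls back to the script's default when omitted.
derive_output_dir() {
    local model_path="${1:-Meta-Llama-3-8B-Instruct-W4A16-G128-AutoRound}"
    local tp_size="${2:-1}"
    local task_name="${3:-gsm8k}"
    local model_name
    # Quoting keeps paths that contain spaces intact; basename also drops
    # any trailing slash.
    model_name=$(basename "${model_path}")
    printf '%s-tp%s-%s-acc\n' "${model_name}" "${tp_size}" "${task_name}"
}

# Example: output_dir=$(derive_output_dir "$1" 1 gsm8k)
```

Using `${1:-default}` here is equivalent to the `if [ -z "$1" ]` check in the script: both fall back to the default when the argument is unset or empty.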