@link89
Created February 8, 2025 02:07
A quick test for troubleshooting CUDA environment issues
#!/bin/bash
set -e
# Hello world test
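cat <<'EOF' > hello.cu
#include <stdio.h>

__global__ void helloFromGPU(void) {
    printf("Hello World from GPU!\n");
}

int main(void) {
    printf("Hello World from CPU!\n");
    helloFromGPU<<<1, 10>>>();
    // Check launch and execution errors; otherwise the program exits 0 even without a usable GPU
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    return 0;
}
EOF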
# If this fails, the likely root cause is the CUDA environment (e.g. nvcc missing from PATH)
nvcc -o hello hello.cu
# If this fails, the likely root cause is a hardware or driver issue
./hello
# TensorFlow test
export TF_CPP_MIN_LOG_LEVEL=0
python <<'EOF'
import os
import tensorflow as tf

# Show which shared libraries the TensorFlow binaries resolve to (missing CUDA libs appear as "not found")
libtf = tf.sysconfig.get_lib()
os.system(f"find {libtf} | grep libtensorflow | xargs ldd")

# List the GPUs visible to TensorFlow
gpus = tf.config.experimental.list_physical_devices('GPU')
if not gpus:
    print("No GPU found. TensorFlow is using the CPU.")
else:
    for gpu in gpus:
        details = tf.config.experimental.get_device_details(gpu)
        print(f"GPU: {gpu}, Details: {details}")
EOF
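The comments in the script attribute an `nvcc` failure to the CUDA environment. As a rough sketch of how that could be narrowed down before running the tests, the snippet below reports whether the tools the script relies on are visible in `PATH`. It is not part of the original gist: `report_tool` is a hypothetical helper name, and the choice of tools to check is an assumption based on the commands the script calls.

```bash
#!/bin/bash
# Sketch only: report_tool is a hypothetical helper, not part of the original gist.
# It prints where a command resolves to, or notes that it is missing from PATH.
report_tool() {
    local path
    if path=$(command -v "$1" 2>/dev/null); then
        echo "$1: found at $path"
    else
        echo "$1: not found in PATH"
    fi
}

report_tool nvcc        # CUDA compiler, needed to build hello.cu
report_tool nvidia-smi  # utility installed with the NVIDIA driver
report_tool python      # interpreter used by the TensorFlow test

# Shared-library search path that the dynamic loader consults for CUDA libraries
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH:-<unset>}"
```

If `nvcc` is reported as missing, the compile step would fail regardless of the hardware, which is consistent with the environment being the first thing to check.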