Install CUDA deps:
sudo apt-get update
sudo apt-get install libcudnn9-dev-cuda-13
sudo apt-get install libblas-dev liblapack-dev liblapacke-dev
sudo apt-get install libnccl2 libnccl-dev
Install MLX:
CMAKE_ARGS="-DMLX_BUILD_CUDA=ON" pip install git+https://github.com/ml-explore/mlx
Install mlx-lm:
pip install mlx-lm
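Optional sanity check before running anything heavy. This is only a rough sketch: `have_cmd` is a helper name made up for this post, and it assumes `python3` is the interpreter you ran pip with. It reports whether the NVIDIA driver tool and the CUDA compiler are on PATH, then asks MLX which device it picks by default. If the CUDA build worked, the last line should report a GPU device rather than the CPU.

```shell
# Hypothetical helper: succeeds if the named command is on PATH.
have_cmd() { command -v "$1" >/dev/null 2>&1; }

# Report whether the driver utility and CUDA compiler are visible.
for tool in nvidia-smi nvcc; do
  if have_cmd "$tool"; then
    echo "found: $tool"
  else
    echo "missing: $tool (check your driver / CUDA toolkit install)"
  fi
done

# Ask MLX for its default device, but only if the module imports cleanly.
if have_cmd python3 && python3 -c "import mlx.core" 2>/dev/null; then
  python3 -c "import mlx.core as mx; print(mx.default_device())"
else
  echo "mlx is not importable from python3 (was it installed into another environment?)"
fi
```

If the import step fails, the usual suspect is a virtualenv mismatch between the pip that built MLX and the python3 on PATH.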
Run generation:
mlx_lm.generate --model Qwen/Qwen3-4B-Instruct-2507 --prompt "Tell me a story about Einstein"
LoRA fine-tune:
mlx_lm.lora --model Qwen/Qwen3-4B-Instruct-2507 --data mlx-community/WikiSQL --train
Awesome, thanks! Struggled with this until I found your solution.