Traditional Ops - GPU Reinstall

2022-04-13

https://help.aliyun.com/document_detail/163825.htm

Uninstall the GPU driver.

/usr/bin/nvidia-uninstall

Uninstall the CUDA and cuDNN libraries.

/usr/local/cuda/bin/cuda-uninstaller
rm -rf /usr/local/cuda*
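On a host where the driver or CUDA was never installed (or was already removed), the uninstallers above will not exist and the commands fail. A minimal sketch of a guard, assuming a hypothetical helper named `run_if_present` that is not part of any NVIDIA tooling:

```shell
#!/bin/sh
# Hypothetical helper: run a tool only if it exists and is executable,
# otherwise print a note and carry on.
run_if_present() {
    tool="$1"
    if [ -x "$tool" ]; then
        "$@"
    else
        echo "skip: $tool not found"
    fi
}

# Usage for the uninstall steps (shown as comments, not executed here):
#   run_if_present /usr/bin/nvidia-uninstall
#   run_if_present /usr/local/cuda/bin/cuda-uninstaller
#   rm -rf /usr/local/cuda*
```

The helper only decides whether to run the tool; any prompts from the uninstallers themselves still appear as usual.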

Reboot the server.

reboot

Install the GPU driver.

#!/bin/sh
#Please input version to install
IS_INSTALL_RDMA="FALSE"
IS_INSTALL_AIACC_TRAIN="FALSE"
IS_INSTALL_AIACC_INFERENCE="FALSE"
DRIVER_VERSION="460.91.03"
CUDA_VERSION="10.2.89"
CUDNN_VERSION="7.6.5"
IS_INSTALL_RAPIDS="FALSE"
INSTALL_DIR="/root/auto_install"
rm -rf ${INSTALL_DIR}*
#using .deb to install driver and cuda on ubuntu OS
#using .run to install driver and cuda on ubuntu OS
auto_install_script="auto_install.sh"
script_download_url=$(curl http://100.100.100.200/latest/meta-data/source-address | head -1)"/opsx/ecs/linux/binary/script/${auto_install_script}"
echo $script_download_url
mkdir $INSTALL_DIR && cd $INSTALL_DIR
wget -t 10 --timeout=10 $script_download_url && sh ${INSTALL_DIR}/${auto_install_script} $DRIVER_VERSION $CUDA_VERSION $CUDNN_VERSION $IS_INSTALL_AIACC_TRAIN $IS_INSTALL_AIACC_INFERENCE $IS_INSTALL_RDMA $IS_INSTALL_RAPIDS
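The script builds `script_download_url` by appending a fixed path to whatever the metadata request returns. If that request returns nothing (for example, when run outside an ECS instance), the URL starts with `/opsx/...` and the download fails. A minimal sketch of a check, assuming a hypothetical function `build_script_url` that is not part of the original script:

```shell
#!/bin/sh
# Hypothetical helper: compose the installer URL from a mirror base address
# and a script name. Returns non-zero when the base address is empty.
build_script_url() {
    base="$1"
    script="$2"
    if [ -z "$base" ]; then
        echo "error: empty mirror base address" >&2
        return 1
    fi
    echo "${base}/opsx/ecs/linux/binary/script/${script}"
}

# Usage inside the install script (shown as comments, not executed here):
#   base=$(curl -s http://100.100.100.200/latest/meta-data/source-address | head -1)
#   script_download_url=$(build_script_url "$base" auto_install.sh) || exit 1
```

Stopping early on an empty address avoids creating the install directory and then failing at the `wget` step.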

Verify the result.

nvidia-smi
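Beyond reading the `nvidia-smi` table by eye, the installed driver version can be compared against the one requested in the install script. A minimal sketch, assuming a hypothetical function `check_driver_version`; the version string `460.91.03` is the one set in `DRIVER_VERSION` above:

```shell
#!/bin/sh
# Hypothetical helper: compare the requested driver version with the
# version actually reported by the system.
check_driver_version() {
    expected="$1"
    actual="$2"
    if [ "$actual" = "$expected" ]; then
        echo "ok: driver $actual"
    else
        echo "mismatch: expected $expected, got ${actual:-nothing}"
        return 1
    fi
}

# Usage on a GPU host (shown as comments, not executed here):
#   actual=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -1)
#   check_driver_version "460.91.03" "$actual"
```

An empty `actual` value usually means `nvidia-smi` is missing or could not talk to the driver, in which case the install log is the next place to look.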