Running docker with --gpus all fails with:
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].
Solution: install nvidia-container-toolkit
or nvidia-container-runtime (apt may report that it is unable to locate one of these packages)
1. Check the driver
lspci -vv | grep -i nvidia
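Make sure the driver is working: the NVIDIA device should be listed, and the output should include "Kernel driver in use: nvidia".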
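The check above can be scripted as a minimal sketch. The helper name nvidia_driver_bound is hypothetical; it only inspects the text that lspci prints, so it says nothing about whether the driver version suits your CUDA workload. Running nvidia-smi, which ships with the driver, is another common sanity check.

```shell
#!/bin/sh
# Hypothetical helper: reads `lspci -vv` output on stdin and succeeds only
# if some line reports the nvidia kernel driver as bound to a device.
nvidia_driver_bound() {
    grep -qi 'Kernel driver in use: nvidia'
}

# Usage on a real machine:
#   lspci -vv | nvidia_driver_bound && echo "driver bound" || echo "driver not bound"
```

If the helper reports that the driver is not bound (for example because nouveau is in use instead), fix the driver installation before continuing with the steps below.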
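2. Add the GitHub-hosted repository (you can try step 3 first; if the package cannot be located, come back to step 2)
Note: if pasting the commands all at once causes problems, paste them line by line.
2.1 Install nvidia-container-toolkit
Add the repository, install the package, and finally restart docker: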
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
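After the restart, it is worth confirming that the original error is gone. The sketch below wraps the check in a hypothetical function, verify_gpu_container. The ubuntu image is an assumption chosen for illustration; the toolkit normally makes the host's nvidia-smi available inside the container, so the image does not need CUDA installed.

```shell
#!/bin/sh
# Sketch: run nvidia-smi in a throwaway container to confirm GPU access.
# Assumption: the plain ubuntu image is used only as a small test image.
verify_gpu_container() {
    if docker run --rm --gpus all ubuntu nvidia-smi; then
        echo "OK: containers can see the GPU"
    else
        echo "FAIL: docker still cannot use the GPU" >&2
        return 1
    fi
}

# Usage:
#   verify_gpu_container
```

If the function still fails with the "could not select device driver" message, the package was probably not installed or docker was not restarted.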
2.2 Install NVIDIA Container Runtime
curl -s -L https://nvidia.github.io/nvidia-container-runtime/gpgkey | \
sudo apt-key add -
distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.list | \
sudo tee /etc/apt/sources.list.d/nvidia-container-runtime.list
sudo apt-get update
sudo apt-get install -y nvidia-container-runtime
sudo systemctl restart docker
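Both 2.1 and 2.2 build the repository URL from the distribution variable, so if apt still cannot locate the package, one thing to check is what that variable resolves to. The sketch below uses a hypothetical helper, repo_distribution, that prints the same string; on Ubuntu 20.04, for instance, it prints ubuntu20.04. Whether the repository actually publishes a list for a given value is not something this helper can tell you.

```shell
#!/bin/sh
# Hypothetical helper: print the string that `distribution` resolves to.
# Takes an optional path to an os-release file (default: /etc/os-release).
repo_distribution() {
    (
        # Source the file in a subshell so its variables do not leak out.
        . "${1:-/etc/os-release}"
        echo "$ID$VERSION_ID"
    )
}

# Usage:
#   repo_distribution
```

Because the file is sourced inside a subshell, calling the helper does not overwrite variables such as ID or VERSION_ID in your current shell.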