Python errors: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED, and CUDA_LAUNCH_BLOCKING=1

Lessons learned the hard way while setting up a server environment!

I recently set up an environment on an RTX 3090 server to run the PaDiM code, only to find that my custom dataset wouldn't run. I hit a lot of bugs along the way.

  1. Neither pip nor conda worked at all; I couldn't download any packages. After trying many fixes, what finally worked was deleting Anaconda and reinstalling it from scratch.
  2. Without switching to a mirror, the freshly configured environment was painfully slow, with downloads crawling along. The command-line mirror-switching recipes I found online didn't help much. Then I followed a blogger who edits the .condarc file directly (for a multi-framework deep-learning environment), and that worked immediately. A few packages were still slow to download; for those, I installed them with the channel specified explicitly.
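As a sketch of that .condarc approach: the mirror URLs below are the commonly used Tsinghua TUNA ones, which is an assumption on my part; substitute whichever mirror the blogger you follow recommends.

```yaml
# ~/.condarc — hedged example; the TUNA mirror URLs are an assumption,
# not necessarily the exact ones from the blog post I followed.
channels:
  - defaults
show_channel_urls: true
default_channels:
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/main
  - https://mirrors.tuna.tsinghua.edu.cn/anaconda/pkgs/r
custom_channels:
  conda-forge: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
  pytorch: https://mirrors.tuna.tsinghua.edu.cn/anaconda/cloud
```

After editing the file, `conda clean -i` clears the cached package index so the new channels take effect on the next install.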
  3. After installing, I then ran into:
    RuntimeError: CUDA error: no kernel image is available for execution on the device. CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect. For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
    After digging around, the gist of most explanations was that the 3090 is picky about CUDA versions: I had initially installed a PyTorch build for CUDA 10.2, but the 3090 really needs a PyTorch built for CUDA 11.x.
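The "no kernel image" error comes down to compute capability: the RTX 3090 is sm_86 (compute capability 8.6), and PyTorch wheels built against CUDA 10.2 were not compiled with sm_86 kernels. A minimal sketch of the check, where the hypothetical arch lists below only illustrate typical cu102 vs. cu11x wheels; on a real install, compare `torch.cuda.get_arch_list()` against `torch.cuda.get_device_capability()`:

```python
def cuda_supports_gpu(arch_list, capability):
    """Return True if the wheel's compiled arch list covers the GPU's compute capability."""
    major, minor = capability
    return f"sm_{major}{minor}" in arch_list

# RTX 3090 reports compute capability (8, 6), i.e. sm_86.
# A CUDA 10.2 wheel tops out around sm_75, so 3090 kernels are simply missing:
print(cuda_supports_gpu(["sm_37", "sm_50", "sm_60", "sm_70", "sm_75"], (8, 6)))          # False
# A CUDA 11.x wheel includes sm_80/sm_86, so the 3090 works:
print(cuda_supports_gpu(["sm_37", "sm_50", "sm_60", "sm_70", "sm_75", "sm_80", "sm_86"], (8, 6)))  # True
```

This is why the fix is not a newer driver but a PyTorch build targeting CUDA 11.x.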
  4. Since my server's CUDA is 11.2, I searched the official site for a matching PyTorch build, found none, and naively picked some lower-version PyTorch instead. Then I hit something baffling: the code ran fine end to end, but as soon as I tried to step through it in the debugger, it crashed at the model's forward computation with: cuDNN error: CUDNN_STATUS_NOT_INITIALIZED. Very frustrating. Of the many explanations I found, the most likely is a mismatch between the CUDA and PyTorch versions. I tried several more PyTorch builds from the official site, but none of them worked.
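Before blaming cuDNN itself, it helps to confirm what the installed build was actually compiled against. A hedged sketch of such a sanity check (`check_torch_cuda` is my own helper name; it assumes torch is installed and degrades gracefully on CPU-only machines):

```python
def check_torch_cuda():
    """Report the PyTorch/CUDA/cuDNN versions and force cuDNN to initialize."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    info = {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,          # e.g. '11.3' — must suit the GPU
        "cudnn": torch.backends.cudnn.version(),
        "cuda_available": torch.cuda.is_available(),
    }
    if torch.cuda.is_available():
        info["device"] = torch.cuda.get_device_name(0)
        info["capability"] = torch.cuda.get_device_capability(0)
        # A tiny convolution forces cuDNN initialization; if the versions
        # are mismatched it fails here, fast, instead of deep inside training.
        x = torch.randn(1, 3, 8, 8, device="cuda")
        torch.nn.functional.conv2d(x, torch.randn(1, 3, 3, 3, device="cuda"))
    return info

print(check_torch_cuda())
```

If `built_for_cuda` shows a 10.x version on a 3090, that alone explains both errors above.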
model [CycleGANModel] was created
---------- Networks initialized -------------
[Network G_A] Total number of parameters : 11.378 M
[Network G_B] Total number of parameters : 11.378 M
[Network D_A] Total number of parameters : 2.765 M
[Network D_B] Total number of parameters : 2.765 M
-----------------------------------------------
Setting up a new session...
create web directory ./checkpoints\horse2zebra_cyclegan\web...
D:\N 1\anaconda\Lib\site-packages\torch\optim\lr_scheduler.py:227: UserWarning: Detected call of `lr_scheduler.step()` before `optimizer.step()`. In PyTorch 1.1.0 and later, you should call them in the opposite order: `optimizer.step()` before `lr_scheduler.step()`. Failure to do this will result in PyTorch skipping the first value of the learning rate schedule. See more details at https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
  warnings.warn(
learning rate 0.0002000 -> 0.0002000
Traceback (most recent call last):
  File "E:\pytorch-CycleGAN-and-pix2pix-master\pytorch-CycleGAN-and-pix2pix-master\train.py", line 52, in <module>
    model.optimize_parameters()   # calculate loss functions, get gradients, update network weights
  File "E:\pytorch-CycleGAN-and-pix2pix-master\pytorch-CycleGAN-and-pix2pix-master\models\cycle_gan_model.py", line 187, in optimize_parameters
    self.backward_G()             # calculate gradients for G_A and G_B
  File "E:\pytorch-CycleGAN-and-pix2pix-master\pytorch-CycleGAN-and-pix2pix-master\models\cycle_gan_model.py", line 178, in backward_G
    self.loss_G.backward()
  File "D:\N 1\anaconda\Lib\site-packages\torch\_tensor.py", line 626, in backward
    torch.autograd.backward(
  File "D:\N 1\anaconda\Lib\site-packages\torch\autograd\__init__.py", line 347, in backward
    _engine_run_backward(
  File "D:\N 1\anaconda\Lib\site-packages\torch\autograd\graph.py", line 823, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: cuDNN error: CUDNN_STATUS_INTERNAL_ERROR_HOST_ALLOCATION_FAILED
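For cuDNN internal errors like the one in this traceback, a hedged first-aid sketch: make kernel launches synchronous so the stack trace points at the real failing call (the `CUDA_LAUNCH_BLOCKING=1` the warning text suggests), and disable cuDNN autotuning, which allocates extra workspace memory. These are diagnostic settings, not a guaranteed fix:

```python
import os

# Must be set before CUDA is initialized; synchronous launches make the
# stack trace accurate instead of "asynchronously reported at some other API call".
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

try:
    import torch
    # Autotuning (benchmark mode) tries many cuDNN algorithms, each needing
    # workspace allocations — a common trigger for allocation-failure errors.
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True
except ImportError:
    pass  # torch not installed; the environment variable is still the key step
```

If the error persists with these settings, reducing the batch size is the next thing to try, since HOST_ALLOCATION_FAILED points at memory exhaustion.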