Disable cuDNN persistent RNN on A30 (#59830)
Summary:
https://github.com/pytorch/pytorch/issues/59829
Cherry-picked from ptrblck's change. CC ngimel, xwang233
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59830
Reviewed By: bdhirsh
Differential Revision: D29046145
Pulled By: ngimel
fbshipit-source-id: 270ab3bb6c1c7c759497a15eb38b20a177c94adb
diff --git a/aten/src/ATen/native/cudnn/RNN.cpp b/aten/src/ATen/native/cudnn/RNN.cpp
index da83099..f81de80 100644
--- a/aten/src/ATen/native/cudnn/RNN.cpp
+++ b/aten/src/ATen/native/cudnn/RNN.cpp
@@ -726,7 +726,8 @@
(tensors.seq_length >=20 && bsize <=96) ||
(tensors.seq_length >=10 && bsize <=32));
}
- } else if (prop->major >= 8) {
+ } else if (prop->major >= 8 && prop->multiProcessorCount >= 98) {
+ // SM count check excludes A30 (similar issue to A40)
if (prop->minor == 6) {
// Excludes sm_86 GPU devices from using persistent rnn.
// This is because there are some edge cases that will throw exceptions with cudnn 8.0.5 on Nvidia A40 GPU.
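For illustration only, here is a minimal standalone sketch (not part of this patch) of how the same device properties can be queried with the CUDA runtime API to see whether a given GPU passes the new gate. The helper name `allows_persistent_rnn_on_ampere` is hypothetical, and it is a simplified approximation of the heuristic: it only mirrors the compute-capability, SM-count, and sm_86 conditions visible in this hunk, not the seq_length/batch-size checks elsewhere in `use_persist_common_heuristics`-style logic.

```cpp
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: approximates only the Ampere gating visible in the
// hunk above. Requiring compute capability >= 8.0 and at least 98 SMs keeps
// A100-class parts eligible while excluding A30, on top of the existing
// sm_86 (e.g. A40) exclusion.
bool allows_persistent_rnn_on_ampere(const cudaDeviceProp& prop) {
  return prop.major >= 8 && prop.multiProcessorCount >= 98 && prop.minor != 6;
}

int main() {
  cudaDeviceProp prop;
  if (cudaGetDeviceProperties(&prop, /*device=*/0) != cudaSuccess) {
    std::fprintf(stderr, "failed to query device 0\n");
    return 1;
  }
  std::printf("%s: sm_%d%d, %d SMs -> persistent RNN %s\n",
              prop.name, prop.major, prop.minor, prop.multiProcessorCount,
              allows_persistent_rnn_on_ampere(prop) ? "eligible" : "excluded");
  return 0;
}
```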