Enable test_upsamplingNearest2d_launch_fail on ROCm (#36624)

Summary:
`test_upsamplingNearest2d_launch_fail` expects the kernel launch to fail, so the test itself fails on ROCm, where the launch succeeds. The maximum grid size per dimension on ROCm is 4294967295 (0xffffffff), so the tensor dims used in `test_upsamplingNearest2d_launch_fail` are within the limit and must give correct results.
This PR adds a ROCm-only test case, `test_upsamplingNearest2d_launch_rocm`, which is essentially the same as `test_upsamplingNearest2d_launch_fail` but without the expected-failure decorator.
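
The size argument can be checked with a few lines of arithmetic. This is only a sketch and does not touch a GPU: it assumes that one grid dimension of the upsampling kernel launch scales with the output height, which is not stated in this PR, and it uses CUDA's documented 65535 limit for `gridDim.y`/`gridDim.z` for comparison. The constant and helper names are hypothetical.

```python
# Illustrative arithmetic only; nothing here launches a kernel.
CUDA_MAX_GRID_Y = 65535          # CUDA limit for gridDim.y and gridDim.z
ROCM_MAX_GRID_DIM = 0xFFFFFFFF   # ROCm limit quoted in this summary (4294967295)


def upsampled_hw(height, width, scale_factor=2):
    """Spatial size after upsampling by an integer scale factor."""
    return height * scale_factor, width * scale_factor


# Same spatial dims as the test input: torch.rand(1, 1, 2**15, 2**8)
out_h, out_w = upsampled_hw(2**15, 2**8)

# Assumption: a grid dimension of the launch grows with the output height.
exceeds_cuda_limit = out_h > CUDA_MAX_GRID_Y
fits_rocm_limit = out_h <= ROCM_MAX_GRID_DIM
```

Under that assumption the output height (2**16) is one past the CUDA limit but far below the ROCm limit, which would be consistent with the launch failing on CUDA and succeeding on ROCm.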

ezyang iotamudelta
Pull Request resolved: https://github.com/pytorch/pytorch/pull/36624

Differential Revision: D21050330

Pulled By: ezyang

fbshipit-source-id: d7370c97eaab98f382f97052ed39cc168a3bfa71
diff --git a/test/test_nn.py b/test/test_nn.py
index 8be3069..1a6cba0 100644
--- a/test/test_nn.py
+++ b/test/test_nn.py
@@ -9641,6 +9641,14 @@
         out = m(inp)
 
     @onlyCUDA
+    @skipCUDAIfNotRocm
+    def test_upsamplingNearest2d_launch_rocm(self, device):
+        # test_upsamplingNearest2d_launch_fail should run OK on ROCm
+        m = nn.Upsample(scale_factor=2)
+        inp = torch.rand(1, 1, 2**15, 2**8, device=device)
+        out = m(inp)
+
+    @onlyCUDA
     @skipCUDAIfCudnnVersionLessThan(7600)
     def test_CTCLoss_cudnn(self, device):
         target_lengths = [30, 25, 20]