Use computeStorageNbytesContiguous if possible (#128515)

```at::detail::computeStorageNbytesContiguous``` does fewer data-dependent tests compared to ```at::detail::computeStorageNbytes```. Therefore, use of former is more likely to succeed with dynamic shapes. This PR detects is_contiguous and dispatches to the appropriate function. This should be helpful in unblocking aot_eager for torchrec. As an aside, this is an alternative solution to the unsound solution I had first proposed in another [PR](#128141).
Pull Request resolved: https://github.com/pytorch/pytorch/pull/128515
Approved by: https://github.com/ezyang
diff --git a/aten/src/ATen/native/TensorShape.cpp b/aten/src/ATen/native/TensorShape.cpp
index adcddea..3a47349 100644
--- a/aten/src/ATen/native/TensorShape.cpp
+++ b/aten/src/ATen/native/TensorShape.cpp
@@ -420,8 +420,9 @@
     // it.  TODO: Actually this might not quite be correct if we use special
     // pointers to track whether or not fake cuda tensors are pinned or not
     const auto itemsize = result.dtype().itemsize();
-    c10::SymInt new_size_bytes = at::detail::computeStorageNbytes(
-        size, stride, itemsize, std::move(storage_offset));
+    c10::SymInt new_size_bytes = result.is_contiguous()
+      ? at::detail::computeStorageNbytesContiguous(size, itemsize, std::move(storage_offset))
+      : at::detail::computeStorageNbytes(size, stride, itemsize, std::move(storage_offset));
     // TODO: When there are unbacked SymInts, we unconditionally skip the
     // setter.  This is technically wrong, but we cannot conveniently test
     // the real condition in many cases, because a lot of people are using