[PyTorch][Vulkan] Clean up aten::unsqueeze (#118311)

Summary:
After D50347338, we already support zero-dim tensor input, which was my original task. As a result, this diff doesn't add or change functionality; it just cleans up the following:
1. Fix TORCH_CHECK to only allow `tensor.dim() <= 3`. Previously, it was a no-op since it didn't use `&&`.
2. Add 0->1 `tensor.dim()` tests.
3. Remove `dim == 0` case from shader since that path is never executed. The `cpp` code sends the input to `submit_copy` instead.

Test Plan:
Tested on OD.
```
[jorgep31415@29786.od /data/sandcastle/boxes/fbsource (c66693c95)]$ LD_LIBRARY_PATH=third-party/swiftshader/lib/linux-x64/ buck2 run fbcode/mode/dev-nosan //xplat/caffe2:pt_vulkan_api_test_bin -c pt.vulkan_full_precision=1 -- --gtest_filter="*unsqueeze*"
File changed: fbcode//caffe2/aten/src/ATen/native/vulkan/glsl/unsqueeze.glsl
File changed: fbsource//xplat/caffe2/aten/src/ATen/test/vulkan_api_test.cpp
File changed: fbsource//xplat/caffe2/aten/src/ATen/native/vulkan/glsl/unsqueeze.glsl
Buck UI: https://www.internalfb.com/buck2/16cf8f59-e535-493b-b123-5952ef8f1453
Network: Up: 21KiB  Down: 1.4MiB  (reSessionID-1219eefd-e78b-4bfd-aef8-8e4b38da82f8)
Jobs completed: 8. Time elapsed: 37.8s.
Cache hits: 0%. Commands: 3 (cached: 0, remote: 1, local: 2)
BUILD SUCCEEDED
Running main() from third-party/googletest/1.14.0/googletest/googletest/src/gtest_main.cc
Note: Google Test filter = *unsqueeze*
[==========] Running 10 tests from 1 test suite.
[----------] Global test environment set-up.
[----------] 10 tests from VulkanAPITest
[ RUN      ] VulkanAPITest.unsqueeze_0dto1d_dim0
[       OK ] VulkanAPITest.unsqueeze_0dto1d_dim0 (61 ms)
[ RUN      ] VulkanAPITest.unsqueeze_1dto2d_dim0
[       OK ] VulkanAPITest.unsqueeze_1dto2d_dim0 (0 ms)
[ RUN      ] VulkanAPITest.unsqueeze_1dto2d_dim1
[       OK ] VulkanAPITest.unsqueeze_1dto2d_dim1 (110 ms)
[ RUN      ] VulkanAPITest.unsqueeze_2dto3d_dim0
[       OK ] VulkanAPITest.unsqueeze_2dto3d_dim0 (16 ms)
[ RUN      ] VulkanAPITest.unsqueeze_2dto3d_dim1
[       OK ] VulkanAPITest.unsqueeze_2dto3d_dim1 (58 ms)
[ RUN      ] VulkanAPITest.unsqueeze_2dto3d_dim2
[       OK ] VulkanAPITest.unsqueeze_2dto3d_dim2 (2 ms)
[ RUN      ] VulkanAPITest.unsqueeze_3dto4d_dim0
[       OK ] VulkanAPITest.unsqueeze_3dto4d_dim0 (16 ms)
[ RUN      ] VulkanAPITest.unsqueeze_3dto4d_dim1
[       OK ] VulkanAPITest.unsqueeze_3dto4d_dim1 (1 ms)
[ RUN      ] VulkanAPITest.unsqueeze_3dto4d_dim2
[       OK ] VulkanAPITest.unsqueeze_3dto4d_dim2 (1 ms)
[ RUN      ] VulkanAPITest.unsqueeze_3dto4d_dim3
[       OK ] VulkanAPITest.unsqueeze_3dto4d_dim3 (1 ms)
[----------] 10 tests from VulkanAPITest (270 ms total)

[----------] Global test environment tear-down
[==========] 10 tests from 1 test suite ran. (270 ms total)
[  PASSED  ] 10 tests.

```

Also, to improve my confidence in unit tests, I modified [force_flush.py](https://www.internalfb.com/code/fbsource/[6e606c6f62dafd2121e78ffe14ae12f1b6d8d405]/fbcode/wearables/camera/ml/pytorch_vulkan_native/demo/force_flush.py) to run several combinations of `aten::unsqueeze` on OD.

Verified these work as expected.
```
torch.zeros([])
torch.randn([])
torch.rand([])
torch.ones([])
torch.tensor(0, dtype=torch.float)
```

Found that Vulkan in general does not support the following. That's ok though since it's technically a 1d tensor which is not part of my task.
```
torch.tensor([])
```

Differential Revision: D53071189

Pull Request resolved: https://github.com/pytorch/pytorch/pull/118311
Approved by: https://github.com/liuk22
diff --git a/aten/src/ATen/native/vulkan/glsl/unsqueeze.glsl b/aten/src/ATen/native/vulkan/glsl/unsqueeze.glsl
index fb44ac9..8512ca8 100644
--- a/aten/src/ATen/native/vulkan/glsl/unsqueeze.glsl
+++ b/aten/src/ATen/native/vulkan/glsl/unsqueeze.glsl
@@ -38,9 +38,7 @@
   const int dim = uBlock.info.x;
   const int channels = uBlock.info.y;
   vec4 out_texel = vec4(0, 0, 0, 0);
-  if (dim == 0) {
-    imageStore(uOutput, pos, texelFetch(uImage, pos, 0));
-  } else if (dim == 1) {
+  if (dim == 1) {
     int src_x = pos.x;
     int src_y = pos.y;
     for (int i = 0; i < 4; i++) {
diff --git a/aten/src/ATen/native/vulkan/ops/Unsqueeze.cpp b/aten/src/ATen/native/vulkan/ops/Unsqueeze.cpp
index f4aef34..9378474 100644
--- a/aten/src/ATen/native/vulkan/ops/Unsqueeze.cpp
+++ b/aten/src/ATen/native/vulkan/ops/Unsqueeze.cpp
@@ -16,8 +16,8 @@
 
 Tensor unsqueeze(const at::Tensor& self, int64_t dim) {
   TORCH_CHECK(
-      self.dim() >= 1 || self.dim() <= 3,
-      "Vulkan unsqueeze supports 1d, 2d, 3d tensors as input!");
+      self.dim() <= 3,
+      "Vulkan unsqueeze only supports up to 3d tensors as input!");
   TORCH_CHECK(
       dim >= -self.dim() - 1 && dim <= self.dim(),
       "Vulkan unsqueeze dimension out of range expected to be in range of [",
diff --git a/aten/src/ATen/test/vulkan_api_test.cpp b/aten/src/ATen/test/vulkan_api_test.cpp
index 5b59fa8..246d705 100644
--- a/aten/src/ATen/test/vulkan_api_test.cpp
+++ b/aten/src/ATen/test/vulkan_api_test.cpp
@@ -5433,6 +5433,11 @@
   ASSERT_TRUE(check);
 }
 
+TEST_F(VulkanAPITest, unsqueeze_0dto1d_dim0) {
+  test_unsqueeze({}, 0);
+  test_unsqueeze({}, -1);
+}
+
 TEST_F(VulkanAPITest, unsqueeze_1dto2d_dim0) {
   test_unsqueeze({5}, 0);
   test_unsqueeze({6}, -2);