[ROCm] Fix for ROCm CSB breakage - 210311

The following commit breaks the ROCm CSB (errors while running the GPU unit tests):

https://github.com/tensorflow/tensorflow/pull/47398

```
ERROR: /root/tensorflow/tensorflow/compiler/xla/service/gpu/tests/BUILD:573:13: undeclared inclusion(s) in rule '//tensorflow/compiler/xla/service/gpu/tests:hlo_to_llvm_ir':
this rule is missing dependency declarations for the following files included by 'tensorflow/compiler/xla/service/gpu/tests/hlo_to_llvm_ir.cc':
  'tensorflow/compiler/xla/service/gpu/nvptx_compiler.h'
Target //tensorflow/compiler/xla/service/gpu/tests:constant.hlo.test failed to build
```

This PR/commit fixes the ROCm build by making the newly added functionality specific to the TF CUDA build.

Long term, we need to add similar capability on the ROCm side, but for now this workaround is needed to get the ROCm CSB working again.
```diff
diff --git a/tensorflow/compiler/xla/service/gpu/tests/hlo_to_llvm_ir.cc b/tensorflow/compiler/xla/service/gpu/tests/hlo_to_llvm_ir.cc
index 47f18c8..7a36011 100644
--- a/tensorflow/compiler/xla/service/gpu/tests/hlo_to_llvm_ir.cc
+++ b/tensorflow/compiler/xla/service/gpu/tests/hlo_to_llvm_ir.cc
@@ -17,7 +17,9 @@
 #include "tensorflow/compiler/xla/service/gpu/gpu_compiler.h"
 #include "tensorflow/compiler/xla/service/gpu/gpu_device_info.h"
 #include "tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.h"
+#if GOOGLE_CUDA
 #include "tensorflow/compiler/xla/service/gpu/nvptx_compiler.h"
+#endif
 #include "tensorflow/compiler/xla/service/gpu/target_constants.h"
 #include "tensorflow/compiler/xla/service/hlo_module.h"
 #include "tensorflow/compiler/xla/status.h"
@@ -77,6 +79,7 @@
   if (!generate_ptx) {
     llvm_module->print(llvm::outs(), nullptr);
   } else {
+#if GOOGLE_CUDA
     std::pair<int, int> gpu_version = std::make_pair(
         cuda_compute_capability.cc_major, cuda_compute_capability.cc_minor);
     std::string libdevice_dir = xla::gpu::GetLibdeviceDir(hlo_module->config());
@@ -85,6 +88,10 @@
         xla::gpu::nvptx::CompileToPtx(llvm_module.get(), gpu_version,
                                       hlo_module->config(), libdevice_dir));
     std::cout << ptx << std::endl;
+#else
+    return xla::Status(tensorflow::error::UNIMPLEMENTED,
+                       "Feature not yet implemented in ROCm");
+#endif
   }
   return xla::Status::OK();
 }
```