Revert "Don't include traceme.h for ROCM."

This reverts commit 22a97f1ef5a8b10c72ba1ec8b463f063c6e5f22d.

The reverted commit attempts to fix the ROCm build but fails to do so. It moved the traceme.h include under `#if GOOGLE_CUDA`, which merely trades a Bazel dependency error for compile-time errors in the ROCm build like the following:

```
tensorflow/core/nccl/nccl_manager.cc: In member function 'void tensorflow::NcclManager::LoopKernelLaunches(tensorflow::NcclManager::NcclStream*)':
tensorflow/core/nccl/nccl_manager.cc:689:9: error: 'profiler' has not been declared
         profiler::TraceMe trace_me("ncclAllReduce");
         ^
tensorflow/core/nccl/nccl_manager.cc:718:9: error: 'profiler' has not been declared
         profiler::TraceMe trace_me("ncclBroadcast");
         ^
tensorflow/core/nccl/nccl_manager.cc:729:9: error: 'profiler' has not been declared
         profiler::TraceMe trace_me("ncclReduce");
         ^
...
```
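
For illustration only, here is a minimal sketch of the include layout this revert restores. It is not the real nccl_manager.cc: the `SKETCH_*` macros, the stand-in `profiler::TraceMe` class, `kPlatform`, and `LaunchKernelSketch` are all hypothetical names. The point is that code shared by both platforms needs the declaration to sit outside the platform-specific `#if` block.

```cpp
#include <string>

// Hypothetical stand-ins for the real build flags; this sketch pretends
// to be a ROCm build.
#define SKETCH_GOOGLE_CUDA 0
#define SKETCH_USE_ROCM 1

// Stand-in for traceme.h. It is declared unconditionally, mirroring the
// include position restored by this revert. If this namespace were moved
// inside the SKETCH_GOOGLE_CUDA branch below, the ROCm configuration would
// fail with "'profiler' has not been declared".
namespace profiler {
class TraceMe {
 public:
  explicit TraceMe(const std::string& name) : name_(name) {}
  const std::string& name() const { return name_; }

 private:
  std::string name_;
};
}  // namespace profiler

// Platform-specific section: only one branch is compiled.
#if SKETCH_GOOGLE_CUDA
static const char kPlatform[] = "cuda";
#elif SKETCH_USE_ROCM
static const char kPlatform[] = "rocm";
#endif

// Shared code path that is compiled for both platforms and uses TraceMe,
// analogous to the call sites reported in the errors above.
std::string LaunchKernelSketch() {
  profiler::TraceMe trace_me("ncclAllReduce");
  return std::string(kPlatform) + ":" + trace_me.name();
}
```

Under these assumptions, `LaunchKernelSketch()` compiles in either configuration because the declaration it depends on does not live in a platform-specific branch.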
diff --git a/tensorflow/core/nccl/nccl_manager.cc b/tensorflow/core/nccl/nccl_manager.cc
index c3d6af9..2d799d9 100644
--- a/tensorflow/core/nccl/nccl_manager.cc
+++ b/tensorflow/core/nccl/nccl_manager.cc
@@ -21,9 +21,9 @@
 #include "tensorflow/core/lib/core/refcount.h"
 #include "tensorflow/core/lib/core/threadpool.h"
 #include "tensorflow/core/platform/env.h"
+#include "tensorflow/core/profiler/lib/traceme.h"
 #if GOOGLE_CUDA
 #include "tensorflow/core/platform/cuda.h"
-#include "tensorflow/core/profiler/lib/traceme.h"
 #elif TENSORFLOW_USE_ROCM
 #include "tensorflow/core/platform/rocm.h"
 #endif