[easy][inductor] fix typo in mm max-autotune log (#97486)

The max-autotune log looks like:
```
AUTOTUNE bias_addmm(512x197951, 512x512, 512x197951)
  triton_mm_61 1.2882s 100.0%
  triton_mm_62 1.3036s 98.8%
  bias_addmm 1.4889s 86.5%
  triton_mm_60 1.6159s 79.7%
  triton_mm_63 1.7060s 75.5%
  triton_mm_64 1.7777s 72.5%
  triton_mm_67 1.9722s 65.3%
  addmm 2.0603s 62.5%
  triton_mm_70 2.0675s 62.3%
  triton_mm_68 2.3552s 54.7%
SingleProcess AUTOTUNE takes 2.949904441833496 seconds
```
which is confusing, since the sum of the runtimes of all the listed kernels is larger than the total time reported for tuning. In fact, `triton.testing.do_bench` returns times on a millisecond scale rather than a second scale. Fix the typo in the log message to make that clear.
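
A minimal sketch of the unit mismatch (not part of this PR; the tensor shapes and the standalone benchmark call are illustrative): `triton.testing.do_bench` reports the measured time in milliseconds, so printing the raw value with an `s` suffix overstates the runtime by a factor of 1000.
```
# Minimal sketch, assuming a CUDA device is available; shapes are illustrative.
import torch
import triton.testing

a = torch.randn(512, 512, device="cuda")
b = torch.randn(512, 512, device="cuda")

# do_bench returns the measured time in milliseconds, not seconds.
elapsed_ms = triton.testing.do_bench(lambda: torch.mm(a, b))

print(f"mm {elapsed_ms:.4f} ms")  # correct unit
print(f"mm {elapsed_ms:.4f}s")    # old log format: reads as seconds, off by 1000x
```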

Pull Request resolved: https://github.com/pytorch/pytorch/pull/97486
Approved by: https://github.com/ngimel, https://github.com/jansel
diff --git a/torch/_inductor/select_algorithm.py b/torch/_inductor/select_algorithm.py
index 7697c6c..0a7d7f2 100644
--- a/torch/_inductor/select_algorithm.py
+++ b/torch/_inductor/select_algorithm.py
@@ -807,12 +807,14 @@
         sys.stderr.write(f"AUTOTUNE {name}({sizes})\n")
         for choice in top_k:
             result = timings[choice]
-            sys.stderr.write(f"  {choice.name} {result:.4f}s {best_time/result:.1%}\n")
+            sys.stderr.write(
+                f"  {choice.name} {result:.4f} ms {best_time/result:.1%}\n"
+            )
 
         autotune_type_str = (
             "SubProcess" if config.autotune_in_subproc else "SingleProcess"
         )
-        sys.stderr.write(f"{autotune_type_str} AUTOTUNE takes {elapse} seconds\n")
+        sys.stderr.write(f"{autotune_type_str} AUTOTUNE takes {elapse:.4f} seconds\n")
 
     @staticmethod
     def benchmark_example_value(node):