[eazy][inductor] fix typo in mm max-autotune log (#97486)
The max-autotune log looks like:
```
AUTOTUNE bias_addmm(512x197951, 512x512, 512x197951)
triton_mm_61 1.2882s 100.0%
triton_mm_62 1.3036s 98.8%
bias_addmm 1.4889s 86.5%
triton_mm_60 1.6159s 79.7%
triton_mm_63 1.7060s 75.5%
triton_mm_64 1.7777s 72.5%
triton_mm_67 1.9722s 65.3%
addmm 2.0603s 62.5%
triton_mm_70 2.0675s 62.3%
triton_mm_68 2.3552s 54.7%
SingleProcess AUTOTUNE takes 2.949904441833496 seconds
```
which is confusing, since the sum of the runtimes of all the kernels is larger than the total time reported for tuning. In fact, `triton.testing.do_bench` returns times in milliseconds rather than seconds. Fix the typo in the log message to make that clear.
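
For illustration, here is a minimal standalone sketch of the fixed formatting (not the actual PyTorch code; the kernel names and millisecond values are hypothetical and mirror the log above):
```python
import sys

# Hypothetical do_bench results, already in milliseconds.
timings = {"triton_mm_61": 1.2882, "bias_addmm": 1.4889, "addmm": 2.0603}
best_time = min(timings.values())
for name, result in sorted(timings.items(), key=lambda kv: kv[1]):
    # Label the unit as "ms" so the per-kernel timings no longer look like seconds.
    sys.stderr.write(f"  {name} {result:.4f} ms {best_time / result:.1%}\n")
```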
Pull Request resolved: https://github.com/pytorch/pytorch/pull/97486
Approved by: https://github.com/ngimel, https://github.com/jansel
diff --git a/torch/_inductor/select_algorithm.py b/torch/_inductor/select_algorithm.py
index 7697c6c..0a7d7f2 100644
--- a/torch/_inductor/select_algorithm.py
+++ b/torch/_inductor/select_algorithm.py
@@ -807,12 +807,14 @@
         sys.stderr.write(f"AUTOTUNE {name}({sizes})\n")
         for choice in top_k:
             result = timings[choice]
-            sys.stderr.write(f"  {choice.name} {result:.4f}s {best_time/result:.1%}\n")
+            sys.stderr.write(
+                f"  {choice.name} {result:.4f} ms {best_time/result:.1%}\n"
+            )
 
         autotune_type_str = (
             "SubProcess" if config.autotune_in_subproc else "SingleProcess"
         )
-        sys.stderr.write(f"{autotune_type_str} AUTOTUNE takes {elapse} seconds\n")
+        sys.stderr.write(f"{autotune_type_str} AUTOTUNE takes {elapse:.4f} seconds\n")
 
     @staticmethod
     def benchmark_example_value(node):