[torch.compile][ci] Flaky models in CI (similar to DISABLED_TEST) (#128715)

These models are really flaky. I went onto the CI machine and ran them many times; sometimes they fail, sometimes they pass. Even PyTorch eager results change from run to run, so the accuracy comparison is fundamentally broken/non-deterministic. I am hitting these issues more frequently in the inlining work. There is nothing wrong with inlining; I think these models are on the edge of an already-broken accuracy measurement, and inlining just pushes them further in that broken direction.

Pull Request resolved: https://github.com/pytorch/pytorch/pull/128715
Approved by: https://github.com/eellison
diff --git a/benchmarks/dynamo/check_accuracy.py b/benchmarks/dynamo/check_accuracy.py
index da82f78..8cbc186 100644
--- a/benchmarks/dynamo/check_accuracy.py
+++ b/benchmarks/dynamo/check_accuracy.py
@@ -6,6 +6,14 @@
 import pandas as pd
 
 
+# Hack to have something similar to DISABLED_TEST. These models are flaky.
+
+flaky_models = {
+    "yolov3",
+    "gluon_inception_v3",
+}
+
+
 def get_field(csv, model_name: str, field: str):
     try:
         return csv.loc[csv["name"] == model_name][field].item()
@@ -25,6 +33,13 @@
             status = "PASS" if expected_accuracy == "pass" else "XFAIL"
             print(f"{model:34}  {status}")
             continue
+        elif model in flaky_models:
+            if accuracy == "pass":
+                # model passed but is marked flaky
+                status = "PASS_BUT_FLAKY:"
+            else:
+                # model failed but is marked flaky
+                status = "FAIL_BUT_FLAKY:"
         elif accuracy != "pass":
             status = "FAIL:"
             failed.append(model)
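
For context, here is a minimal standalone sketch (not part of the patch; the CSV shape and the "fail_accuracy" value are assumptions inferred from the snippet above) of how the flaky branch behaves: a flaky model that fails accuracy is only reported, never appended to the `failed` list, so it cannot turn the CI job red.

```python
import pandas as pd

flaky_models = {
    "yolov3",
    "gluon_inception_v3",
}

# Hypothetical results table in the shape check_accuracy.py reads:
# one row per model, with an "accuracy" column holding "pass" or a failure tag.
actual = pd.DataFrame(
    {
        "name": ["yolov3", "resnet50"],
        "accuracy": ["fail_accuracy", "pass"],
    }
)

failed = []
for model in actual["name"]:
    accuracy = actual.loc[actual["name"] == model]["accuracy"].item()
    if model in flaky_models:
        # Flaky models are only reported; they are never appended to `failed`,
        # so they cannot fail the job.
        status = "PASS_BUT_FLAKY:" if accuracy == "pass" else "FAIL_BUT_FLAKY:"
    elif accuracy != "pass":
        status = "FAIL:"
        failed.append(model)
    else:
        status = "PASS"
    print(f"{model:34}  {status}")

print("failed:", failed)  # -> failed: []  (yolov3's failure is ignored)
```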