[aot] disable inference view tracking (#96478)

For inference, we can disable unnecessary view tracking to save time, since the recorded view metadata is only ever consumed by autograd. With this change, most operators see a performance improvement (inductor vs. eager); this PR fixes the general regression of these operators under inductor.
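As a hedged illustration of what "view tracking" records: a tensor produced by a view op keeps a reference to its base tensor and a `grad_fn` describing how the view was created. This metadata exists solely so autograd can replay the view in backward; pure inference never consumes it, so recording it in compiled inference graphs is wasted work.

```python
import torch

x = torch.randn(4, 4, requires_grad=True)
y = x.view(16)

# The view remembers its base tensor and how it was produced; this
# bookkeeping is only ever consumed by a backward pass.
print(y._base is x)  # True
print(y.grad_fn)     # ViewBackward0
```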

Examples of operator speedups in torchbench (inductor vs. eager):

operator | current speedup | new speedup
-- | -- | --
aten.hardsigmoid.default | [0.6426090814905988, 0.6791992931354925, 0.7046010955095103] | [0.7921782106271767, 0.8919522525991529, 0.9128089963571694]
aten.tanh.default | [0.6135534976747065, 0.7588851221588919, 0.898274076411234] | [0.857534066531159, 1.0524121834821605, 1.2535141671420165]
aten.floor.default | [0.6115868728087821, 0.6115868728087821, 0.6115868728087821] | [0.9472870784346195, 0.9472870784346195, 0.9472870784346195]
aten.exp.default | [0.7784016216625718, 0.9279358274876591, 1.1201178548406794] | [0.5777145055206203, 0.8610140436473923, 1.1850714193498957]
aten.mul_.Tensor | [0.14381872531802153, 0.14638969818507447, 0.14947766446663138] | [0.37695307573466363, 0.3832122689450142, 0.38963470437456904]
aten.hardtanh_.default | [0.49502896822398157, 0.5897512505705527, 0.8052969399847189] | [0.4915338157706071, 0.6098169585316151, 0.8587605051115021]
aten.relu_.default | [0.47776870021339685, 0.54452322796367, 0.6516167164223963] | [0.4764791289773786, 0.5608095328163419, 0.6753350976452626]

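For context, a minimal sketch of how a per-operator inductor-vs.-eager speedup like the ratios above could be measured (this timing harness is an assumption for illustration, not the torchbench code that produced the table):

```python
import torch
from torch.utils.benchmark import Timer

x = torch.randn(1024, 1024)

eager = Timer(stmt="torch.tanh(x)", globals={"torch": torch, "x": x}).blocked_autorange()

compiled_tanh = torch.compile(torch.tanh)
compiled_tanh(x)  # warm up so compilation happens outside the timed region
inductor = Timer(stmt="f(x)", globals={"f": compiled_tanh, "x": x}).blocked_autorange()

print(f"inductor vs. eager speedup: {eager.median / inductor.median:.2f}x")
```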

Pull Request resolved: https://github.com/pytorch/pytorch/pull/96478
Approved by: https://github.com/EikanWang, https://github.com/jansel, https://github.com/jgong5, https://github.com/bdhirsh
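The change itself is small: in AOTAutograd's runtime wrapper, original view tracking is only forced when a joint (forward + backward) graph was traced. A simplified standalone sketch of that branch follows (the real wrapper in the diff below calls an internal `call_func_with_args` helper, elided here):

```python
import torch

def runtime_wrapper(compiled_fn, args, trace_joint):
    if trace_joint:
        # Training: a backward pass will consume the view metadata.
        with torch.autograd._force_original_view_tracking(True):
            return compiled_fn(*args)
    # Inference: skip the view-tracking overhead entirely.
    return compiled_fn(*args)
```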
diff --git a/torch/_functorch/aot_autograd.py b/torch/_functorch/aot_autograd.py
index d1e843b..c2ab300 100644
--- a/torch/_functorch/aot_autograd.py
+++ b/torch/_functorch/aot_autograd.py
@@ -2073,7 +2073,14 @@
         compiled_fn = make_boxed_func(compiled_fn)
 
     def runtime_wrapper(*args):
-        with torch.autograd._force_original_view_tracking(True):
+        if trace_joint:
+            with torch.autograd._force_original_view_tracking(True):
+                all_outs = call_func_with_args(
+                    compiled_fn,
+                    args,
+                    disable_amp=True,
+                )
+        else:
             all_outs = call_func_with_args(
                 compiled_fn,
                 args,