From the root of pytorch repo, run:
python -m benchmarks.tensorexpr --help
to show documentation.
An example of an actual command line that one might use as a starting point:
python -m benchmarks.tensorexpr --device gpu --mode fwd --jit_mode trace --cuda_fuser=te