# jit

The jit directory contains infrastructure for a just-in-time compiler for PyTorch.

TODO: Describe the general philosophy of the JIT.

## Well-known functions

Ordinarily, when defining a compiler you want the set of functions to be user-extensible; e.g., a user can add to the set of defined functions by defining an appropriate autograd `Function`. However, there are some functions where we want to make assumptions about their semantics, because we are going to write optimizations over them or insert them into the program. Such functions are “well-known” functions, because the JIT compiler knows about them, and a user implementation must abide by the contract (sometimes implicitly) specified by the compiler.

A well-known function is usually implemented in several parts (a toy sketch of the pattern follows this list):

* First, we pre-intern the string (`interned_strings.h`) that identifies the node. This lets us refer to these operators conveniently, without first doing a lookup through the intern table.

* If we generate this operator during optimizations, we will often have a helper function in `Graph` (`ir.h`) for creating the operator. This is the easiest way to find out, in code, what attributes we assume for an operator.

* There is a runtime interpretation of the operator in `torch/csrc/jit/interpreter.cpp`, which specifies how we actually interpret programs that contain such an operator.
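
To make this concrete, here is a minimal, self-contained toy of the pattern. The names (`Symbol`, `Graph::createFusionGroup`, `kFusionGroup`) mirror the conventions above but are illustrative assumptions, not the actual `torch/csrc/jit` API:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <vector>

// interned_strings.h analogue: a well-known operator gets a fixed symbol ID
// up front, so code can refer to it without a runtime intern-table lookup.
using Symbol = std::uint32_t;
constexpr Symbol kFusionGroup = 1;

struct Node {
  Symbol kind;
  explicit Node(Symbol k) : kind(k) {}
};

struct Graph {
  std::vector<std::unique_ptr<Node>> nodes;

  // ir.h analogue: a helper that constructs the operator exactly as the
  // compiler assumes it, so optimization passes can rely on its shape.
  Node* createFusionGroup() {
    nodes.push_back(std::make_unique<Node>(kFusionGroup));
    return nodes.back().get();
  }
};

int main() {
  Graph g;
  Node* n = g.createFusionGroup();
  // interpreter.cpp analogue: the runtime dispatches on the pre-interned
  // symbol to decide how to execute the node.
  assert(n->kind == kFusionGroup);
  return 0;
}
```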

So, whence the specifications? For the most part, we are following the ONNX operator specification to determine the semantics of our operators. However, there are a few other well-known functions that are specific to PyTorch:

* FusionGroup

  A fusion group takes some number of input tensors and applies a graph `Subgraph` to them, producing the returned tensors of the subgraph. Operationally, operators inside a FusionGroup are fused into a single kernel, so that their intermediate results are never materialized. Not all operators support fusion. (A sketch of the fusion semantics follows this list.)

  * attribute: `Subgraph`, the graph that is applied to the inputs
  * input: 1 - ∞ (same as the inputs of `Subgraph`)
  * output: 1 - ∞ (same as the outputs of `Subgraph`)
* Eval (renders as `CppOp[N5torch8autograd4EvalE]`)

  An Eval node takes some inputs, plus an autograd closure `Handle`. It applies those inputs to the autograd closure and returns the results of executing the closure. An Eval node is primarily used to implement backward operations for black-box forward operations: because the backward computation of a black-box forward is not known until we actually execute the forward operation, we have to run the forward computation, which gives us an autograd closure for computing the backward pass, and then run that closure later when we actually execute the backward pass. (A sketch of this contract follows this list.)

  * input: 1 - ∞ (the inputs to apply to the autograd closure), plus the autograd closure `Handle`
  * output: 1 - ∞ (same as the outputs of the autograd closure)
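
To illustrate the FusionGroup contract, here is a minimal sketch (plain C++, standing in for what a fusion compiler would generate, not the actual `fusion_compiler.cpp` output): the fused kernel computes the same values as running the subgraph op-by-op, but in a single pass, with the intermediate held in a register instead of a materialized buffer.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Subgraph run op-by-op: y = exp(x), z = y + y.  The intermediate `y` is
// materialized as a full tensor-sized buffer.
std::vector<float> run_unfused(const std::vector<float>& x) {
  std::vector<float> y(x.size()), z(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = std::exp(x[i]);
  for (std::size_t i = 0; i < x.size(); ++i) z[i] = y[i] + y[i];
  return z;
}

// The same subgraph "fused" into one kernel: the intermediate lives in a
// scalar register and is never written to memory.
std::vector<float> run_fused(const std::vector<float>& x) {
  std::vector<float> z(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    float y = std::exp(x[i]);  // intermediate result, never materialized
    z[i] = y + y;
  }
  return z;
}

int main() {
  std::vector<float> x = {0.0f, 1.0f, -2.0f};
  // Fusion must be semantics-preserving: same results either way.
  assert(run_fused(x) == run_unfused(x));
  return 0;
}
```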
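
And a hypothetical sketch of the Eval contract: running a black-box forward produces a closure (the `Handle`) that knows how to compute the backward pass, and the Eval node later applies its inputs to that closure. `Handle` and `black_box_forward` are illustrative names, not the actual autograd types.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// The "Handle": an opaque autograd closure produced by running forward.
using Handle = std::function<std::vector<float>(const std::vector<float>&)>;

// A black-box forward op (here: multiplication).  Its backward is only known
// once forward has run, because the closure must capture the saved inputs.
std::pair<float, Handle> black_box_forward(float a, float b) {
  Handle backward = [a, b](const std::vector<float>& grad_out) {
    // d(a*b)/da = b, d(a*b)/db = a
    return std::vector<float>{grad_out[0] * b, grad_out[0] * a};
  };
  return {a * b, backward};
}

int main() {
  // Forward pass: produces the output and the autograd closure Handle.
  auto result = black_box_forward(3.0f, 4.0f);

  // Eval node, later: apply its inputs (here, the output gradient) to the
  // Handle and return the closure's outputs.
  std::vector<float> grads = result.second({1.0f});
  assert(grads[0] == 4.0f && grads[1] == 3.0f);
  return 0;
}
```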