| # TorchScript serialization |
| |
| This document explains the TorchScript serialization format, and the anatomy |
| of a call to `torch::jit::save()` or `torch::jit::load()`. |
| |
| <!-- toc --> |
| |
| - [Overview](#overview) |
| - [Design Notes](#design-notes) |
| - [`code/`: How code is serialized](#code-how-code-is-serialized) |
| - [Printing code objects as Python source](#printing-code-objects-as-python-source) |
| - [Placing the source code in the archive](#placing-the-source-code-in-the-archive) |
| - [How data is serialized](#how-data-is-serialized) |
| - [`data.pkl`: How module object state is serialized](#datapkl-how-module-object-state-is-serialized) |
| - [`data/`: How tensors are serialized](#data-how-tensors-are-serialized) |
| - [`constants.pkl`: Constants in code](#constantspkl-constants-in-code) |
- [`torch::jit::load()`](#torchjitload)
| - [`__getstate__` and `__setstate__`](#__getstate__-and-__setstate__) |
| - [Appendix: `CompilationUnit` and code object ownership](#appendix-compilationunit-and-code-object-ownership) |
| - [`CompilationUnit` ownership semantics](#compilationunit-ownership-semantics) |
| - [Code object naming](#code-object-naming) |
| |
| <!-- tocstop --> |
| |
| ## Overview |
| |
| A serialized model (call it `model.pt`) is a ZIP archive containing many |
| files. If you want to manually crack it open, you can call `unzip` on it to |
| inspect the file structure directly: |
| |
| ``` |
| $ unzip model.pt |
| Archive: model.pt |
| extracting ... |
| |
| $ tree model/ |
| ├── code/ |
| │ ├── __torch__.py |
| │ ├── __torch__.py.debug_pkl |
| │ ├── foo/ |
| │ │ ├── bar.py |
| │ │ ├── bar.py.debug_pkl |
| ├── data.pkl |
| ├── constants.pkl |
| └── data/ |
| ├── 0 |
| └── 1 |
| ``` |
| |
You'll notice that there are `.py` and `.pkl` files in this archive. That's
because our serialization format tries to mimic Python's. All "code-like"
information (methods, modules, classes, functions) is stored as
human-readable `.py` files containing valid Python syntax, and all "data-like"
information (attributes, objects, etc.) is pickled using a subset of
Python's pickle protocol.
| |
| A model is really a top-level module with some submodules, parameters, and so |
| on depending on what the author needs. So, `data.pkl` contains the pickled |
| top-level module. Deserializing the model is as simple as calling |
| `unpickle()` on `data.pkl`, which will restore the module object state and |
| load its associated code on demand. |
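
Because the container is a standard ZIP file, you can also poke at it from
Python. A minimal sketch (the `model/` member prefix matches the tree above
but is an assumption; real archives may use a different prefix):

```
import zipfile

with zipfile.ZipFile("model.pt") as zf:
    print(zf.namelist())  # e.g. ['model/code/__torch__.py', 'model/data.pkl', ...]
    # The serialized code is plain Python source, readable as-is.
    print(zf.read("model/code/__torch__.py").decode("utf-8"))
```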
| |
| ### Design Notes |
| |
| Some things to keep in mind while working on the serialization code. These |
| may help make technical decisions on which approach to take when making a |
| change. |
| |
| **Do what Python does**. When it comes to the serialized format, it's much |
| simpler in the long-run to be consistent with whatever Python does. A good |
| rule of thumb is: if I tried to interact with serialized artifacts using |
Python, would it work? That is, all serialized code should be valid Python,
and all pickled objects should be loadable by Python's `pickle` module.
| |
| Being consistent with Python means our format is more debuggable (you can |
| always crack it open and poke at it from Python) and leads to fewer surprises |
| for developers familiar with Python but not familiar with TorchScript. |
| |
| **Human readable**. In addition to being valid Python, serialized code should |
| attempt to be readable Python. We should try to preserve the variable names |
| that authors wrote, appropriately inline short expressions, and so on. This |
| helps with debugging the serialized code. |
| |
| **No jitter**. If we do: |
| |
| ``` |
m = MyModule()
m.save("foo.pt")
m_loaded = torch.jit.load("foo.pt")
m_loaded.save("foo2.pt")
m_loaded2 = torch.jit.load("foo2.pt")
| ``` |
| |
| We want the property that `m_loaded` and `m_loaded2` are identical. This |
| "no-jitter" property is useful in catching bugs in the serialization process, |
| and generally is desirable for debugging (models won't drift depending on how |
| many times you saved/loaded them). |
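
One way to test for jitter is to compare the serialized code byte-for-byte
across the two archives. A hedged sketch, assuming that comparing the `.py`
members is enough (a full check would compare the pickled data too):

```
import zipfile

def code_files(path):
    with zipfile.ZipFile(path) as zf:
        # Collect just the serialized Python source members.
        return {name: zf.read(name) for name in zf.namelist() if name.endswith(".py")}

assert code_files("foo.pt") == code_files("foo2.pt")
```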
| |
| **Initial load should be fast**. Calling `load()` should be effectively |
| instantaneous to a human. Anything that takes a long time (reading in tensor |
| data, for example) should be done lazily. |
| |
| ## `code/`: How code is serialized |
| |
| At a high level, code serialization means: |
| |
| 1. Transforming `ClassType`s and `Function`s (called "code objects") into Python source code. |
| 2. Placing the source code in the model ZIP archive. |
| |
| ### Printing code objects as Python source |
| `PythonPrint` is the function that takes as input a `ClassType` or `Function` |
| ("code object") and outputs Python source code. `ScriptModule`s are |
| implemented as class types, so their methods and attributes will get |
| serialized as well. |
| |
`PythonPrint` works by walking a `Graph` (the IR representation of either a
`ClassType`'s method or a raw `Function`) and emitting Python code that
| corresponds to it. The rules for emitting Python code are mostly |
| straightforward and uninteresting. There are some extra pieces of information |
| that `PythonPrint` tracks, however: |
| |
| **Class dependencies**. While walking the graph, `PythonPrint` keeps track of |
| what classes are used in the graph and adds them to a list of classes that |
| the current code object depends on. For example, if we are printing a |
| `Module`, it will depend on its submodules, as well as any classes used in |
| its methods or attributes. |
| |
| **Uses of tensor constants**. Most constants are inlined as literals, like |
| strings or ints. But since tensors are potentially very large, when |
| `PythonPrint` encounters a constant tensor it will emit a reference to a |
| global `CONSTANTS` table (like `foo = CONSTANTS.c0`). |
| |
| When importing, the importer will know how to resolve this reference into an |
actual tensor by looking it up in the tensor table. So `CONSTANTS.c0` means
"this is the 0th tensor in the tensor tuple in `constants.pkl`." See
| [the constants section](#constantspkl-constants-in-code) for more info. |
| |
| **Original source range records**. To aid debugging, `PythonPrint` remembers |
| the "original" (user-written) location of the source code it's emitting. That |
| way, when the user is debugging a model they loaded, they will see |
| diagnostics that point to the code that they actually wrote, rather than the |
| code that `PythonPrint` emitted. |
| |
| The original source range records are pickled and saved in a corresponding |
| `.debug_pkl` file with the same name as the code. You can think of this |
| `.debug_pkl` file as a map between source ranges in the serialized code and |
| the original user-written code. |
| |
| **Module information**. Modules are special in a few ways. First are |
| `Parameter`s: some module attributes are actually `Parameter`s, which have |
| special properties (see [the `torch.nn` |
| documentation](https://pytorch.org/docs/stable/nn.html#parameters) for exact |
| details). We track which attributes are parameters by emitting a special |
| assignment in the class body, like: |
| |
| ``` |
| class MyModule(Module): |
| __parameters__ = ["foo", "bar", ] |
| foo : Tensor |
| bar : Tensor |
| attribute_but_not_param : Tensor |
| ``` |
| |
| Another special thing with modules is that they are typically constructed in |
| Python, and we do not compile the `__init__()` method. So in order to ensure |
| they are statically typed, `PythonPrint` must enumerate a module's attributes |
| (as you can see above), because it can't rely on compiling `__init__()` to |
| infer the attributes. |
| |
| A final special thing is that some modules (like `nn.Sequential`) have |
| attributes that are not valid Python identifiers. We can't write |
| |
| ``` |
| # wrong! |
| class MyModule(Module): |
| 0 : ASubmodule |
| 1 : BSubmodule |
| ``` |
| |
| because this is not valid Python syntax (even though it is legal in Python to |
| have attributes with those names!). So we use a trick where we write directly |
| to the `__annotations__` dict: |
| |
| ``` |
| class MyModule(Module): |
| __annotations__ = [] |
| __annotations__["0"] = ASubmodule |
| __annotations__["1"] = ASubmodule |
| ``` |
| |
| ### Placing the source code in the archive |
| |
| Once all code objects have been `PythonPrint`ed into source strings, we have |
| to figure out where to actually put this source. Explaining this necessitates |
| an introduction to `CompilationUnit` and `QualifiedName`. See the appendix on |
| `CompilationUnit` for more info. |
| |
**`CompilationUnit`**: this is the owning container for all code objects
associated with a given model. When we load, we load all the code objects
into a single `CompilationUnit`.
| |
| **`QualifiedName`**: this is the fully qualified name for a code object. It is |
| similar to qualified names in Python, and looks like `"foo.bar.baz"`. Each |
| code object has a *unique* `QualifiedName` within a `CompilationUnit`. |
| |
| The exporter uses the `QualifiedName` of a code object to determine its |
| location in the `code/` folder. The way it does so is similar to how Python |
| does it; for example, the class `Baz` with a `QualifiedName` `"foo.bar.Baz"` |
| will be placed in `code/foo/bar.py` under the name `Baz`. |
| |
| Classes at the root of the hierarchy are given the qualified name `__torch__` |
| as a prefix, just so that they can go in `__torch__.py`. (Why not `__main__`? |
| Because pickle has weird special rules about things that live in `__main__`). |
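
A minimal sketch (not the exporter's actual code) of the mapping from
`QualifiedName` to archive path:

```
def source_path_for(qual_name):
    # Everything before the final atom becomes the directory structure;
    # the final atom is the name defined inside that file.
    *prefix, name = qual_name.split(".")
    return "code/" + "/".join(prefix) + ".py"

assert source_path_for("foo.bar.Baz") == "code/foo/bar.py"
assert source_path_for("__torch__.MyModule") == "code/__torch__.py"
```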
| |
| That's about it; there's some additional logic to make sure that within a |
| file, we place the classes in reverse-dependency order so that we compile the |
| "leaf" dependencies before things that depend on them. |
| |
| ## How data is serialized |
| |
| A model is really a top-level `ScriptModule` with any number of submodules, |
| parameters, attributes, and so on. We implement a subset of the Pickle format |
| necessary for pickling a module object. |
| |
| `pickle`'s format was chosen due to: |
| |
* **user friendliness** - the attributes file can be loaded in Python with `pickle`
* **size limits** - formats such as Protobuf impose size limits on total
  message size, whereas pickle limits are on individual values (e.g. strings
  cannot be longer than 4 GB)
* **standard format** - `pickle` is a standard Python module with a reasonably
  simple format. The format is a program to be consumed by a stack machine,
  detailed in Python's
  [`pickletools.py`](https://svn.python.org/projects/python/trunk/Lib/pickletools.py)
* **built-in memoization** - for shared reference types (e.g. Tensor, string,
  lists, dicts); see the illustration after this list
* **self describing** - a separate definition file is not needed to understand
  the pickled data
* **eager mode save** - `torch.save()` already produces a `pickle` archive, so
  doing the same with attributes avoids introducing yet another format
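
For example, pickle's memoization means a shared container is written once
and referenced twice, so object identity survives a round trip:

```
import pickle

shared = [1, 2, 3]
obj = {"a": shared, "b": shared}

restored = pickle.loads(pickle.dumps(obj))
# The shared list was memoized, so identity is preserved after unpickling.
assert restored["a"] is restored["b"]
```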
| |
| ### `data.pkl`: How module object state is serialized |
| |
| All data is written into the `data.pkl` file with the exception of tensors |
| (see [the tensor section](#data-how-tensors-are-serialized) below). |
| "Data" means all parts of the module object state, like attributes, |
| submodules, etc. |
| |
PyTorch functions defined in [torch/jit/_pickle.py](../../../jit/_pickle.py)
are used to mark special data types, such as tensor table indices or
specialized lists.
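
Because tensors are referenced through custom persistent ids, a plain
`pickle.load()` of `data.pkl` won't work, but you can still disassemble the
opcode stream without executing it. A hedged sketch (the `model/` member
prefix is an assumption):

```
import pickletools
import zipfile

with zipfile.ZipFile("model.pt") as zf:
    # dis() prints the pickle program without running it, so the custom
    # persistent ids used for tensor references are harmless here.
    pickletools.dis(zf.read("model/data.pkl"))
```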
| |
| ### `data/`: How tensors are serialized |
| |
| During export a list of all the tensors in a model is created. Tensors can |
| come from either module parameters or attributes of Tensor type. |
| |
Tensors are treated differently from other data (which is pickled using the
standard pickling process) for a few reasons:

- Tensors regularly exceed the `pickle` file size limit.
- We'd like to be able to `mmap` Tensors directly (see the sketch after this
  list).
- We'd like to maintain compatibility with regular PyTorch's serialization
  format.
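
A hedged sketch of what this buys us: each entry under `data/` is a raw
storage blob, so it can be read (or `mmap`ed) directly. The member path and
the `float32` dtype below are assumptions for illustration; the real dtype
and shape come from the pickled tensor record:

```
import struct
import zipfile

with zipfile.ZipFile("model.pt") as zf:
    raw = zf.read("model/data/0")  # raw little-endian storage bytes

# Assume float32 storage for illustration only.
floats = struct.unpack("<%df" % (len(raw) // 4), raw)
print(floats[:5])
```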
| |
| ## `constants.pkl`: Constants in code |
| |
The `pickle` format enforces a separation between data and code, which the
TorchScript serialization process mirrors by splitting the archive into
`code/` and `data.pkl` + `data/`.
| |
| However, TorchScript inlines constants (i.e. `prim::Constant` nodes) directly |
| into `code/`. This poses a problem for tensor constants, which are not easily |
| representable in string form. |
| |
| We can't put tensor constants in `data.pkl`, because the source code must be |
| loaded *before* `data.pkl`, and so putting the tensor constants there would |
| create a cyclic loading dependency. |
| |
| We solve this problem by creating a separate `pickle` file called |
| `constants.pkl`, which holds all tensor constants referenced in code. The |
| load order will be explained in the next section. |
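
To peek at `constants.pkl` from Python without reconstructing real tensors,
you can stub out the persistent-id and rebuild machinery. This is a rough
sketch, not a supported API; the member path is an assumption, and the stubs
only need to survive being called during unpickling:

```
import io
import pickle
import zipfile

class PeekUnpickler(pickle.Unpickler):
    def persistent_load(self, pid):
        # pid describes a storage in `data/`; echo it instead of loading.
        return ("storage-ref", pid)

    def find_class(self, module, name):
        if module.startswith("torch"):
            # Stub out tensor-rebuilding helpers so nothing real is built.
            return lambda *args: ("call", f"{module}.{name}", args)
        return super().find_class(module, name)

with zipfile.ZipFile("model.pt") as zf:
    raw = zf.read("model/constants.pkl")
print(PeekUnpickler(io.BytesIO(raw)).load())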
| |
## `torch::jit::load()`
| |
| The load process has the following steps: |
| |
| 1. Unpickle `constants.pkl`, which produces a tuple of all tensor constants |
| referenced in code. |
| 2. Unpickle `data.pkl` into the top-level `Module` and return it. |
| |
The unpickling process consists of a single call to unpickle the module
object contained in `data.pkl`. The `Unpickler` is given a callback that lets it
resolve any qualified names it encounters into `ClassType`s. This is done by
resolving the qualified name to the appropriate file in `code/`, then
compiling that file and returning the appropriate `ClassType`.
| |
| This is why it's important to give code objects unique qualified names in the |
| `CompilationUnit`. That way, every class that `Unpickler` encounters has a |
| deterministic location in `code/` where it is stored. |
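
In Python-flavored pseudocode (every name below is hypothetical; the real
implementation is C++), the resolution flow looks roughly like this:

```
class ToyCompilationUnit:
    def __init__(self, read_file):
        self._read_file = read_file   # e.g. reads one member out of the ZIP
        self._classes = {}            # qualified name -> compiled class

    def resolve(self, qual_name):
        if qual_name not in self._classes:
            # Map the qualified name to its file under code/ and "compile" it.
            *prefix, name = qual_name.split(".")
            src = self._read_file("code/" + "/".join(prefix) + ".py")
            self._classes[qual_name] = ("ClassType", qual_name, src)
        return self._classes[qual_name]

cu = ToyCompilationUnit({"code/__torch__.py": "class MyModule: ..."}.__getitem__)
print(cu.resolve("__torch__.MyModule"))
```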
| |
| `Unpickler` is also responsible for resolving references to tensors into |
| actual `at::Tensor`s. This is done by looking up offsets in the tensor table |
| during the unpickling process, (soon to be replaced with the same pickling |
| strategy as all other data). |
| |
| ## `__getstate__` and `__setstate__` |
| |
| Like in Python's `pickle`, users can customize the pickling behavior of their |
| class or module by implementing `__getstate__()` and `__setstate__()` |
| methods. For basic usage, refer to the relevant [Python |
| docs](https://docs.python.org/3.7/library/pickle.html#pickle-state). |
| |
Calls to `__getstate__` and `__setstate__` are handled transparently by
`Pickler` and `Unpickler`, so the rest of the serialization machinery doesn't
need to worry about them.
| |
One thing worth calling out is that the compiler implements a few special
type inference behaviors to work around the fact that users currently cannot
type annotate `Module`s.
| |
`__getstate__` and `__setstate__` do not require type annotations. For
`__getstate__`, the compiler can fully infer the return type based on what
attributes the user is returning. Then, `__setstate__` simply looks up the
return type of `__getstate__` and uses that as its input type.
| |
| For example: |
| |
| ``` |
| class M(torch.nn.Module): |
| def __init__(self): |
| self.a = torch.rand(2, 3) |
| self.b = torch.nn.Linear(10, 10) |
| |
| def __getstate__(self): |
| # Compiler infers that this is a tuple of (Tensor, Linear) |
| return (self.a, self.b) |
| |
| def __setstate__(self, state): |
| # Don't need to annotate this, we know what type `state` is! |
| self.a = state[0] |
| self.b = state[1] |
| ``` |
| |
| ## Appendix: `CompilationUnit` and code object ownership |
| `CompilationUnit` performs two functions: |
| |
| 1. It is the owner (in a C++ sense) for all code objects. |
| 2. It forms a namespace in which code objects must have unique names. |
| |
Whenever `torch::jit::load()` is invoked, a fresh `CompilationUnit` is
created to hold the newly deserialized code objects. In Python, there is a
single global `CompilationUnit` that holds all code objects defined in
Python.
| |
| ### `CompilationUnit` ownership semantics |
There are a few different entities that participate in the ownership model:

**`CompilationUnit`**: A container that owns code objects and gives them
names. Every code object has a unique qualified name within the
`CompilationUnit`.
| |
There are two kinds of code objects: `Function`s and `ClassType`s.

**`Function`**: A `Graph` with an associated executor. The `Graph` may own
`ClassType`s, since some `Value`s hold a `shared_ptr` to their type (for
now). The `Graph` may also weakly reference other `Function`s through
function calls.
| |
**`ClassType`**: A definition of a type. This could refer to a user-defined
TorchScript class or a `ScriptModule`. Owns its attribute types (including
other `ClassType`s). Weakly references the class's methods (`Function`s).
| |
**`Object`**: An instance of a particular class. Owns the `CompilationUnit`
that owns its `ClassType`. This is to ensure that if the user passes the
object around in C++, all its code will stay around and its methods will be
invokable.
| |
| **`Module`**: A view over a `ClassType` and the `Object` that holds its state. |
| Also responsible for turning unqualified names (e.g. `forward()`) into |
| qualified ones for lookup in the owning `CompilationUnit` (e.g. |
| `__torch__.MyModule.forward`). Owns the `Object`, which transitively owns the |
| `CompilationUnit`. |
| |
| **`Method`**: A tuple of `(Module, Function)`. |
| |
| ### Code object naming |
| |
| `CompilationUnit` maintains a namespace in which all code objects |
| (`ClassType`s and `Function`s) are uniquely named. These names don't have any |
| particular meaning, except that they uniquely identify a code object during |
| serialization and deserialization. The basic naming scheme is: |
| |
| * Everything starts in the `__torch__` namespace. |
| * Classes are named parallel to Python’s module namespacing: so class `Bar` in |
| `foo.py` would become `__torch__.foo.Bar`. |
| * Methods are attached to the module’s namespace. So `Bar.forward()` would be |
| `__torch__.foo.Bar.forward`. |
| |
| There are some caveats: |
| |
**Some `CompilationUnit`s have no prefix**: For testing and other internal
purposes, occasionally it's useful to have no prefixes on names. In this
case, everything is just a bare name inside the `CompilationUnit`. Users
cannot construct `CompilationUnit`s that look like this.
| |
| **Name mangling**: In Python, we can construct code objects that have the same |
| qualified name. There are two cases where this happens: |
| |
| 1. For `ScriptModule`s, since every `ScriptModule` is a singleton class in |
| the JIT, a user that is constructing multiple `ScriptModule`s will create |
| multiple corresponding `ClassType`s with identical names. |
| 2. Nesting functions will also cause qualified name clashes, due to |
| limitations in Python. In these cases, we mangle the names of the code |
| objects before they are placed in the global Python `CompilationUnit`. |
| |
| The rules for mangling are simple. Say we have a qualified name `__torch__.foo.Bar`: |
| |
| ``` |
| __torch__.foo.Bar # first time, unchanged |
| __torch__.foo.__torch_mangle_0.Bar # second time, when we request a mangle |
| __torch__.foo.__torch_mangle_1.Bar # and so on |
| ``` |
| |
| Notice that we mangle the namespace before `Bar`. This is so that when we |
| pretty-print code, the unqualified name (`Bar`) is unchanged. This is a |
| useful property so that things like trace-checking are oblivious to the |
| mangling. |
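
A minimal runnable sketch of this scheme (a hypothetical helper, not the
JIT's actual mangler):

```
import itertools

_mangle_counter = itertools.count()

def mangle(qual_name):
    # Insert the mangle namespace just before the base name, so the
    # unqualified name (`Bar`) is unchanged.
    prefix, _, base = qual_name.rpartition(".")
    return f"{prefix}.__torch_mangle_{next(_mangle_counter)}.{base}"

assert mangle("__torch__.foo.Bar") == "__torch__.foo.__torch_mangle_0.Bar"
assert mangle("__torch__.foo.Bar") == "__torch__.foo.__torch_mangle_1.Bar"
```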