docs/source/meta.rst - platform/external/pytorch - Git at Google

 Meta device
 ============

 The "meta" device is an abstract device which denotes a tensor which records
 only metadata, but no actual data.  Meta tensors have two primary use cases:

 * Models can be loaded on the meta device, allowing you to load a
   representation of the model without actually loading the actual parameters
   into memory.  This can be helpful if you need to make transformations on
   the model before you load the actual data.

 * Most operations can be performed on meta tensors, producing new meta
   tensors that describe what the result would have been if you performed
   the operation on a real tensor.  You can use this to perform abstract
   analysis without needing to spend time on compute or space to represent
   the actual tensors.  Because meta tensors do not have real data, you cannot
   perform data-dependent operations like :func:`torch.nonzero` or
   :meth:`~torch.Tensor.item`.  In some cases, not all device types (e.g., CPU
   and CUDA) have exactly the same output metadata for an operation; we
   typically prefer representing the CUDA behavior faithfully in this
   situation.

 .. warning::

     Although in principle meta tensor computation should always be faster than
     an equivalent CPU/CUDA computation, many meta tensor implementations are
     implemented in Python and have not been ported to C++ for speed, so you
     may find that you get lower absolute framework latency with small CPU tensors.

 Idioms for working with meta tensors
 ------------------------------------

 An object can be loaded with :func:`torch.load` onto meta device by specifying
 ``map_location='meta'``::

     >>> torch.save(torch.randn(2), 'foo.pt')
     >>> torch.load('foo.pt', map_location='meta')
     tensor(..., device='meta', size=(2,))

 If you have some arbitrary code which performs some tensor construction without
 explicitly specifying a device, you can override it to instead construct on meta device by using
 the :func:`torch.device` context manager::

     >>> with torch.device('meta'):
     ...     print(torch.randn(30, 30))
     ...
     tensor(..., device='meta', size=(30, 30))

 This is especially helpful NN module construction, where you often are not
 able to explicitly pass in a device for initialization::

     >>> from torch.nn.modules import Linear
     >>> with torch.device('meta'):
     ...     print(Linear(20, 30))
     ...
     Linear(in_features=20, out_features=30, bias=True)

 You cannot convert a meta tensor directly to a CPU/CUDA tensor, because the
 meta tensor stores no data and we do not know what the correct data values for
 your new tensor are::

     >>> torch.ones(5, device='meta').to("cpu")
     Traceback (most recent call last):
       File "<stdin>", line 1, in <module>
     NotImplementedError: Cannot copy out of meta tensor; no data!

 Use a factory function like :func:`torch.empty_like` to explicitly specify how
 you would like the missing data to be filled in.

 NN modules have a convenience method :meth:`torch.nn.Module.to_empty` that
 allow you to the module to another device, leaving all parameters
 uninitialized.  You are expected to explicitly reinitialize the parameters
 manually::

     >>> from torch.nn.modules import Linear
     >>> with torch.device('meta'):
     ...     m = Linear(20, 30)
     >>> m.to_empty(device="cpu")
     Linear(in_features=20, out_features=30, bias=True)

 :mod:`torch._subclasses.meta_utils` contains undocumented utilities for taking
 an arbitrary Tensor and constructing an equivalent meta Tensor with high
 fidelity.  These APIs are experimental and may be changed in a BC breaking way
 at any time.
	Meta device
	============

	The "meta" device is an abstract device which denotes a tensor which records
	only metadata, but no actual data. Meta tensors have two primary use cases:

	* Models can be loaded on the meta device, allowing you to load a
	representation of the model without actually loading the actual parameters
	into memory. This can be helpful if you need to make transformations on
	the model before you load the actual data.

	* Most operations can be performed on meta tensors, producing new meta
	tensors that describe what the result would have been if you performed
	the operation on a real tensor. You can use this to perform abstract
	analysis without needing to spend time on compute or space to represent
	the actual tensors. Because meta tensors do not have real data, you cannot
	perform data-dependent operations like :func:`torch.nonzero` or
	:meth:`~torch.Tensor.item`. In some cases, not all device types (e.g., CPU
	and CUDA) have exactly the same output metadata for an operation; we
	typically prefer representing the CUDA behavior faithfully in this
	situation.

	.. warning::

	Although in principle meta tensor computation should always be faster than
	an equivalent CPU/CUDA computation, many meta tensor implementations are
	implemented in Python and have not been ported to C++ for speed, so you
	may find that you get lower absolute framework latency with small CPU tensors.

	Idioms for working with meta tensors
	------------------------------------

	An object can be loaded with :func:`torch.load` onto meta device by specifying
	``map_location='meta'``::

	>>> torch.save(torch.randn(2), 'foo.pt')
	>>> torch.load('foo.pt', map_location='meta')
	tensor(..., device='meta', size=(2,))

	If you have some arbitrary code which performs some tensor construction without
	explicitly specifying a device, you can override it to instead construct on meta device by using
	the :func:`torch.device` context manager::

	>>> with torch.device('meta'):
	... print(torch.randn(30, 30))
	...
	tensor(..., device='meta', size=(30, 30))

	This is especially helpful NN module construction, where you often are not
	able to explicitly pass in a device for initialization::

	>>> from torch.nn.modules import Linear
	>>> with torch.device('meta'):
	... print(Linear(20, 30))
	...
	Linear(in_features=20, out_features=30, bias=True)

	You cannot convert a meta tensor directly to a CPU/CUDA tensor, because the
	meta tensor stores no data and we do not know what the correct data values for
	your new tensor are::

	>>> torch.ones(5, device='meta').to("cpu")
	Traceback (most recent call last):
	File "<stdin>", line 1, in <module>
	NotImplementedError: Cannot copy out of meta tensor; no data!

	Use a factory function like :func:`torch.empty_like` to explicitly specify how
	you would like the missing data to be filled in.

	NN modules have a convenience method :meth:`torch.nn.Module.to_empty` that
	allow you to the module to another device, leaving all parameters
	uninitialized. You are expected to explicitly reinitialize the parameters
	manually::

	>>> from torch.nn.modules import Linear
	>>> with torch.device('meta'):
	... m = Linear(20, 30)
	>>> m.to_empty(device="cpu")
	Linear(in_features=20, out_features=30, bias=True)

	:mod:`torch._subclasses.meta_utils` contains undocumented utilities for taking
	an arbitrary Tensor and constructing an equivalent meta Tensor with high
	fidelity. These APIs are experimental and may be changed in a BC breaking way
	at any time.