torch.utils.data
===================================

.. automodule:: torch.utils.data

At the heart of PyTorch's data loading utility is the :class:`torch.utils.data.DataLoader`
class. It represents a Python iterable over a dataset, with support for

* `map-style and iterable-style datasets <Dataset Types_>`_,

* `customizing data loading order <Data Loading Order and Sampler_>`_,

* `automatic batching <Loading Batched and Non-Batched Data_>`_,

* `single- and multi-process data loading <Single- and Multi-process Data Loading_>`_,

* `automatic memory pinning <Memory Pinning_>`_.

These options are configured by the constructor arguments of a
:class:`~torch.utils.data.DataLoader`, which has signature::

    DataLoader(dataset, batch_size=1, shuffle=False, sampler=None,
               batch_sampler=None, num_workers=0, collate_fn=None,
               pin_memory=False, drop_last=False, timeout=0,
               worker_init_fn=None, *, prefetch_factor=2,
               persistent_workers=False)

The sections below describe in detail the effects and usage of these options.

Dataset Types
-------------

The most important argument of the :class:`~torch.utils.data.DataLoader`
constructor is :attr:`dataset`, which indicates a dataset object to load data
from. PyTorch supports two different types of datasets:

* `map-style datasets <Map-style datasets_>`_,

* `iterable-style datasets <Iterable-style datasets_>`_.

Map-style datasets
^^^^^^^^^^^^^^^^^^

A map-style dataset is one that implements the :meth:`__getitem__` and
:meth:`__len__` protocols, and represents a map from (possibly non-integral)
indices/keys to data samples.

For example, such a dataset, when accessed with ``dataset[idx]``, could read
the ``idx``-th image and its corresponding label from a folder on the disk.

See :class:`~torch.utils.data.Dataset` for more details.

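
For illustration, the following is a minimal sketch of a map-style dataset
serving in-memory tensors (the ``PairDataset`` name and data are hypothetical;
a real dataset might read files from disk instead)::

    import torch
    from torch.utils.data import Dataset

    class PairDataset(Dataset):
        """A minimal map-style dataset: implements __getitem__ and __len__."""

        def __init__(self, features, labels):
            assert len(features) == len(labels)
            self.features = features
            self.labels = labels

        def __len__(self):
            # Number of samples; used by len(dataset) and by samplers.
            return len(self.labels)

        def __getitem__(self, idx):
            # Map an integer index to a (sample, label) pair.
            return self.features[idx], self.labels[idx]

    dataset = PairDataset(torch.randn(8, 3), torch.arange(8))
    sample, label = dataset[2]  # random access by index
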
Iterable-style datasets
^^^^^^^^^^^^^^^^^^^^^^^

An iterable-style dataset is an instance of a subclass of :class:`~torch.utils.data.IterableDataset`
that implements the :meth:`__iter__` protocol, and represents an iterable over
data samples. This type of dataset is particularly suitable for cases where
random reads are expensive or even impractical, and where the batch size depends
on the fetched data.

For example, when ``iter(dataset)`` is called, such a dataset could return a
stream of data read from a database, a remote server, or even logs generated
in real time.

See :class:`~torch.utils.data.IterableDataset` for more details.

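
As a minimal sketch (the ``CountingStream`` name and the in-memory range are
illustrative; a real dataset might wrap a database cursor or a socket)::

    from torch.utils.data import IterableDataset

    class CountingStream(IterableDataset):
        """A minimal iterable-style dataset: implements __iter__ only."""

        def __init__(self, start, end):
            self.start = start
            self.end = end

        def __iter__(self):
            # Yield samples sequentially; no random access or __len__ required.
            return iter(range(self.start, self.end))

    stream = CountingStream(3, 7)
    print(list(stream))  # [3, 4, 5, 6]
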
.. note:: When using an :class:`~torch.utils.data.IterableDataset` with
          `multi-process data loading <Multi-process data loading_>`_, the same
          dataset object is replicated on each worker process, and thus the
          replicas must be configured differently to avoid duplicated data. See
          the :class:`~torch.utils.data.IterableDataset` documentation for how
          to achieve this.

Data Loading Order and :class:`~torch.utils.data.Sampler`
----------------------------------------------------------

For `iterable-style datasets <Iterable-style datasets_>`_, data loading order
is entirely controlled by the user-defined iterable. This allows easier
implementations of chunk-reading and dynamic batch size (e.g., by yielding a
batched sample each time).

The rest of this section concerns the case with
`map-style datasets <Map-style datasets_>`_. :class:`torch.utils.data.Sampler`
classes are used to specify the sequence of indices/keys used in data loading.
They represent iterable objects over the indices of datasets.  E.g., in the
common case with stochastic gradient descent (SGD), a
:class:`~torch.utils.data.Sampler` could randomly permute a list of indices
and yield each one at a time, or yield a small number of them for mini-batch
SGD.

A sequential or shuffled sampler will be automatically constructed based on
the :attr:`shuffle` argument to a :class:`~torch.utils.data.DataLoader`.
Alternatively, users may use the :attr:`sampler` argument to specify a
custom :class:`~torch.utils.data.Sampler` object that each time yields
the next index/key to fetch.

A custom :class:`~torch.utils.data.Sampler` that yields a list of batch
indices at a time can be passed as the :attr:`batch_sampler` argument.
Automatic batching can also be enabled via the :attr:`batch_size` and
:attr:`drop_last` arguments. See
`the next section <Loading Batched and Non-Batched Data_>`_ for more details
on this.

.. note::
  Neither :attr:`sampler` nor :attr:`batch_sampler` is compatible with
  iterable-style datasets, since such datasets have no notion of a key or an
  index.

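
For example, a sketch of a custom sampler that yields indices in reverse
order (the ``ReverseSampler`` name is hypothetical)::

    import torch
    from torch.utils.data import DataLoader, Sampler, TensorDataset

    class ReverseSampler(Sampler):
        """Yields indices len(data_source) - 1, ..., 1, 0."""

        def __init__(self, data_source):
            self.data_source = data_source

        def __iter__(self):
            return iter(range(len(self.data_source) - 1, -1, -1))

        def __len__(self):
            return len(self.data_source)

    dataset = TensorDataset(torch.arange(6))
    loader = DataLoader(dataset, sampler=ReverseSampler(dataset))
    # With the default batch_size=1, samples arrive as 5, 4, ..., 0.
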
Loading Batched and Non-Batched Data
------------------------------------

:class:`~torch.utils.data.DataLoader` supports automatically collating
individual fetched data samples into batches via the arguments
:attr:`batch_size`, :attr:`drop_last`, :attr:`batch_sampler`, and
:attr:`collate_fn` (which has a default function).


Automatic batching (default)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This is the most common case, and corresponds to fetching a minibatch of
data and collating them into batched samples, i.e., containing Tensors with
one dimension being the batch dimension (usually the first).

When :attr:`batch_size` (default ``1``) is not ``None``, the data loader yields
batched samples instead of individual samples. The :attr:`batch_size` and
:attr:`drop_last` arguments are used to specify how the data loader obtains
batches of dataset keys. For map-style datasets, users can alternatively
specify :attr:`batch_sampler`, which yields a list of keys at a time.

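
For example, with 10 samples and ``batch_size=3`` the loader yields batches of
sizes 3, 3, 3, 1, and setting ``drop_last=True`` discards the final incomplete
batch (a small runnable sketch)::

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    dataset = TensorDataset(torch.arange(10))

    loader = DataLoader(dataset, batch_size=3)  # drop_last defaults to False
    print([batch[0].shape[0] for batch in loader])  # [3, 3, 3, 1]

    loader = DataLoader(dataset, batch_size=3, drop_last=True)
    print([batch[0].shape[0] for batch in loader])  # [3, 3, 3]
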
.. note::
  The :attr:`batch_size` and :attr:`drop_last` arguments are essentially used
  to construct a :attr:`batch_sampler` from :attr:`sampler`. For map-style
  datasets, the :attr:`sampler` is either provided by the user or constructed
  based on the :attr:`shuffle` argument. For iterable-style datasets, the
  :attr:`sampler` is a dummy infinite one. See
  `this section <Data Loading Order and Sampler_>`_ for more details on
  samplers.

.. note::
  When fetching from
  `iterable-style datasets <Iterable-style datasets_>`_ with
  `multi-processing <Multi-process data loading_>`_, the :attr:`drop_last`
  argument drops the last non-full batch of each worker's dataset replica.

After fetching a list of samples using the indices from the sampler, the function
passed as the :attr:`collate_fn` argument is used to collate lists of samples
into batches.

In this case, loading from a map-style dataset is roughly equivalent to::

    for indices in batch_sampler:
        yield collate_fn([dataset[i] for i in indices])

and loading from an iterable-style dataset is roughly equivalent to::

    dataset_iter = iter(dataset)
    for indices in batch_sampler:
        yield collate_fn([next(dataset_iter) for _ in indices])

A custom :attr:`collate_fn` can be used to customize collation, e.g., padding
sequential data to the max length of a batch. See
`this section <dataloader-collate_fn_>`_ for more about :attr:`collate_fn`.

Disable automatic batching
^^^^^^^^^^^^^^^^^^^^^^^^^^

In certain cases, users may want to handle batching manually in dataset code,
or simply load individual samples. For example, it could be cheaper to directly
load batched data (e.g., bulk reads from a database or reading continuous
chunks of memory), or the batch size may be data dependent, or the program may
be designed to work on individual samples.  Under these scenarios, it is likely
better not to use automatic batching (where :attr:`collate_fn` is used to
collate the samples), but to let the data loader directly return each member of
the :attr:`dataset` object.

When both :attr:`batch_size` and :attr:`batch_sampler` are ``None`` (the default
value for :attr:`batch_sampler` is already ``None``), automatic batching is
disabled. Each sample obtained from the :attr:`dataset` is processed with the
function passed as the :attr:`collate_fn` argument.

**When automatic batching is disabled**, the default :attr:`collate_fn` simply
converts NumPy arrays into PyTorch Tensors, and keeps everything else untouched.

In this case, loading from a map-style dataset is roughly equivalent to::

    for index in sampler:
        yield collate_fn(dataset[index])

and loading from an iterable-style dataset is roughly equivalent to::

    for data in iter(dataset):
        yield collate_fn(data)

See `this section <dataloader-collate_fn_>`_ for more about :attr:`collate_fn`.

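
As a minimal sketch of this mode, passing ``batch_size=None`` makes the loader
yield one dataset element per iteration::

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    dataset = TensorDataset(torch.arange(4))

    # batch_size=None disables automatic batching; each element of the
    # dataset is yielded individually, with no added batch dimension.
    loader = DataLoader(dataset, batch_size=None)
    for sample in loader:
        print(sample)  # (tensor(0),), then (tensor(1),), ...
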
.. _dataloader-collate_fn:

Working with :attr:`collate_fn`
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The use of :attr:`collate_fn` is slightly different depending on whether
automatic batching is enabled or disabled.

**When automatic batching is disabled**, :attr:`collate_fn` is called with
each individual data sample, and the output is yielded from the data loader
iterator. In this case, the default :attr:`collate_fn` simply converts NumPy
arrays into PyTorch Tensors.

**When automatic batching is enabled**, :attr:`collate_fn` is called with a list
of data samples each time. It is expected to collate the input samples into
a batch for yielding from the data loader iterator. The rest of this section
describes the behavior of the default :attr:`collate_fn`
(:func:`~torch.utils.data.default_collate`).

For instance, if each data sample consists of a 3-channel image and an integral
class label, i.e., each element of the dataset returns a tuple
``(image, class_index)``, the default :attr:`collate_fn` collates a list of
such tuples into a single tuple of a batched image Tensor and a batched class
label Tensor. In particular, the default :attr:`collate_fn` has the following
properties:

* It always prepends a new dimension as the batch dimension.

* It automatically converts NumPy arrays and Python numerical values into
  PyTorch Tensors.

* It preserves the data structure, e.g., if each sample is a dictionary, it
  outputs a dictionary with the same set of keys but batched Tensors as values
  (or lists if the values cannot be converted into Tensors). The same holds
  for ``list`` s, ``tuple`` s, ``namedtuple`` s, etc.

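
These properties can be observed by calling
:func:`~torch.utils.data.default_collate` directly on a small list of
dictionary samples::

    import torch
    from torch.utils.data import default_collate

    samples = [
        {"image": torch.zeros(3, 4, 4), "label": 0},
        {"image": torch.ones(3, 4, 4), "label": 1},
    ]
    batch = default_collate(samples)
    print(batch["image"].shape)  # torch.Size([2, 3, 4, 4]) -- new batch dimension
    print(batch["label"])        # tensor([0, 1]) -- Python ints became a Tensor
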
Users may use customized :attr:`collate_fn` to achieve custom batching, e.g.,
collating along a dimension other than the first, padding sequences of
various lengths, or adding support for custom data types.

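
As a sketch of the padding case, assuming each sample is a 1-D Tensor of
varying length, a custom :attr:`collate_fn` might pad every sequence to the
longest one in the batch::

    import torch
    from torch.nn.utils.rnn import pad_sequence
    from torch.utils.data import DataLoader

    def pad_collate(batch):
        # batch is a list of 1-D Tensors of varying lengths; pad_sequence
        # right-pads them with zeros into one (batch, max_len) Tensor.
        lengths = torch.tensor([len(seq) for seq in batch])
        return pad_sequence(batch, batch_first=True), lengths

    sequences = [torch.arange(n) for n in (2, 5, 3, 4)]  # a plain list is a valid dataset
    loader = DataLoader(sequences, batch_size=2, collate_fn=pad_collate)
    for padded, lengths in loader:
        print(padded.shape, lengths)  # first batch: torch.Size([2, 5]) tensor([2, 5])
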
If you run into a situation where the outputs of :class:`~torch.utils.data.DataLoader`
have dimensions or types different from your expectations, you may
want to check your :attr:`collate_fn`.

Single- and Multi-process Data Loading
--------------------------------------

A :class:`~torch.utils.data.DataLoader` uses single-process data loading by
default.

Within a Python process, the
`Global Interpreter Lock (GIL) <https://wiki.python.org/moin/GlobalInterpreterLock>`_
prevents fully parallelizing Python code across threads. To avoid blocking
computation code with data loading, PyTorch provides an easy switch to perform
multi-process data loading by simply setting the argument :attr:`num_workers`
to a positive integer.

Single-process data loading (default)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this mode, data fetching is done in the same process in which a
:class:`~torch.utils.data.DataLoader` is initialized.  Therefore, data loading
may block computing.  However, this mode may be preferred when the resources
used for sharing data among processes (e.g., shared memory, file descriptors)
are limited, or when the entire dataset is small and can be loaded entirely in
memory.  Additionally, single-process loading often shows more readable error
traces and thus is useful for debugging.


Multi-process data loading
^^^^^^^^^^^^^^^^^^^^^^^^^^

Setting the argument :attr:`num_workers` to a positive integer will
turn on multi-process data loading with the specified number of loader worker
processes.

.. warning::
  After several iterations, the loader worker processes will consume
  the same amount of CPU memory as the parent process for all Python
  objects in the parent process which are accessed from the worker
  processes.  This can be problematic if the Dataset contains a lot of
  data (e.g., you are loading a very large list of filenames at Dataset
  construction time) and/or you are using a lot of workers (overall
  memory usage is ``number of workers * size of parent process``).  The
  simplest workaround is to replace Python objects with non-refcounted
  representations such as Pandas, NumPy or PyArrow objects.  Check out
  `issue #13246
  <https://github.com/pytorch/pytorch/issues/13246#issuecomment-905703662>`_
  for more details on why this occurs and for example code showing how to
  work around these problems.

In this mode, each time an iterator of a :class:`~torch.utils.data.DataLoader`
is created (e.g., when you call ``enumerate(dataloader)``), :attr:`num_workers`
worker processes are created. At this point, the :attr:`dataset`,
:attr:`collate_fn`, and :attr:`worker_init_fn` are passed to each
worker, where they are used to initialize and fetch data. This means that
dataset access, together with its internal I/O and transforms
(including :attr:`collate_fn`), runs in the worker process.

:func:`torch.utils.data.get_worker_info()` returns various useful information
in a worker process (including the worker id, dataset replica, initial seed,
etc.), and returns ``None`` in the main process. Users may use this function in
dataset code and/or :attr:`worker_init_fn` to individually configure each
dataset replica, and to determine whether the code is running in a worker
process. For example, this can be particularly helpful in sharding the dataset.

For map-style datasets, the main process generates the indices using
:attr:`sampler` and sends them to the workers. So any shuffle randomization is
done in the main process, which guides loading by assigning indices to load.

For iterable-style datasets, since each worker process gets a replica of the
:attr:`dataset` object, naive multi-process loading will often result in
duplicated data. Using :func:`torch.utils.data.get_worker_info()` and/or
:attr:`worker_init_fn`, users may configure each replica independently. (See
the :class:`~torch.utils.data.IterableDataset` documentation for how to achieve
this.) For similar reasons, in multi-process loading, the :attr:`drop_last`
argument drops the last non-full batch of each worker's iterable-style dataset
replica.

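
For example, the following sketch of an iterable-style dataset uses
:func:`~torch.utils.data.get_worker_info` to split its work range across
workers so that each index is produced exactly once::

    import math
    from torch.utils.data import DataLoader, IterableDataset, get_worker_info

    class RangeStream(IterableDataset):
        def __init__(self, start, end):
            self.start = start
            self.end = end

        def __iter__(self):
            info = get_worker_info()
            if info is None:
                # Single-process loading: this replica handles the full range.
                iter_start, iter_end = self.start, self.end
            else:
                # Split the range into num_workers contiguous shards.
                per_worker = int(math.ceil((self.end - self.start) / info.num_workers))
                iter_start = self.start + info.id * per_worker
                iter_end = min(iter_start + per_worker, self.end)
            return iter(range(iter_start, iter_end))

    # On platforms that use spawn, guard this with ``if __name__ == '__main__':``.
    print(list(DataLoader(RangeStream(0, 7), num_workers=2)))
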
Workers are shut down once the end of the iteration is reached, or when the
iterator becomes garbage collected.

.. warning::
  It is generally not recommended to return CUDA tensors in multi-process
  loading because of many subtleties in using CUDA and sharing CUDA tensors in
  multiprocessing (see :ref:`multiprocessing-cuda-note`). Instead, we recommend
  using `automatic memory pinning <Memory Pinning_>`_ (i.e., setting
  :attr:`pin_memory=True`), which enables fast data transfer to CUDA-enabled
  GPUs.

Platform-specific behaviors
"""""""""""""""""""""""""""

Since workers rely on Python :py:mod:`multiprocessing`, worker launch behavior is
different on Windows compared to Unix.

* On Unix, :func:`fork()` is the default :py:mod:`multiprocessing` start method.
  Using :func:`fork`, child workers typically can access the :attr:`dataset` and
  Python argument functions directly through the cloned address space.

* On Windows or macOS, :func:`spawn()` is the default :py:mod:`multiprocessing` start method.
  Using :func:`spawn()`, another interpreter is launched which runs your main script,
  followed by the internal worker function that receives the :attr:`dataset`,
  :attr:`collate_fn` and other arguments through :py:mod:`pickle` serialization.

This separate serialization means that you should take two steps to ensure you
are compatible with Windows while using multi-process data loading:

- Wrap most of your main script's code within an ``if __name__ == '__main__':`` block,
  to make sure it doesn't run again (most likely causing an error) when each worker
  process is launched. You can place your dataset and :class:`~torch.utils.data.DataLoader`
  instance creation logic here, as it doesn't need to be re-executed in workers;
  a full layout is sketched after this list.

- Make sure that any custom :attr:`collate_fn`, :attr:`worker_init_fn`
  or :attr:`dataset` code is declared as a top-level definition, outside of the
  ``__main__`` check. This ensures that they are available in worker processes.
  (This is needed since functions are pickled as references only, not as bytecode.)

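
A minimal sketch of such a script layout (the ``MyDataset`` name and sizes are
placeholders)::

    import torch
    from torch.utils.data import DataLoader, Dataset

    class MyDataset(Dataset):
        # Top-level definition, so workers can unpickle a reference to it.
        def __init__(self, n):
            self.n = n

        def __len__(self):
            return self.n

        def __getitem__(self, idx):
            return torch.tensor(idx)

    if __name__ == '__main__':
        # Runs only in the main process, not when workers re-import the script.
        loader = DataLoader(MyDataset(16), batch_size=4, num_workers=2)
        for batch in loader:
            print(batch)
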
.. _data-loading-randomness:

Randomness in multi-process data loading
""""""""""""""""""""""""""""""""""""""""""

By default, each worker will have its PyTorch seed set to ``base_seed + worker_id``,
where ``base_seed`` is a long generated by the main process using its RNG (thereby
consuming an RNG state mandatorily) or a specified :attr:`generator`. However, seeds
for other libraries may be duplicated upon initializing workers, causing each worker
to return identical random numbers. (See :ref:`this section <dataloader-workers-random-seed>`
in the FAQ.)

In :attr:`worker_init_fn`, you may access the PyTorch seed set for each worker
with either :func:`torch.utils.data.get_worker_info().seed <torch.utils.data.get_worker_info>`
or :func:`torch.initial_seed()`, and use it to seed other libraries before data
loading.

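
For example, a :attr:`worker_init_fn` along these lines derives per-worker
seeds for NumPy and Python's ``random`` module (a sketch; adapt it to the
libraries you actually use)::

    import random

    import numpy as np
    import torch

    def seed_worker(worker_id):
        # torch.initial_seed() returns this worker's seed (base_seed + worker_id);
        # fold it into the 32-bit range accepted by NumPy.
        worker_seed = torch.initial_seed() % 2**32
        np.random.seed(worker_seed)
        random.seed(worker_seed)

    # Usage: DataLoader(dataset, num_workers=2, worker_init_fn=seed_worker)
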
Memory Pinning
--------------

Host to GPU copies are much faster when they originate from pinned (page-locked)
memory. See :ref:`cuda-memory-pinning` for more details on when and how to use
pinned memory generally.

For data loading, passing :attr:`pin_memory=True` to a
:class:`~torch.utils.data.DataLoader` will automatically put the fetched data
Tensors in pinned memory, and thus enables faster data transfer to CUDA-enabled
GPUs.

The default memory pinning logic only recognizes Tensors and maps and iterables
containing Tensors.  By default, if the pinning logic sees a batch that is a
custom type (which will occur if you have a :attr:`collate_fn` that returns a
custom batch type), or if each element of your batch is a custom type, the
pinning logic will not recognize them, and it will return that batch (or those
elements) without pinning the memory.  To enable memory pinning for custom
batch or data type(s), define a :meth:`pin_memory` method on your custom
type(s).

See the example below.

Example::

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    class SimpleCustomBatch:
        def __init__(self, data):
            transposed_data = list(zip(*data))
            self.inp = torch.stack(transposed_data[0], 0)
            self.tgt = torch.stack(transposed_data[1], 0)

        # custom memory pinning method on custom type
        def pin_memory(self):
            self.inp = self.inp.pin_memory()
            self.tgt = self.tgt.pin_memory()
            return self

    def collate_wrapper(batch):
        return SimpleCustomBatch(batch)

    inps = torch.arange(10 * 5, dtype=torch.float32).view(10, 5)
    tgts = torch.arange(10 * 5, dtype=torch.float32).view(10, 5)
    dataset = TensorDataset(inps, tgts)

    loader = DataLoader(dataset, batch_size=2, collate_fn=collate_wrapper,
                        pin_memory=True)

    for batch_ndx, sample in enumerate(loader):
        print(sample.inp.is_pinned())
        print(sample.tgt.is_pinned())


.. autoclass:: DataLoader
.. autoclass:: Dataset
.. autoclass:: IterableDataset
.. autoclass:: TensorDataset
.. autoclass:: ConcatDataset
.. autoclass:: ChainDataset
.. autoclass:: Subset
.. autofunction:: torch.utils.data._utils.collate.collate
.. autofunction:: torch.utils.data.default_collate
.. autofunction:: torch.utils.data.default_convert
.. autofunction:: torch.utils.data.get_worker_info
.. autofunction:: torch.utils.data.random_split
.. autoclass:: torch.utils.data.Sampler
.. autoclass:: torch.utils.data.SequentialSampler
.. autoclass:: torch.utils.data.RandomSampler
.. autoclass:: torch.utils.data.SubsetRandomSampler
.. autoclass:: torch.utils.data.WeightedRandomSampler
.. autoclass:: torch.utils.data.BatchSampler
.. autoclass:: torch.utils.data.distributed.DistributedSampler


.. These modules are documented as part of torch/data; listing them here for
.. now until we have a clearer fix
.. py:module:: torch.utils.data.datapipes
.. py:module:: torch.utils.data.datapipes.dataframe
.. py:module:: torch.utils.data.datapipes.iter
.. py:module:: torch.utils.data.datapipes.map
.. py:module:: torch.utils.data.datapipes.utils