Fix batch norm multiplier init (#12325)
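Summary:
Initialize the batch norm affine weight (gamma) to ones instead of sampling it from U(0, 1), and update the batch norm docstrings to describe the new default.
Fixes #12259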
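
To illustrate the effect of the new default, here is a minimal pure-Python sketch of the batch norm transform for a single channel. It is not the PyTorch implementation: the helpers `batch_norm_1d` and `std` and the sample data are hypothetical, written only for this illustration. With gamma = 1 and beta = 0 the normalized activations keep unit standard deviation, whereas a gamma drawn from U(0, 1) shrinks them by a random factor.

```python
import math
import random


def batch_norm_1d(xs, gamma, beta, eps=1e-5):
    """Normalize one channel over a mini-batch, then apply gamma and beta."""
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)  # biased variance
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in xs]


def std(xs):
    """Population standard deviation of a list of floats."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


rng = random.Random(0)
batch = [rng.gauss(3.0, 2.0) for _ in range(256)]  # illustrative data

# New default: gamma = 1, beta = 0 leaves the normalized output unscaled.
out_ones = batch_norm_1d(batch, gamma=1.0, beta=0.0)

# Old default: gamma sampled from U(0, 1) scales the output down.
gamma_old = rng.uniform(0.0, 1.0)
out_uniform = batch_norm_1d(batch, gamma=gamma_old, beta=0.0)

print("std with gamma = 1:      ", round(std(out_ones), 3))
print("std with gamma ~ U(0, 1):", round(std(out_uniform), 3))
```

In this sketch the output standard deviation tracks gamma, so under the old default each channel started with a different random scale.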
Pull Request resolved: https://github.com/pytorch/pytorch/pull/12325
Differential Revision: D10203439
Pulled By: SsnL
fbshipit-source-id: 999cc134a45e2554313adb7eb93ee98e1f84335f
diff --git a/torch/nn/modules/batchnorm.py b/torch/nn/modules/batchnorm.py
index deb280a..393e44b 100644
--- a/torch/nn/modules/batchnorm.py
+++ b/torch/nn/modules/batchnorm.py
@@ -43,7 +43,7 @@
     def reset_parameters(self):
         self.reset_running_stats()
         if self.affine:
-            init.uniform_(self.weight)
+            init.ones_(self.weight)
             init.zeros_(self.bias)
 
     def _check_input_dim(self, input):
@@ -97,8 +97,8 @@
The mean and standard-deviation are calculated per-dimension over
the mini-batches and :math:`\gamma` and :math:`\beta` are learnable parameter vectors
- of size `C` (where `C` is the input size). By default, the elements of :math:`\gamma` are sampled
- from :math:`\mathcal{U}(0, 1)` and the elements of :math:`\beta` are set to 0.
+ of size `C` (where `C` is the input size). By default, the elements of :math:`\gamma` are set
+ to 1 and the elements of :math:`\beta` are set to 0.
Also by default, during training this layer keeps running estimates of its
computed mean and variance, which are then used for normalization during
@@ -169,8 +169,8 @@
The mean and standard-deviation are calculated per-dimension over
the mini-batches and :math:`\gamma` and :math:`\beta` are learnable parameter vectors
- of size `C` (where `C` is the input size). By default, the elements of :math:`\gamma` are sampled
- from :math:`\mathcal{U}(0, 1)` and the elements of :math:`\beta` are set to 0.
+ of size `C` (where `C` is the input size). By default, the elements of :math:`\gamma` are set
+ to 1 and the elements of :math:`\beta` are set to 0.
Also by default, during training this layer keeps running estimates of its
computed mean and variance, which are then used for normalization during
@@ -241,8 +241,8 @@
The mean and standard-deviation are calculated per-dimension over
the mini-batches and :math:`\gamma` and :math:`\beta` are learnable parameter vectors
- of size `C` (where `C` is the input size). By default, the elements of :math:`\gamma` are sampled
- from :math:`\mathcal{U}(0, 1)` and the elements of :math:`\beta` are set to 0.
+ of size `C` (where `C` is the input size). By default, the elements of :math:`\gamma` are set
+ to 1 and the elements of :math:`\beta` are set to 0.
Also by default, during training this layer keeps running estimates of its
computed mean and variance, which are then used for normalization during