pass learning rate scaling factor to parameter update builder function

Summary:
When the data parallel model was refactored, the division of the learning rate by the number of devices was dropped, so gradients were effectively multiplied by the number of devices. The LR therefore needs to be scaled by 1/numgpus, and this scaling factor is now passed to the parameter update builder function.
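A minimal sketch of the arithmetic (not the Caffe2 API; all names here are illustrative): when each device computes the mean gradient over its shard and allreduce sums the per-device gradients, the summed gradient is numgpus times the full-batch gradient, so scaling the LR by 1/numgpus restores the single-GPU update exactly.

```python
import numpy as np

num_gpus = 4
rng = np.random.default_rng(0)
batch = rng.normal(size=(32, 3))  # total batch, split across devices
w = np.zeros(3)
lr = 0.1

def grad(w, data):
    # Mean gradient of a toy least-squares loss ||data @ w - 1||^2 / n
    residual = data @ w - 1.0
    return 2.0 * data.T @ residual / len(data)

# Single-GPU reference step: gradient over the whole batch.
ref = w - lr * grad(w, batch)

# Data-parallel step: each device takes the mean gradient over its
# shard; summing the shards (allreduce) yields num_gpus times the
# full-batch mean gradient, so the LR must be scaled by 1/num_gpus.
shards = np.split(batch, num_gpus)
summed = sum(grad(w, s) for s in shards)
par = w - (lr / num_gpus) * summed

assert np.allclose(ref, par)
```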

Added a test to confirm that data_parallel_model produces exactly the same results for different numbers of GPUs, given the same total batch size.
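In the same spirit, a hypothetical version of that consistency check (reusing the toy grad() from the sketch above rather than the real data_parallel_model harness) would assert that one update step is identical for any device count that divides the batch:

```python
def parallel_step(w, batch, lr, num_gpus):
    # One scaled-LR data-parallel SGD step, as in the sketch above.
    shards = np.split(batch, num_gpus)
    summed = sum(grad(w, s) for s in shards)
    return w - (lr / num_gpus) * summed

single = parallel_step(np.zeros(3), batch, 0.1, num_gpus=1)
for n in (2, 4, 8):
    assert np.allclose(parallel_step(np.zeros(3), batch, 0.1, n), single)
```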

Reviewed By: prigoyal

Differential Revision: D4248907

fbshipit-source-id: af21ede113e6ac25f12c556de298cb18974548be