commit | 3410939459ac3f59d754434445ac217cb94136cd
---|---
author | Aapo Kyrola <akyrola@fb.com>, Wed Nov 30 15:06:32 2016 -0800
committer | Bram Wasti <bwasti@dev11999.prn1.facebook.com>, Mon Dec 05 11:53:26 2016 -0800
tree | 0d8c61aeb90f2e71094394d37a189e2c07fa453e
parent | a3942b2d64c2d47842852555860044077869ab8f
pass learning rate scaling factor to parameter update builder function

Summary: When the data parallel model was refactored, the division of the learning rate by the number of devices was dropped, so gradients were effectively multiplied by the number of devices. The learning rate therefore needs to be scaled by 1/num_gpus. Added a test confirming that data_parallel_model produces exactly the same results on different numbers of GPUs, given the same total batch size.

Reviewed By: prigoyal

Differential Revision: D4248907

fbshipit-source-id: af21ede113e6ac25f12c556de298cb18974548be
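To see why the scaling matters: when per-device gradients are summed across N replicas, each update step is N times larger than the single-device step over the same total batch, unless the learning rate is divided by N. Below is a minimal NumPy sketch of that arithmetic, not the Caffe2 implementation; the helper `grad` and names such as `num_devices` are hypothetical, chosen only to illustrate the equivalence the commit's test checks.

```python
# Illustrative sketch of the LR-scaling issue described in the commit message.
# Assumption: per-device gradients are summed (not averaged) across replicas.
import numpy as np

def grad(w, x, y):
    """Mean-squared-error gradient of a linear model, averaged over the batch."""
    return 2.0 * x.T @ (x @ w - y) / len(x)

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 4))      # total batch of 32 examples
y = rng.normal(size=(32, 1))
w0 = rng.normal(size=(4, 1))
lr = 0.1

# Single-device update over the full batch.
w_single = w0 - lr * grad(w0, x, y)

# Data-parallel update: split the batch across devices, compute per-device
# gradients, and sum them across replicas.
num_devices = 4
g_sum = sum(grad(w0, xs, ys)
            for xs, ys in zip(np.split(x, num_devices),
                              np.split(y, num_devices)))

# Without scaling, the summed gradient is num_devices times too large...
w_unscaled = w0 - lr * g_sum
# ...so scaling the learning rate by 1/num_devices restores equivalence.
w_scaled = w0 - (lr / num_devices) * g_sum

assert np.allclose(w_scaled, w_single)        # matches single-device result
assert not np.allclose(w_unscaled, w_single)  # unscaled update overshoots
```

This is the same invariant the commit's test asserts at the framework level: with the total batch size held fixed, runs on different numbers of GPUs should produce identical parameters.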
Caffe2 is a deep learning framework made with expression, speed, and modularity in mind. It is an experimental refactoring of Caffe that allows a more flexible way to organize computation.
Read the installation instructions for details.
Caffe2 is released under the BSD 2-Clause license.