Option to use NCCL for broadcast

Summary:
Fixes some performance issues when `broadcast_computed_params=True` is passed to `Parallelize_GPU`. NCCL broadcast is enabled via the same `use_nccl` flag as AllReduce.
Closes https://github.com/caffe2/caffe2/pull/630

Differential Revision: D5149828

Pulled By: akyrola

fbshipit-source-id: 12c9714c7fa078811f1cde61c8523dca8f7f968f
1 file changed
tree: 80b3a7f77c17c4becde5dd3a70baf99bbabe2c77
  1. .travis/
  2. caffe/
  3. caffe2/
  4. cmake/
  5. docs/
  6. scripts/
  7. third_party/
  8. .Doxyfile
  9. .Doxyfile-c
  10. .Doxyfile-python
  11. .gitignore
  12. .gitmodules
  13. .travis.yml
  14. appveyor.yml
  15. CMakeLists.txt
  16. LICENSE
  17. Makefile
  18. PATENTS
  19. README.md
  20. release-notes.md
README.md

Caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.

Questions and Feedback

Please use GitHub issues (https://github.com/caffe2/caffe2/issues) to ask questions, report bugs, and request new features.

Please participate in our survey (https://www.surveymonkey.com/r/caffe2). We will send you information about new releases and special developer events/webinars.

License and Citation

Caffe2 is released under the BSD 2-Clause license.

Build Status

Travis: Build Status
Windows: Build status

Detailed build matrix (refresh the page if the status icons do not load; they are served through Heroku):

Target        Status
Linux         Build Linux
Mac (CPU)     Build Mac
Android       Build Android
iOS           Build iOS
Linux + MKL   Build LinuxMKL
Windows       Build status

Further Resources on Caffe2.ai