a couple small reliability improvements

Summary:
A couple of more misc changes:
- allow starting the coordinator multiple times -- this makes data parallel programming easier
- make the fetcher id a global sequence, before each gpu had same ids for workers
- my flow jobs got stuck when joining the fetcher threads. I think there is actually a memory fencing problem with the is_active boolean. But I am too tired to add proper condition variables there. Instead just add timeout to join(). It is needed anyway since some i/o thread could get blocked.

Differential Revision: D4333381

fbshipit-source-id: 88226c8a9c9a5e05d771360a502a2ba21a6b9d76
2 files changed
tree: 26181463481fe387661c9474a0f5d4265c53b052
  1. caffe/
  2. caffe2/
  3. docs/
  4. third_party/
  5. .Doxyfile
  6. .gitignore
  7. .gitmodules
  8. build.py
  9. build_android.py
  10. build_android_prepare.py
  11. LICENSE
  12. Makefile
  13. README.md
README.md

Caffe2

Caffe2 is a deep learning framework made with expression, speed, and modularity in mind. It is an experimental refactoring of Caffe, and allows a more flexible way to organize computation.

Read the installation instructions for installation details.

License and Citation

Caffe2 is released under the BSD 2-Clause license.