Fix RecurrentNetworkGradient with batch size > 1

Summary:
Fix RecurrentNetworkGradient with batch size > 1.
The main issue was that we always set the gradient output shape to (1, 1, recurrent_size), which mismatches the input shape (1, batch_size, recurrent_size).
Downstream gradient ops perform Squeeze and Split assuming the output gradient blob has the same shape as the input, so they fail.

The fix is simply to resize the output to match the input shape (1, batch_size, recurrent_size). I had to move the Resize call into RunOnDevice, since batch_size is computed from Input(0), which is not available until the op actually runs.
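The pattern behind the fix can be sketched in standalone C++ below. This is not the actual Caffe2 implementation; `Tensor`, `Resize`, and `RunOnDevice` here are minimal hypothetical stand-ins showing the idea of deriving the gradient output's shape from the runtime input instead of hard-coding a batch size of 1:

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a Caffe2-style tensor: dims plus flat storage.
struct Tensor {
  std::vector<size_t> dims;
  std::vector<float> data;
  void Resize(std::vector<size_t> d) {
    dims = std::move(d);
    size_t n = 1;
    for (size_t v : dims) n *= v;
    data.assign(n, 0.0f);
  }
};

// Sketch of the fix: size the gradient output inside RunOnDevice, where
// the real batch_size is known from the input, rather than assuming
// (1, 1, recurrent_size) at op-construction time.
bool RunOnDevice(const Tensor& input, Tensor* grad_output) {
  // input is (1, batch_size, recurrent_size); mirror its shape exactly
  // so downstream Squeeze/Split see matching dimensions.
  grad_output->Resize(input.dims);
  return true;
}
```

With a batch size of 4, the gradient output now comes out as (1, 4, recurrent_size) instead of (1, 1, recurrent_size), which is the shape mismatch the commit fixes.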

Differential Revision: D4301487

fbshipit-source-id: e5c7426d6e770d985ce72a3737381a2b4af333ba
1 file changed
README.md

Caffe2

Caffe2 is a deep learning framework made with expression, speed, and modularity in mind. It is an experimental refactoring of Caffe, and allows a more flexible way to organize computation.

Read the installation instructions for details.

License and Citation

Caffe2 is released under the BSD 2-Clause license.