commit | ee1f21a53e499a4804a9ed5a3274d03edbaecb03 | |
---|---|---|
author | Aapo Kyrola <akyrola@fb.com> | Tue Jun 27 15:04:29 2017 -0700 |
committer | Facebook Github Bot <facebook-github-bot@users.noreply.github.com> | Tue Jun 27 15:27:28 2017 -0700 |
tree | 81c5919c1abac490804b0875e7ccf139a1daf8f7 | |
parent | 81f539a28307aca4d92feb3843f7a650138ab45c | |

fix perf bug in TransposeOp for CUDA

Summary: It was allocating TensorCPU always, so causing mutex to be acquired in PinnedCPUAllocator. Not much impact as everyone should use the CUDNN transpose, but good to fix anyway.

Reviewed By: jamesr66a

Differential Revision: D5332858

fbshipit-source-id: 287643df623b7cd59ab1028ed8b2ed1d3c1da44e

Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.
Caffe2 research award competition request for proposals
Please use GitHub issues (https://github.com/caffe2/caffe2/issues) to ask questions, report bugs, and request new features.
Please participate in our survey (https://www.surveymonkey.com/r/caffe2). We will send you information about new releases and special developer events/webinars.
Caffe2 is released under the BSD 2-Clause license.