commit | c535b8098f84bd3840740ef02748350d9fbb6eee | |
---|---|---|
author | Andrew Tulloch <tulloch@fb.com> | Mon Aug 21 12:42:03 2017 -0700 |
committer | Facebook Github Bot <facebook-github-bot@users.noreply.github.com> | Mon Aug 21 12:46:57 2017 -0700 |
tree | d3926878d28f9f0ad3aeb855f06d5212b4a92791 | |
parent | 77c28b7a7cf5dc7a29da209703f6bf90f8338f41 | |
Add a HPTT path in transpose_op.cc

Summary: https://arxiv.org/abs/1704.04374 is a simple, stateless library that implements a high performance tensor transposition abstraction - it's substantially faster than what we have. I think instead of going through an engine specialization on the CPU side, we can just add this path, since there's no value (in terms of state management, etc) for having it separate? We could cache the plan, but it's so cheap to create in these tests.

Reviewed By: jonmorton

Differential Revision: D5534519

fbshipit-source-id: de2fd64fee11be259656b0f02f42a62b7035e3d3
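For context on what a transposition path computes, the following is a minimal sketch of a naive reference transpose for a dense row-major tensor. It is not the code in this commit and does not use HPTT's API. The function name `NaiveTranspose` is hypothetical, and the convention that output axis `i` takes input axis `perm[i]` is an assumption made for the sketch. An optimized library would produce the same result with a blocked, cache-aware traversal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference implementation (not from the commit, not HPTT).
// Transposes a dense row-major tensor so that output axis i is input
// axis perm[i].
template <typename T>
std::vector<T> NaiveTranspose(
    const std::vector<T>& in,
    const std::vector<std::size_t>& dims,
    const std::vector<std::size_t>& perm) {
  const std::size_t n = dims.size();
  assert(perm.size() == n);

  // Row-major strides of the input tensor.
  std::vector<std::size_t> in_strides(n, 1);
  for (std::size_t i = n; i-- > 1;) {
    in_strides[i - 1] = in_strides[i] * dims[i];
  }

  // Output shape, and the input stride to step by for each output axis.
  std::vector<std::size_t> out_dims(n), src_strides(n);
  std::size_t total = 1;
  for (std::size_t i = 0; i < n; ++i) {
    out_dims[i] = dims[perm[i]];
    src_strides[i] = in_strides[perm[i]];
    total *= out_dims[i];
  }
  assert(in.size() == total);

  // Walk the output linearly while tracking the matching input offset
  // with an odometer-style multi-index.
  std::vector<T> out(total);
  std::vector<std::size_t> idx(n, 0);
  std::size_t src = 0;
  for (std::size_t o = 0; o < total; ++o) {
    out[o] = in[src];
    for (std::size_t a = n; a-- > 0;) {
      src += src_strides[a];
      if (++idx[a] < out_dims[a]) break;
      // This axis wrapped: rewind it and carry into the next axis.
      src -= src_strides[a] * out_dims[a];
      idx[a] = 0;
    }
  }
  return out;
}
```

Calling `NaiveTranspose<int>({1, 2, 3, 4, 5, 6}, {2, 3}, {1, 0})` treats the input as a 2x3 matrix and returns its 3x2 transpose. A reference like this is mainly useful for checking a faster path against known-correct output.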
Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.
Caffe2 research award competition request for proposals
Please use GitHub issues (https://github.com/caffe2/caffe2/issues) to ask questions, report bugs, and request new features.
Please participate in our survey (https://www.surveymonkey.com/r/caffe2). We will send you information about new releases and special developer events/webinars.
Caffe2 is released under the BSD 2-Clause license.