commit | 631971e459502aba2e7fe9e73c2380495d7fe511 | |
---|---|---|
author | Aapo Kyrola <akyrola@fb.com> | Wed Sep 06 12:04:45 2017 -0700 |
committer | Facebook Github Bot <facebook-github-bot@users.noreply.github.com> | Wed Sep 06 12:26:30 2017 -0700 |
tree | a1073c4e2f447009bba10c75fa36d8f2a49e7450 | |
parent | ff38bbfe2cdb9353853a4206a5c5eaeb99e1013a | |

threaded RNN executor for CPU, multi-stream executor for CUDA

Summary:
Special executor for RNNs which can exploit parallelism over timesteps. For CPU we use multi-threading, achieving an improvement of 3x or so on 4-layer LSTMs. With CUDA, perf improvements are more modest, but the structure allows for optimizing it further. For CUDA, we use multiple streams and events if there is parallelism over timesteps. In my experiments, it was not good to use more than 2 streams, though.

Flag --caffe2_rnn_executor can be used to switch the executor off.

Reviewed By: salexspb

Differential Revision: D5749304

fbshipit-source-id: d6f76b3e16598be5b4e8188aff031671ebafaa4c

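
To illustrate what "parallelism over timesteps" can mean for a stacked RNN, here is a minimal Python sketch. It is not the Caffe2 executor. It assumes that the cell at (layer, timestep) depends only on the cell one layer below at the same timestep and on the same layer's cell at the previous timestep. Under that assumption all cells with the same `layer + timestep` are independent and can run concurrently. The names `wavefronts`, `run_stacked_rnn`, and `cell_fn` are hypothetical and introduced only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def wavefronts(num_layers, num_timesteps):
    """Group (layer, timestep) cells into waves of mutually independent cells.

    Assumes cell (l, t) depends only on (l - 1, t) and (l, t - 1), so every
    cell with the same l + t can be computed at the same time.
    """
    waves = []
    for k in range(num_layers + num_timesteps - 1):
        wave = [(l, k - l) for l in range(num_layers)
                if 0 <= k - l < num_timesteps]
        waves.append(wave)
    return waves


def run_stacked_rnn(cell_fn, inputs, num_layers, num_threads=4):
    """Run a stacked RNN wave by wave, with each wave spread over threads.

    cell_fn(layer, below, prev) is a hypothetical user-supplied cell:
    `below` is the input from the layer underneath (or the raw input for
    layer 0) and `prev` is this layer's previous output (None at t == 0).
    """
    num_timesteps = len(inputs)
    out = {}  # (layer, timestep) -> cell output

    def run_cell(pos):
        layer, t = pos
        below = inputs[t] if layer == 0 else out[(layer - 1, t)]
        prev = out.get((layer, t - 1))
        return pos, cell_fn(layer, below, prev)

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        for wave in wavefronts(num_layers, num_timesteps):
            # Wait for the whole wave before publishing its results, so
            # the next wave only ever reads completed outputs.
            results = list(pool.map(run_cell, wave))
            out.update(results)

    # Outputs of the top layer, one per timestep.
    return [out[(num_layers - 1, t)] for t in range(num_timesteps)]
```

With 4 layers, up to 4 cells are in flight per wave, which bounds the available speedup in this scheme. In pure Python the threads only help when `cell_fn` releases the global interpreter lock (as many numerical libraries do), so the sketch shows the scheduling idea rather than a performance claim.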
Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.
Caffe2 research award competition request for proposals
Please use GitHub issues (https://github.com/caffe2/caffe2/issues) to ask questions, report bugs, and request new features.
Please participate in our survey (https://www.surveymonkey.com/r/caffe2). We will send you information about new releases and special developer events/webinars.
Caffe2 is released under the BSD 2-Clause license.