Implement TopKOp for GPU

Summary:
This is a real implementation (not GPUFallbackOp) of the TopKOp for GPU.

There are two algorithm implementations:

- For k <= 512, it maps to a warp-wide min-heap implementation, which requires only a single scan of the input data.
- For k > 512, it maps to a multi-pass radix selection algorithm that I originally wrote in cutorch. I took the recent cutorch code and removed cutorch-specific parts where that made sense.
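
For readers unfamiliar with the two strategies, here is a minimal CPU-only sketch of each idea. It is not the code from this change: the function names (topk_heap, radix_select_kth_largest, topk_radix) are hypothetical, the radix variant assumes unsigned 32-bit keys and an 8-bit digit per pass, and none of the warp-level parallelism is modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <queue>
#include <vector>

// Single-scan top-k: keep the k largest values seen so far in a min-heap,
// so the heap root is the current admission threshold.
std::vector<float> topk_heap(const std::vector<float>& data, std::size_t k) {
  std::priority_queue<float, std::vector<float>, std::greater<float>> heap;
  for (float v : data) {
    if (heap.size() < k) {
      heap.push(v);
    } else if (k > 0 && v > heap.top()) {
      heap.pop();  // evict the smallest of the current top k
      heap.push(v);
    }
  }
  std::vector<float> out(heap.size());
  // The heap pops in ascending order, so fill the result from the back.
  for (std::size_t i = out.size(); i-- > 0;) {
    out[i] = heap.top();
    heap.pop();
  }
  return out;  // descending order
}

// Multi-pass radix selection: returns the k-th largest key (k is 1-based).
// Each pass histograms one 8-bit digit, most significant first, over the
// keys that still match the digits chosen in earlier passes.
std::uint32_t radix_select_kth_largest(const std::vector<std::uint32_t>& keys,
                                       std::size_t k) {
  assert(k >= 1 && k <= keys.size());
  std::uint32_t prefix = 0;  // digits fixed so far
  std::uint32_t mask = 0;    // which bits of prefix are fixed
  for (int shift = 24; shift >= 0; shift -= 8) {
    std::size_t counts[256] = {};
    for (std::uint32_t key : keys) {
      if ((key & mask) == prefix) ++counts[(key >> shift) & 0xFFu];
    }
    // Walk buckets from the largest digit down until one holds the k-th key.
    for (int digit = 255; digit >= 0; --digit) {
      if (counts[digit] >= k) {
        prefix |= static_cast<std::uint32_t>(digit) << shift;
        mask |= 0xFFu << shift;
        break;
      }
      k -= counts[digit];
    }
  }
  return prefix;
}

// Top-k via selection: take every key above the k-th largest, then pad
// with copies of the k-th largest to account for ties. Output is unsorted.
std::vector<std::uint32_t> topk_radix(const std::vector<std::uint32_t>& keys,
                                      std::size_t k) {
  std::vector<std::uint32_t> out;
  if (k == 0 || keys.empty()) return out;
  if (k > keys.size()) k = keys.size();
  const std::uint32_t kth = radix_select_kth_largest(keys, k);
  for (std::uint32_t key : keys) {
    if (key > kth) out.push_back(key);
  }
  out.resize(k, kth);
  return out;
}
```

The heap version reads the input once and keeps only k values of state, which suits small k. The selection version reads the input once per digit pass plus once to gather results, but its working state (a 256-entry histogram) does not grow with k.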

Also added several utility files used by one implementation or the other, some from the Faiss library and some from the cutorch library.

Reviewed By: jamesr66a

Differential Revision: D5248206

fbshipit-source-id: ae5fa3451473264293516c2838f1f40688781cf3
12 files changed
tree: 105e58826c2e0d80e9d7c55ecd0a64c662d2b7ca
  1. .travis/
  2. caffe/
  3. caffe2/
  4. cmake/
  5. docs/
  6. scripts/
  7. third_party/
  8. .Doxyfile
  9. .Doxyfile-c
  10. .Doxyfile-python
  11. .gitignore
  12. .gitmodules
  13. .travis.yml
  14. appveyor.yml
  15. CMakeLists.txt
  16. LICENSE
  17. Makefile
  18. PATENTS
  19. README.md
  20. release-notes.md
README.md

Caffe2


Caffe2 is a lightweight, modular, and scalable deep learning framework. Building on the original Caffe, Caffe2 is designed with expression, speed, and modularity in mind.

Events

Caffe2 Bay Area Meetup at NVIDIA, May 31 6-8:30, Santa Clara, CA: https://www.meetup.com/Caffe2-Bay-Area/events/239836290/

User Groups

Caffe2 Community Facebook Group: join to ask questions, talk to other users, and keep informed of important Caffe2 updates.

Questions and Feedback

Please use GitHub issues (https://github.com/caffe2/caffe2/issues) to ask questions, report bugs, and request new features.

Please participate in our survey (https://www.surveymonkey.com/r/caffe2). We will send you information about new releases and special developer events/webinars.

License and Citation

Caffe2 is released under the BSD 2-Clause license.

Further Resources on Caffe2.ai