commit | c691fc6dc711814a06107d4a9b763f34bff5afca | |
---|---|---|
author | Christian Sarofeen <csarofeen@nvidia.com> | Sun Jul 02 21:39:40 2017 -0700 |
committer | Soumith Chintala <soumith@gmail.com> | Mon Jul 03 00:39:40 2017 -0400 |
tree | 95d7fc53d2fd1bcb75c3131ae1097116e1e02c29 | |
parent | 42cf68b4028252b5ac9b3c93a6111e5ef9c6cd9b | |

Add a nonContigDim reduction kernel to improve latency for small tensors. (#768)