Speed up row-major matrix-vector product on ARM

The row-major matrix-vector multiplication code uses a threshold to
check if processing 8 rows at a time would thrash the cache.

This change introduces two modifications to this logic.

1. A smaller threshold for ARM and ARM64 devices.

The value of this threshold was determined empirically using a Pixel 2
phone, by benchmarking a large number of matrix-vector products in the
range [1..4096]x[1..4096] and measuring performance separately on
big and little cores with frequency pinning.

On big (out-of-order) cores, this change has little to no impact. On
the little (in-order) cores, however, matrix-vector products are up to
700% faster, especially for large matrices.

The motivation for this change was some internal code at Google which
was using hand-written NEON for implementing similar functionality,
processing the matrix one row at a time, which exhibited substantially
better performance than Eigen.

With the current change, Eigen handily beats that code.

2. Make the logic for choosing the number of simultaneous rows apply
uniformly to 8, 4 and 2 rows instead of just 8 rows.

Since the default threshold for non-ARM devices is essentially
unchanged (32000 -> 32 * 1024), this change has no impact on non-ARM
performance. This was verified by running the same set of benchmarks
on a Xeon desktop.
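
The row-selection logic described above can be sketched roughly as
follows. This is an illustrative sketch, not Eigen's actual code: the
helper name simultaneous_rows, the use of the row size in bytes as the
quantity compared against the threshold, and the fallback to one row at
a time are all assumptions. Only the non-ARM value 32 * 1024 comes from
this message; the ARM value is passed in as a parameter because it is
not stated here.

```cpp
#include <cstddef>

// Non-ARM default threshold, as stated in the commit message.
constexpr std::size_t kDefaultThresholdBytes = 32 * 1024;

// Hypothetical helper: returns how many rows to process at once
// (8, 4, 2 or 1).
// Assumption: the same cache check gates every block size, so a row
// that is wider than the threshold falls back to one row at a time.
inline int simultaneous_rows(std::size_t row_bytes, std::size_t rows,
                             std::size_t threshold_bytes) {
  if (row_bytes > threshold_bytes) {
    return 1;  // rows too wide: interleaving several would thrash the cache
  }
  const int block_sizes[] = {8, 4, 2};
  for (int block : block_sizes) {
    // Pick the largest block size that the matrix has enough rows for.
    if (rows >= static_cast<std::size_t>(block)) {
      return block;
    }
  }
  return 1;  // a single row is left
}
```

Under these assumptions, a 100-row matrix whose rows occupy 16 KiB would
use 8 rows at a time with the non-ARM default, while a smaller threshold
would make the same matrix fall back to a single row at a time.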
1 file changed
tree: cd324f6a7c070c2359b403f8d4867fd86b65a99b
  1. bench/
  2. blas/
  3. cmake/
  4. debug/
  5. demos/
  6. doc/
  7. Eigen/
  8. failtest/
  9. lapack/
  10. scripts/
  11. test/
  12. unsupported/
  13. .hgeol
  14. .hgignore
  15. CMakeLists.txt
  16. COPYING.BSD
  17. COPYING.GPL
  18. COPYING.LGPL
  19. COPYING.MINPACK
  20. COPYING.MPL2
  21. COPYING.README
  22. CTestConfig.cmake
  23. CTestCustom.cmake.in
  24. eigen3.pc.in
  25. INSTALL
  26. README.md
  27. signature_of_eigen3_matrix_library
README.md

Eigen is a C++ template library for linear algebra: matrices, vectors, numerical solvers, and related algorithms.

For more information go to http://eigen.tuxfamily.org/.

For pull requests, please use only the official repository at https://bitbucket.org/eigen/eigen.

For bug reports and feature requests go to http://eigen.tuxfamily.org/bz.