| commit | 9a5cbfbc087210eabfac5b0c2d72d12852ac56ae | |
|---|---|---|
| author | Salome Thirot <salome.thirot@arm.com> | Fri Feb 03 11:00:19 2023 +0000 |
| committer | Salome Thirot <salome.thirot@arm.com> | Mon Feb 06 15:54:57 2023 +0000 |
| tree | f9df00cbd19bd15268456ba81bc1db1d4afc4fa6 | |
| parent | e3028ddbb408381601ab8d2c67be37124a9726e5 | |

Optimize Neon implementation of high bitdepth avg SAD functions

Optimizations take a similar form to those implemented for standard bitdepth averaging SAD:

- Use ABD, UADALP instead of ABAL, ABAL2 (double the throughput on modern out-of-order Arm-designed cores).
- Use more accumulator registers to make better use of Neon pipeline resources on Arm CPUs that have four Neon pipes.

Change-Id: I75c5f09948f6bf17200f82e00e7a827a80451108
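
The two optimizations named in the commit message can be illustrated with a minimal sketch. This is not the code from the commit: the function names (`highbd_avg_sad_c`, `highbd_avg_sad_neon_sketch`, `highbd_avg_sad`) are hypothetical, the input is treated as one flat row whose length is a multiple of 16, and samples are assumed to be at most 12 bits wide. The Neon path uses `vabdq_u16` (absolute difference) followed by `vpadalq_u16` (UADALP, pairwise add and accumulate long) in place of a widening absolute-difference-accumulate, and keeps two independent accumulators so that consecutive accumulations do not depend on each other.

```c
#include <stdint.h>
#include <stdlib.h>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Scalar reference: SAD between src and the rounded average of ref and pred.
static uint32_t highbd_avg_sad_c(const uint16_t *src, const uint16_t *ref,
                                 const uint16_t *pred, int n) {
  uint32_t sad = 0;
  for (int i = 0; i < n; i++) {
    int avg = (ref[i] + pred[i] + 1) >> 1;  // rounding average
    sad += (uint32_t)abs(src[i] - avg);
  }
  return sad;
}

#if defined(__aarch64__)
// Hypothetical Neon sketch. Assumes n is a multiple of 16.
static uint32_t highbd_avg_sad_neon_sketch(const uint16_t *src,
                                           const uint16_t *ref,
                                           const uint16_t *pred, int n) {
  // Two independent 32-bit accumulators break the dependency chain.
  uint32x4_t sum0 = vdupq_n_u32(0);
  uint32x4_t sum1 = vdupq_n_u32(0);

  for (int i = 0; i < n; i += 16) {
    uint16x8_t s0 = vld1q_u16(src + i);
    uint16x8_t r0 = vld1q_u16(ref + i);
    uint16x8_t p0 = vld1q_u16(pred + i);
    uint16x8_t avg0 = vrhaddq_u16(r0, p0);  // (r + p + 1) >> 1 per lane
    uint16x8_t d0 = vabdq_u16(s0, avg0);    // absolute difference, 16-bit
    sum0 = vpadalq_u16(sum0, d0);           // UADALP: pairwise widen + add

    uint16x8_t s1 = vld1q_u16(src + i + 8);
    uint16x8_t r1 = vld1q_u16(ref + i + 8);
    uint16x8_t p1 = vld1q_u16(pred + i + 8);
    uint16x8_t avg1 = vrhaddq_u16(r1, p1);
    uint16x8_t d1 = vabdq_u16(s1, avg1);
    sum1 = vpadalq_u16(sum1, d1);
  }

  // Combine the accumulators and reduce across lanes once at the end.
  return vaddvq_u32(vaddq_u32(sum0, sum1));
}
#endif

// Dispatch: use the Neon sketch on AArch64, the scalar reference elsewhere.
static uint32_t highbd_avg_sad(const uint16_t *src, const uint16_t *ref,
                               const uint16_t *pred, int n) {
#if defined(__aarch64__)
  return highbd_avg_sad_neon_sketch(src, ref, pred, n);
#else
  return highbd_avg_sad_c(src, ref, pred, n);
#endif
}
```

With `src[i] = 10 * i`, `ref[i] = 100` and `pred[i] = 101` for 16 samples, the rounded average is 101 in every position and both paths should return 706. The number of accumulators that pays off depends on the target core; two is used here only to keep the sketch short.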