f90d71055be - platform/external/mesa3d

commit	f90d71055be5ff6381479095448a606ec4018b93
author	Ian Romanick <ian.d.romanick@intel.com>	Wed Feb 02 18:49:25 2022 -0800
committer	Marge Bot <emma+marge@anholt.net>	Tue Nov 08 00:02:16 2022 +0000
tree	a81b9af1c5ed14d120bc94c0159fcf97d2de4ce0
parent	9479e3a19b9e08b8525ba8b91a891b8cff03ace3

intel/compiler: Add and use a pass to generate imul_32x16 instructions

Cycle counts improve on Gfx8 and Gfx9 platforms because many
instructions like

    mul(8)          g12<1>D         g10<8,8,1>D     6D

become

    mul(8)          g12<1>D         g10<8,8,1>D     6W

It is the same number of instructions, but the 32x16 multiply is a
little faster.
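
As a rough illustration of why the narrowing above is safe, the sketch
below models both multiplies in plain C. It is not the code from this
commit: imm_fits_in_int16, mul_32x32 and mul_32x16 are hypothetical
names, and the model assumes the 16-bit operand is sign-extended
before the multiply and that only the low 32 bits of the product are
kept.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not from the commit): true when a 32-bit
 * immediate is representable as a signed 16-bit word, so that it could
 * be written as a W operand instead of a D operand. */
static inline bool
imm_fits_in_int16(int32_t imm)
{
   return imm >= INT16_MIN && imm <= INT16_MAX;
}

/* Reference 32x32 multiply that keeps only the low 32 bits.  The math
 * is done in uint32_t so wraparound is well defined in C. */
static inline int32_t
mul_32x32(int32_t a, int32_t b)
{
   return (int32_t)((uint32_t)a * (uint32_t)b);
}

/* Assumed model of a 32x16 multiply: the second source is a signed
 * word that is sign-extended to 32 bits before multiplying. */
static inline int32_t
mul_32x16(int32_t a, int16_t b)
{
   return (int32_t)((uint32_t)a * (uint32_t)(int32_t)b);
}
```

Under those assumptions, whenever imm_fits_in_int16(imm) holds,
mul_32x16(a, (int16_t)imm) and mul_32x32(a, imm) return the same value
for every a, which is why the instruction count stays the same and
only the operand type changes.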

v2: Fix transposed hi and lo in "(hi >= INT16_MIN && lo <= INT16_MAX)".
Noticed by Caio.  Use nir_src_is_const instead of open coding it.
Suggested by Caio.
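
A minimal sketch of the v2 range-check fix, reading the quoted
expression as the transposed (pre-fix) form and assuming lo and hi are
the inclusive lower and upper bounds of a multiplicand's possible
values. Both function names are hypothetical and the code in the pass
itself may be structured differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Intended check: every value in [lo, hi] is a valid signed 16-bit
 * integer, so the lower bound is compared against INT16_MIN and the
 * upper bound against INT16_MAX. */
static inline bool
range_fits_in_int16(int64_t lo, int64_t hi)
{
   return lo >= INT16_MIN && hi <= INT16_MAX;
}

/* Transposed form, kept only for comparison.  It accepts ranges that
 * extend outside the 16-bit limits, e.g. [-100000, 5]. */
static inline bool
range_fits_in_int16_transposed(int64_t lo, int64_t hi)
{
   return hi >= INT16_MIN && lo <= INT16_MAX;
}
```

For the range [-100000, 5] the transposed form returns true while the
intended form returns false, which is the kind of mismatch the v2
change removes.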

Broadwell and Skylake had similar results. (Skylake shown)
total cycles in shared programs: 845748380 -> 845145547 (-0.07%)
cycles in affected programs: 446346348 -> 445743515 (-0.14%)
helped: 6017
HURT: 0
helped stats (abs) min: 2 max: 7380 x̄: 100.19 x̃: 8
helped stats (rel) min: <.01% max: 3.72% x̄: 0.41% x̃: 0.39%
95% mean confidence interval for cycles value: -113.37 -87.00
95% mean confidence interval for cycles %-change: -0.42% -0.41%
Cycles are helped.

Skylake
Cycles in all programs: 8844820715 -> 8828897462 (-0.2%)
Cycles helped: 47914
Cycles hurt: 1

No shader-db or fossil-db changes on any other Intel platform.

Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17718>

4 files changed
