| commit | 48943bb378f50c4af98b338f7f282354a16f9507 | |
|---|---|---|
| author | Frank Barchard <fbarchard@google.com> | Fri Aug 01 16:19:12 2025 -0700 |
| committer | Frank Barchard <fbarchard@google.com> | Mon Aug 04 12:42:50 2025 -0700 |
| tree | 977b64273a383c4b28ebea77ba1b071eace71922 | |
| parent | cdd3bae84818e78466fec1ce954eead8f403d10c | |

Convert8To16 use VPSRLW instead of VPMULHUW for better lunarlake performance

- MCA says old version was 4 cycles and new version is 2.5 cycles/loop
- lunarlake is the only known cpu

mca -mcpu=lunarlake 100 iterations

Was vpmulhu
Iterations: 100
Instructions: 1200
Total Cycles: 426
Total uOps: 1200
Dispatch Width: 8
uOps Per Cycle: 2.82
IPC: 2.82
Block RThroughput: 4.0

Now vpsrlw
Iterations: 100
Instructions: 1200
Total Cycles: 279
Total uOps: 1400
Dispatch Width: 8
uOps Per Cycle: 5.02
IPC: 4.30
Block RThroughput: 2.5

Bug: None
Change-Id: I5a49e1cf1ed3dfb59fe9861a871df9862417c6a6
Reviewed-on: https://chromium-review.googlesource.com/c/libyuv/libyuv/+/6697745
Reviewed-by: richard winterton <rrwinterton@gmail.com>

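The commit message does not say why a shift can stand in for the multiply. One plausible reason, assumed here and not confirmed by the commit, is that the scale factor is a power of two. In that case, the high 16 bits of an unsigned 16x16 product equal a logical right shift. The sketch below is a scalar C model of a single 16-bit lane, not the libyuv AVX2 code. The names `mulhi_u16`, `srl_u16`, and `count_mismatches` are hypothetical helpers introduced only for illustration.

```c
#include <stdint.h>

/* Scalar model of one 16-bit lane of VPMULHUW:
   the high 16 bits of an unsigned 16x16 -> 32-bit product. */
static uint16_t mulhi_u16(uint16_t a, uint16_t b) {
  return (uint16_t)(((uint32_t)a * (uint32_t)b) >> 16);
}

/* Scalar model of one 16-bit lane of VPSRLW:
   logical right shift, where counts above 15 yield zero. */
static uint16_t srl_u16(uint16_t a, unsigned count) {
  return (count > 15) ? 0 : (uint16_t)(a >> count);
}

/* Hypothetical check: for every 16-bit value x and every power-of-two
   scale 2^k (k = 0..15), compare mulhi(x, 2^k) with x >> (16 - k).
   Returns the number of disagreements. */
static int count_mismatches(void) {
  int mismatches = 0;
  for (unsigned k = 0; k < 16; ++k) {
    uint16_t scale = (uint16_t)(1u << k);
    for (uint32_t x = 0; x <= 0xFFFFu; ++x) {
      if (mulhi_u16((uint16_t)x, scale) != srl_u16((uint16_t)x, 16 - k)) {
        ++mismatches;
      }
    }
  }
  return mismatches;
}
```

For example, with a scale of 1024 (2^10), `mulhi_u16(0xFFFF, 1024)` and `srl_u16(0xFFFF, 6)` both give 1023, the largest 10-bit value. The identity holds only for power-of-two scales, so a general scale would still need the multiply.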
libyuv is an open source project that includes YUV scaling and conversion functionality.
See Getting started for instructions on how to get started developing.
You can also browse the docs directory for more documentation.