commit | e7cd80718b04c03d5ce21f13981712704b36fc66 | [log] [tgz] |
---|---|---|
author | Yunqing Wang <yunqingwang@google.com> | Tue Nov 20 16:28:08 2012 -0800 |
committer | Yunqing Wang <yunqingwang@google.com> | Mon Nov 26 09:53:50 2012 -0800 |
tree | 492da7cfccaa2669a8656f9939237465430d96b6 | |
parent | f42e41f2eff366338f8f7b36d5b6f8c9c5a26573 [diff] |
Improve sad3x16 SSE2 function

vp9_sad3x16_sse2() is called heavily in the decoder, where its unaligned reads consume a lot of CPU cycles. When CONFIG_SUBPELREFMV is off, the unaligned offset is 1. In this situation, we can adjust src_ptr to be 4-byte aligned and then do aligned reads. This reduces the read time significantly.

Tests on a 1080p clip showed over 2% decoder performance gain with CONFIG_SUBPELREFMV off.

Change-Id: I953afe3ac5406107933ef49d0b695eafba9a6507
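
The pointer adjustment described above can be illustrated with a small scalar sketch in C. This is not the SSE2 code from this change. The function names sad3x16_ref and sad3x16_aligned are hypothetical. The sketch assumes the source stride is a multiple of 4 and that the byte immediately before src is readable. When src sits 1 byte past a 4-byte boundary, each row is fetched as one aligned 4-byte word starting at src - 1, and the leading byte is ignored.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Reference: byte-wise SAD over a block 3 pixels wide and 16 rows tall. */
static unsigned sad3x16_ref(const uint8_t *src, int src_stride,
                            const uint8_t *ref, int ref_stride) {
  unsigned sad = 0;
  for (int r = 0; r < 16; ++r) {
    for (int c = 0; c < 3; ++c)
      sad += (unsigned)abs(src[c] - ref[c]);
    src += src_stride;
    ref += ref_stride;
  }
  return sad;
}

/* Hypothetical sketch: when src is exactly 1 byte past a 4-byte boundary,
 * step back one byte so every row starts on an aligned address, fetch the
 * whole row as one 4-byte word, and skip the first byte of that word. */
static unsigned sad3x16_aligned(const uint8_t *src, int src_stride,
                                const uint8_t *ref, int ref_stride) {
  /* Fall back unless the offset is 1 and the stride preserves alignment. */
  if (((uintptr_t)src & 3) != 1 || (src_stride & 3) != 0)
    return sad3x16_ref(src, src_stride, ref, ref_stride);

  const uint8_t *row = src - 1; /* now 4-byte aligned */
  unsigned sad = 0;
  for (int r = 0; r < 16; ++r) {
    uint8_t w[4];
    memcpy(w, row, 4); /* one aligned 4-byte fetch per row */
    for (int c = 0; c < 3; ++c)
      sad += (unsigned)abs(w[c + 1] - ref[c]); /* w[0] is the padding byte */
    row += src_stride;
    ref += ref_stride;
  }
  return sad;
}
```

Both functions return the same sum for any input. The aligned variant only changes how the source bytes are fetched, which is where the commit message says the cycles were being spent.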