Optimize x86 long V*V by skipping imul

The algorithm for long multiplication can take advantage of the fact
that we are multiplying a value by itself. Because 1L == 2L and
1H == 2H, the cross term 1L*2H + 2L*1H becomes (2H*1L) + (2H*1L),
so the second imul is replaced by an add of the first product.
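
As a rough illustration of the arithmetic only (not the code generator
touched by this change), the sketch below computes the low 64 bits of a
product from 32-bit halves. The function names mul64_generic and
square64 are hypothetical, chosen for this example.

```c
#include <stdint.h>

/* Generic case: low 64 bits of (hi1:lo1) * (hi2:lo2).
 * The cross term needs two 32-bit multiplies. */
static uint64_t mul64_generic(uint32_t lo1, uint32_t hi1,
                              uint32_t lo2, uint32_t hi2) {
    uint32_t cross = lo1 * hi2 + lo2 * hi1;   /* two multiplies, wraps mod 2^32 */
    uint64_t low = (uint64_t)lo1 * lo2;       /* widening 32x32 -> 64 multiply */
    uint32_t high = (uint32_t)(low >> 32) + cross;
    return ((uint64_t)high << 32) | (uint32_t)low;
}

/* Squaring case: both operands are the same value, so the two
 * cross products are identical and one multiply becomes an add. */
static uint64_t square64(uint32_t lo, uint32_t hi) {
    uint32_t t = hi * lo;                     /* single multiply */
    uint32_t cross = t + t;                   /* add replaces the second multiply */
    uint64_t low = (uint64_t)lo * lo;         /* widening 32x32 -> 64 multiply */
    uint32_t high = (uint32_t)(low >> 32) + cross;
    return ((uint64_t)high << 32) | (uint32_t)low;
}
```

Both functions should return the same value whenever the two operands
of mul64_generic are equal; square64 simply does it with one fewer
32-bit multiply.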

Change-Id: I259a25699a8787badd943318e99bafdd06587ec6
Signed-off-by: Mark Mendell <mark.p.mendell@intel.com>
1 file changed