Opt compiler: Speed up div/rem by constants on arm32 and arm64.

This patch also includes:
1. Add a Java test for div/rem by negative constants.
2. Fix a thumb2 encoding issue where the last operand is
   "reg, shift #amount" in some instructions.
3. Support a simple filter in arm32 assembler test to filter out
   unsupported cases, such as "smull r0, r0, r1, r2".
4. Add smull arm32 assembler test.
5. Add smull/umull thumb2 test.
6. Add a test for the thumb2 encoding issue which is fixed in this
   patch.
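
For illustration only, a minimal Java sketch of the general
multiply-high technique for signed 32-bit division by a constant.
It is not the code this patch emits. The class and method names are
hypothetical, and the magic constants for 7 and -7 are the commonly
published values, assumed here rather than taken from the patch.

```java
// Hypothetical sketch: signed 32-bit div/rem by a constant using a
// multiply-high plus shift instead of a hardware divide.
final class DivByConstSketch {
    // Assumed magic constants for divisors 7 and -7 (shift of 2 for both).
    static final int MAGIC_7 = 0x92492493;
    static final int MAGIC_NEG_7 = 0x6DB6DB6D;
    static final int SHIFT_7 = 2;

    static int divByConst(int n, int divisor, int magic, int shift) {
        // High 32 bits of the signed 64-bit product, i.e. the part a
        // signed long multiply such as smull leaves in its high register.
        int q = (int) (((long) magic * n) >> 32);
        // Correction when the magic constant and divisor differ in sign.
        if (divisor > 0 && magic < 0) {
            q += n;
        } else if (divisor < 0 && magic > 0) {
            q -= n;
        }
        q >>= shift;            // arithmetic shift
        return q + (q >>> 31);  // add 1 if q is negative (round toward zero)
    }

    static int remByConst(int n, int divisor, int magic, int shift) {
        // Remainder from the quotient: n - (n / divisor) * divisor.
        return n - divByConst(n, divisor, magic, shift) * divisor;
    }
}
```

Negative divisors take the "q -= n" correction path, which is the case
the new Java test for negative constants is meant to exercise.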

Change-Id: I1601bc9c38f70f11909f2816fe3ec105a158951e
16 files changed