ART: Add CRC32.updateBytes intrinsic for ARM64

Use the ARMv8 CRC32 instructions to implement
java.util.zip.CRC32.updateBytes(int,byte[],int,int).

The intrinsic is used if the number of bytes to process is less than or
equal to kCRC32UpdateBytesThreshold. If it exceeds
kCRC32UpdateBytesThreshold, the function provided by the core library is
used instead.
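
For illustration only, the size-based dispatch described above can be
sketched in Java as follows. This is not the code in this CL: the
class name, the method names and the threshold value are hypothetical,
and a bitwise software loop stands in for the CRC32 instructions so the
sketch runs on any JVM.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

class Crc32DispatchSketch {
    // Hypothetical value; the real kCRC32UpdateBytesThreshold is defined in ART.
    static final int HYPOTHETICAL_THRESHOLD = 64 * 1024;

    // Stand-in for the intrinsic path: bitwise CRC-32 (reflected polynomial
    // 0xEDB88320), the same checksum java.util.zip.CRC32 computes.
    static int updateBytesBitwise(int crc, byte[] buf, int off, int len) {
        crc = ~crc;
        for (int i = off; i < off + len; i++) {
            crc ^= (buf[i] & 0xFF);
            for (int k = 0; k < 8; k++) {
                // XOR in the polynomial only when the low bit is set.
                crc = (crc >>> 1) ^ (0xEDB88320 & -(crc & 1));
            }
        }
        return ~crc;
    }

    // Stand-in for the core library path, using the public CRC32 API.
    static long checksumLibrary(byte[] buf, int off, int len) {
        CRC32 crc = new CRC32();
        crc.update(buf, off, len);
        return crc.getValue();
    }

    // Short inputs take the "intrinsic" path, longer ones the library path.
    static long checksum(byte[] buf, int off, int len) {
        if (len <= HYPOTHETICAL_THRESHOLD) {
            return updateBytesBitwise(0, buf, off, len) & 0xFFFFFFFFL;
        }
        return checksumLibrary(buf, off, len);
    }

    public static void main(String[] args) {
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        // Standard CRC-32 check value for "123456789": prints cbf43926
        System.out.println(Long.toHexString(checksum(data, 0, data.length)));
    }
}
```

Both paths must return the same checksum for any input, so the choice
of threshold affects performance only, not results.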

Note that CRC32 is an optional feature in ARMv8, so this intrinsic
is enabled only on devices that support the CRC32 instructions.

The CL is based on code from tim.zhang@linaro.org.

Performance improvements (speedup):
array size | Cortex-A53 | Cortex-A57
------------------------------------
128        | 14x        | 20x
256        | 10x        | 14x
512        | 8x         | 11x
1024       | 7x         | 9x
2048       | 6x         | 8x
4096       | 5x         | 7x
8192       | 5x         | 7x
16384      | 5x         | 7x
32768      | 5x         | 7x
65536      | 5x         | 7x

Test: m test-art-target-gtest
Test: m test-art-host-gtest
Test: art/test/testrunner/testrunner.py --target --optimizing --interpreter
Test: art/test/testrunner/testrunner.py --target --jit
Test: art/test/testrunner/testrunner.py --host --optimizing --interpreter
Test: art/test/testrunner/testrunner.py --host --jit
Test: 580-crc32

Change-Id: I0054cea41b5fc3e712e18b0afc7e3eacbf41feb6
11 files changed