Intrinsify System.arraycopy for primitive data types

This patch implements the System.arraycopy intrinsic for
byte and int arrays.

About 14% improvement in the microbenchmarks below:

    // Copies an 8192-element byte array once per repetition.
    public static void time_System_arrayCopy_byte(int reps) {
        byte[] src = new byte[8192];
        for (int rep = 0; rep < reps; ++rep) {
            byte[] dst = new byte[8192];
            System.arraycopy(src, 0, dst, 0, 8192);
        }
    }

    // Copies an 8192-element int array once per repetition.
    public static void time_System_arrayCopy_int(int reps) {
        int[] src = new int[8192];
        for (int rep = 0; rep < reps; ++rep) {
            int[] dst = new int[8192];
            System.arraycopy(src, 0, dst, 0, 8192);
        }
    }

Time for base version:      4057 ms
Time for intrinsic version: 3487 ms
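
For anyone wanting to re-run the comparison, here is a minimal standalone
sketch of a timing harness around the same copy loops. It is not part of
this patch: the class name ArrayCopyBench, the helper names, and the
repetition count are all hypothetical, and the numbers above were not
necessarily produced this way. Each loop returns a checksum so that the
copy has an observable result.

```java
import java.util.Arrays;
import java.util.function.IntToLongFunction;

// Hypothetical harness for illustration only; not part of the patch.
public final class ArrayCopyBench {
    static final int LEN = 8192;

    // Same copy loop as the byte benchmark; returns a checksum of the copies.
    public static long copyBytes(int reps) {
        byte[] src = new byte[LEN];
        Arrays.fill(src, (byte) 1);
        long sum = 0;
        for (int rep = 0; rep < reps; ++rep) {
            byte[] dst = new byte[LEN];
            System.arraycopy(src, 0, dst, 0, LEN);
            sum += dst[LEN - 1];
        }
        return sum;
    }

    // Same copy loop as the int benchmark; returns a checksum of the copies.
    public static long copyInts(int reps) {
        int[] src = new int[LEN];
        Arrays.fill(src, 1);
        long sum = 0;
        for (int rep = 0; rep < reps; ++rep) {
            int[] dst = new int[LEN];
            System.arraycopy(src, 0, dst, 0, LEN);
            sum += dst[LEN - 1];
        }
        return sum;
    }

    // Wall-clock time of one benchmark run, in milliseconds.
    public static long elapsedMs(IntToLongFunction bench, int reps) {
        long start = System.nanoTime();
        bench.applyAsLong(reps);
        return (System.nanoTime() - start) / 1_000_000;
    }

    // Relative improvement of newMs over baseMs, in percent.
    public static double improvementPercent(long baseMs, long newMs) {
        return 100.0 * (baseMs - newMs) / baseMs;
    }

    public static void main(String[] args) {
        int reps = 100_000; // assumed repetition count, not from the patch
        System.out.println("byte: " + elapsedMs(ArrayCopyBench::copyBytes, reps) + " ms");
        System.out.println("int:  " + elapsedMs(ArrayCopyBench::copyInts, reps) + " ms");
        // Improvement implied by the timings reported in this message.
        System.out.println("reported: " + improvementPercent(4057, 3487) + " %");
    }
}
```

Running the same harness on a base build and on a build with the intrinsic,
then passing the two times to improvementPercent, gives a figure comparable
to the one reported above.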

Test: ./art/test/testrunner/testrunner.py --host --optimizing
Signed-off-by: Shalini Salomi Bodapati <shalini.salomi.bodapati@intel.com>
Change-Id: I87aced30330d031efea04554c6fa0c05f84e3bb9
8 files changed