instructions for JIT tail support on ARM

This adds a bunch of instructions we'll need to handle the N < 4 tail
within the JIT code on ARM.

   - ldrb/strb are 1-byte load and stores
   - sub subtracts without setting flags
   - cmp just sets flags (actually just subs with an xzr destination)
   - add b and b.lt, just like b.ne
   - cbz and cbnz... we only need cbz but I accidentally did cbnz first

Once I add support for forward jumps, we'll be able to use these
instructions to restructure the loop to

    entry:
        hoisted setup
    loop:
        if N < 4, jump tail      (cmp N,#4; b.lt tail)
        ... handle 4 values ...
        jump loop                (b loop)
    tail:
        if N == 0, jump end      (cbz N, end)
        ... handle 1 value ...
        jump tail                (b tail)
    end:
        ret

Change-Id: I62d2d190f670f758197a25d99dfde13362189993
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/226828
Reviewed-by: Herb Derby <herb@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
3 files changed