Use entrypoint switching to reduce code size of GcRoot read barrier

Set the read barrier mark register entrypoints to null when the GC
is not marking. The compiler uses this to avoid needing to load the
is_gc_marking boolean.

Code size results on ritzperf CC:
arm32: 13439400 -> 13242792 (-1.5%)
arm64: 16380544 -> 16208512 (-1.05%)

Implemented for arm32 and arm64. TODO: Consider implementing on x86.

Bug: 32638713
Bug: 29516974

Test: test-art-host + run ritzperf
Change-Id: I527ca5dc4cd43950ba43b872d0ac81e1eb5791eb
12 files changed