Optimize the mterp field getter.

Optimize the C++ helper function behind the field getter for speed.
Reduce use of the macro in favor of templated helper methods.
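
For illustration only, a minimal sketch of the macro-to-template
pattern described above. FakeObject, DEFINE_FIELD_GETTER, MacroGetInt
and GetField are hypothetical names invented for this sketch; they are
not the identifiers touched by this change and do not reflect ART's
object layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for an object with raw field storage.
struct FakeObject {
  unsigned char storage[16];
};

// Macro style: every expansion stamps out a separate function body,
// and the body is opaque to most tooling.
#define DEFINE_FIELD_GETTER(Name, Type)                        \
  inline Type Name(const FakeObject& obj, std::size_t offset) { \
    assert(offset + sizeof(Type) <= sizeof(obj.storage));       \
    Type value;                                                 \
    std::memcpy(&value, obj.storage + offset, sizeof(Type));    \
    return value;                                               \
  }

DEFINE_FIELD_GETTER(MacroGetInt, std::int32_t)

// Template style: a single type-checked definition that the compiler
// specializes per field type.
template <typename T>
inline T GetField(const FakeObject& obj, std::size_t offset) {
  assert(offset + sizeof(T) <= sizeof(obj.storage));
  T value;
  std::memcpy(&value, obj.storage + offset, sizeof(T));
  return value;
}
```

Both forms read the same bytes; the templated helper keeps the shared
logic in ordinary C++ so that any remaining macro can stay small.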

This reduces the overhead of non-quickened code from 1.45x to 1.35x
(golem benchmarks on arm64 with quickening manually disabled).

Test: test.py --host
Change-Id: I1904e1edcb14573ac247c552c9b73ae704c57217
1 file changed