Turn "Rm" argument of floating-point functions into an immediate.

This would make it possible to reuse HostRounding intrinsics when
Rm == Dyn and Frm != RMM.

Even if we test these flags for each instruction, it should still be
beneficial, since we avoid the heavy toll of calling a C intrinsic.

This is especially important for FMA, because we cannot have a fully inline
FMA for RMM mode, and compilers tend to use FMA instructions for
optimizations when they are available.
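
A rough sketch of the dispatch this enables, not the actual berberis code.
All names below (Rounding, ToHostRounding, FmaSlowPath, FmaWithHostRounding,
Fmadd) are hypothetical, and the slow path is only a placeholder that does
not emulate RMM. With Rm as a compile-time immediate, static modes need no
runtime check, and Rm == Dyn costs one test of Frm before taking the
host-rounding path:

```cpp
#include <cassert>
#include <cfenv>
#include <cmath>
#include <cstdint>

// Hypothetical names for illustration; not the actual berberis declarations.
// Values follow the RISC-V rm/frm field encoding.
enum Rounding : uint8_t { kRne = 0, kRtz = 1, kRdn = 2, kRup = 3, kRmm = 4, kDyn = 7 };

// Maps a RISC-V rounding mode to a host <cfenv> mode.
// Returns -1 when the host has no equivalent (RMM, reserved values, Dyn).
constexpr int ToHostRounding(uint8_t rm) {
  switch (rm) {
    case kRne: return FE_TONEAREST;
    case kRtz: return FE_TOWARDZERO;
    case kRdn: return FE_DOWNWARD;
    case kRup: return FE_UPWARD;
    default: return -1;
  }
}

// Placeholder for the out-of-line C helper. A real helper would have to
// emulate RMM in software; this stand-in just uses the current host mode.
double FmaSlowPath(double a, double b, double c) { return std::fma(a, b, c); }

// Fast path: switch the host rounding mode around a single host FMA.
double FmaWithHostRounding(int host_mode, double a, double b, double c) {
  assert(host_mode >= 0);
  const int saved_mode = std::fegetround();
  std::fesetround(host_mode);
  // Volatile copies keep the compiler from folding or moving the FMA
  // across the rounding-mode changes.
  volatile double va = a, vb = b, vc = c;
  volatile double result = std::fma(va, vb, vc);
  std::fesetround(saved_mode);
  return result;
}

// kRm is the instruction's rm field, known at decode time (the "immediate").
// frm is the guest's dynamic rounding mode, known only at run time.
template <uint8_t kRm>
double Fmadd(uint8_t frm, double a, double b, double c) {
  if constexpr (kRm == kDyn) {
    // One runtime test of frm; only RMM (or an invalid frm) goes out of line.
    const int host_mode = ToHostRounding(frm);
    if (host_mode < 0) {
      return FmaSlowPath(a, b, c);
    }
    return FmaWithHostRounding(host_mode, a, b, c);
  } else if constexpr (ToHostRounding(kRm) < 0) {
    // Static RMM: decided at compile time, no runtime check at all.
    return FmaSlowPath(a, b, c);
  } else {
    return FmaWithHostRounding(ToHostRounding(kRm), a, b, c);
  }
}
```

For example, Fmadd<kDyn>(frm, a, b, c) with frm == kRup takes the same
host-rounding path as Fmadd<kRup>(frm, a, b, c).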

Bug: 278812060

Test: m berberis_all

Change-Id: Ifc896d502dd0d8a1f4c3935f1756838fe05ad9aa
4 files changed