ARM(64): Implement the isInfinite intrinsics

The initial implementation replaced the HInvoke node in the graph
with several other HIRs, based on the fact that the difference of
two infinities of the same sign is a NaN value, i.e. the nodes
were equivalent to the expression (x - x != x - x) && (x == x),
which performs mostly floating-point operations. That version was
subsequently abandoned in favor of another HIR implementation
using the same algorithm as the current assembly code (with
mostly integer operations), since the latter was faster in some
simple microbenchmarks (isInfinite() in a loop).
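
For illustration, the two formulations can be sketched in Java. The
floating-point version is the expression quoted above. The integer
version is only an assumption about the shape of the algorithm (XOR
with the bit pattern of +Infinity, then a left shift by one to drop
the sign bit); the class and method names are hypothetical and are
not taken from the patch.

```java
public final class IsInfiniteSketch {
    // Floating-point formulation: x - x is NaN when x is infinite or
    // NaN, and NaN != NaN holds; the x == x term rejects NaN inputs.
    static boolean viaSubtraction(double x) {
        return (x - x != x - x) && (x == x);
    }

    // Integer formulation (assumed shape, not copied from the patch):
    // XOR with the +Infinity bit pattern leaves at most the sign bit
    // set for either infinity, and the left shift discards that bit.
    static boolean viaRawBits(double x) {
        long bits = Double.doubleToRawLongBits(x);
        long infinity = 0x7ff0000000000000L;
        return ((bits ^ infinity) << 1) == 0L;
    }

    public static void main(String[] args) {
        double[] samples = {
            0.0, -0.0, 1.5, Double.MAX_VALUE, Double.MIN_VALUE,
            Double.NaN, Double.POSITIVE_INFINITY, Double.NEGATIVE_INFINITY
        };
        // Both sketches should agree with the library method.
        for (double x : samples) {
            boolean expected = Double.isInfinite(x);
            if (viaSubtraction(x) != expected || viaRawBits(x) != expected) {
                throw new AssertionError("mismatch for " + x);
            }
        }
        System.out.println("all samples agree with Double.isInfinite");
    }
}
```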

While the HIR approach had some significant advantages, such as
being architecture-neutral (so all architectures supported by the
compiler benefited from the changes) and potentially enabling
further optimizations, it also had several limitations, the most
important being that it still needed an HInvoke node, which
defeated its purpose. The reason is that the algorithm requires a
raw conversion to an integer that preserves the bit representation
of the value, which does not seem to be expressible in any other
way; in particular, HTypeConversion does something entirely
different.
Another major problem is that MIPS release 6 has specialized
floating-point classification instructions that are used in the
intrinsic implementation, and which the compiler is unable to use
in the general case (e.g. by recognizing a pattern in the graph),
so the HIR approach resulted in a regression on that architecture.
This could be solved by doing architecture-specific optimizations
earlier, but that change is beyond the scope of this patch.
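
The difference between the raw conversion mentioned above and a
numeric type conversion can be shown at the Java level. This is a
minimal sketch, assuming that HTypeConversion models ordinary
primitive casts such as (long) x; the class and method names are
hypothetical.

```java
public final class RawBitsVsConversion {
    // Raw conversion: reinterprets the 64 bits of the double as a long
    // without changing them.
    static long rawBits(double x) {
        return Double.doubleToRawLongBits(x);
    }

    // Numeric conversion: rounds toward zero and saturates, so the bit
    // pattern of the input is not preserved.
    static long numericConversion(double x) {
        return (long) x;
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        System.out.println(Long.toHexString(rawBits(inf)));
        System.out.println(Long.toHexString(numericConversion(inf)));
    }
}
```

Only the first method yields a value that the integer algorithm can
test against the infinity bit pattern.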

There were several other minor issues with the generated code,
such as left shifts not being merged into comparisons on ARM64.
More importantly, on ARM Double.isInfinite() resulted in a
sequence of 14 instructions (compared to 6 in the current
implementation) because a long is stored in a register pair, so
operations such as left shifts require two instructions. This
could be worked around by changing the HIR representation at the
cost of increased code complexity.
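
To illustrate why a register pair makes such operations more
expensive, the sketch below performs a 64-bit left shift by one on
two 32-bit halves, as a 32-bit target has to. It is written in Java
for illustration only and does not reproduce the instruction
sequence emitted by the compiler; the class and method names are
hypothetical.

```java
public final class PairShiftSketch {
    // Shifts a 64-bit value left by one using only 32-bit operations
    // on its low and high halves, then reassembles the result.
    static long shiftLeftByOneInHalves(long value) {
        int lo = (int) value;
        int hi = (int) (value >>> 32);
        // The high half must also receive the bit carried out of the
        // low half, so one logical shift becomes several operations.
        int newHi = (hi << 1) | (lo >>> 31);
        int newLo = lo << 1;
        return ((long) newHi << 32) | (newLo & 0xffffffffL);
    }

    public static void main(String[] args) {
        long bits = Double.doubleToRawLongBits(Double.NEGATIVE_INFINITY);
        System.out.println(shiftLeftByOneInHalves(bits) == (bits << 1));
    }
}
```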

Given all these issues, the final decision was to implement the
intrinsics using the standard architecture-specific approach.
4 files changed