ART: Fix race in on-stack replacement
The expected sequence of events for on-stack replacement is:
1. Method goes warm, triggering enhanced profiling
2. Method goes hot, triggering method compilation
3. Method goes really hot, triggering an osr method compilation
4. Interpreter polls for the existence of an osr entry point,
and transitions to compiled code if found.
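
The first three steps above can be sketched as a mapping from hotness
counter crossings to jit actions. This is only an illustration: the
threshold values and the names kWarmThreshold, kHotThreshold,
kOsrThreshold, JitAction and ActionForSample are hypothetical and are
not taken from the ART sources.

```cpp
#include <cstdint>

// Hypothetical thresholds, chosen only for illustration.
constexpr uint16_t kWarmThreshold = 100;
constexpr uint16_t kHotThreshold = 1000;
constexpr uint16_t kOsrThreshold = 2000;

enum class JitAction { kNone, kStartProfiling, kCompileMethod, kCompileOsr };

// Returns the action triggered when a method's hotness counter moves from
// old_count to new_count. Simplification: if one step crosses several
// thresholds, only the highest one is reported.
JitAction ActionForSample(uint16_t old_count, uint16_t new_count) {
  auto crossed = [&](uint16_t threshold) {
    return old_count < threshold && new_count >= threshold;
  };
  if (crossed(kOsrThreshold)) return JitAction::kCompileOsr;      // step 3
  if (crossed(kHotThreshold)) return JitAction::kCompileMethod;   // step 2
  if (crossed(kWarmThreshold)) return JitAction::kStartProfiling; // step 1
  return JitAction::kNone;
}
```

In this model each action fires only on the sample that crosses its
threshold, which is why a discarded request is not retried on its own.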
We have a race if #2 and #3 happen close together and the osr
method compilation begins before the regular method compilation.
In that case, the jit sees that the method is already being
compiled (it is the osr compilation, but the jit does not
distinguish the two) and discards the normal compilation request.
As a result, only the osr version is compiled; the normal version
never is.
In #4, the MaybeDoOnStackReplacement() check assumes that a normal
version of the compiled method must exist before doing an on-stack
replacement, so it keeps returning false.
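
The failure mode described above can be reproduced in a toy model.
This is a sketch under stated assumptions, not the ART code: NaiveJit,
RequestCompile and FinishCompile are hypothetical names, and only
MaybeDoOnStackReplacement borrows its name from the check mentioned
above (its body here is a stand-in).

```cpp
#include <map>
#include <set>
#include <string>

// Toy model of the pre-fix behavior; not the actual ART implementation.
class NaiveJit {
 public:
  // Returns false if the request is discarded. A single in-flight table
  // keyed by method means an osr compile hides a later normal request.
  bool RequestCompile(const std::string& method, bool osr) {
    return pending_.emplace(method, osr).second;
  }

  // Marks the in-flight compilation of `method` as finished.
  void FinishCompile(const std::string& method) {
    auto it = pending_.find(method);
    if (it == pending_.end()) return;
    (it->second ? osr_code_ : normal_code_).insert(method);
    pending_.erase(it);
  }

  // Stand-in for the check in #4: requires normal code before using osr code.
  bool MaybeDoOnStackReplacement(const std::string& method) const {
    if (normal_code_.count(method) == 0) return false;
    return osr_code_.count(method) != 0;
  }

 private:
  std::map<std::string, bool> pending_;  // method -> is this an osr compile?
  std::set<std::string> normal_code_;
  std::set<std::string> osr_code_;
};
```

If the osr request for a method arrives first, the normal request is
dropped, and the replacement check stays false even after the osr
compile has finished.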
This is why we were seeing sporadic timeout failures of
570-checker-osr when the mterp fast branch profiling was
introduced. The branch profiling performance enhancements
greatly reduced the time between #2 and #3, increasing the likelihood
of losing the race. Further, the new code clamped hotness to avoid
wrap-around. The race existed (and likely occurred) in the previous
version, but because hotness counters were allowed to overflow and
wrap around, you'd eventually hit the threshold a second time and
try again, masking the problem.
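
The masking effect of wrap-around can be illustrated with a small
sketch. It assumes a 16-bit hotness counter and a hypothetical
kThreshold; AddWrapping, AddClamped and CountThresholdCrossings are
illustrative names, not ART functions.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical compilation threshold, for illustration only.
constexpr uint16_t kThreshold = 1000;

// Old behavior in this model: the counter overflows and wraps to zero.
uint16_t AddWrapping(uint16_t count, uint16_t delta) {
  return static_cast<uint16_t>(count + delta);
}

// New behavior in this model: the counter saturates at its maximum value.
uint16_t AddClamped(uint16_t count, uint16_t delta) {
  constexpr uint32_t kMax = std::numeric_limits<uint16_t>::max();
  uint32_t sum = static_cast<uint32_t>(count) + delta;
  return static_cast<uint16_t>(sum > kMax ? kMax : sum);
}

// Counts how often the counter crosses kThreshold from below while
// `samples` increments of size `step` are applied, starting from zero.
template <typename AddFn>
int CountThresholdCrossings(AddFn add, int samples, uint16_t step) {
  uint16_t count = 0;
  int crossings = 0;
  for (int i = 0; i < samples; ++i) {
    uint16_t next = add(count, step);
    if (count < kThreshold && next >= kThreshold) ++crossings;
    count = next;
  }
  return crossings;
}
```

With a wrapping counter, a long enough run crosses the threshold more
than once, so a discarded compilation request gets a second chance.
With a clamped counter there is exactly one crossing, so a lost
request stays lost.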
Tip o' the hat to Serguei Katkov for identifying the problem.
A possible solution (taken in this CL) is to differentiate osr
compilations from normal method compilations.
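
One way to picture that differentiation is to key the in-flight
bookkeeping on (method, is-osr) instead of on the method alone. The
sketch below is a toy model under that assumption; OsrAwareJit and its
member names are hypothetical and do not describe the actual changes
in this CL.

```cpp
#include <set>
#include <string>
#include <utility>

// Toy model: osr and normal compilations are tracked independently.
class OsrAwareJit {
 public:
  using Key = std::pair<std::string, bool>;  // (method, is osr compile)

  // Returns false only if the same kind of compile is already in flight.
  bool RequestCompile(const std::string& method, bool osr) {
    return in_flight_.emplace(method, osr).second;
  }

  // Marks the given kind of compilation of `method` as finished.
  void FinishCompile(const std::string& method, bool osr) {
    if (in_flight_.erase(Key(method, osr)) == 0) return;
    compiled_.emplace(method, osr);
  }

  // Stand-in for the check in #4: needs both normal and osr code.
  bool MaybeDoOnStackReplacement(const std::string& method) const {
    return compiled_.count(Key(method, false)) != 0 &&
           compiled_.count(Key(method, true)) != 0;
  }

 private:
  std::set<Key> in_flight_;
  std::set<Key> compiled_;
};
```

In this model a normal request is accepted even while an osr compile
of the same method is in flight, so the replacement check can
eventually succeed regardless of which request arrived first.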
4 files changed