Correct monitor pool synchronization

The previous implementation allowed a thread looking up a monitor
to see an uninitialized monitor_chunks_ list if the list had
just been resized. The obvious small fix would be to replace the
relaxed load in LookupMonitor with an acquire load. But the
extra fence (on ARM) may involve an appreciable performance hit.

This instead redesigns the data structure to avoid the race in
LookupMonitor, along with the need to use atomics there at all. The
down side is a little more address arithmetic in LookupMonitor(),
a mild decrease in the limit on the total number of monitors, and
use of one extra page, since we now always reserve space for the
first page worth of monitor chunk pointers.

To me, the new algorithm feels cleaner and easier to reason about.

Although this problem was externally reported, it seems unlikely
that it was responsible for frequent failures. It could only
be triggered when the monitor chunk list was resized, which should
be quite rare.

Bug: 28385279
Change-Id: I433155d91702878f6b114480eda1fbf09706f623
2 files changed