X86/X86_64: Switch to locked add from mfence
I finally received the answers about the performance of locked add vs.
mfence for Java memory semantics. Locked add has been faster than
mfence on all processors since the Pentium 4. Accordingly, I have made
the synchronization use locked add at all times and removed it as an
instruction set feature.
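For reference, a minimal sketch of the two barrier forms being compared.
This is illustrative only and is not the code in this change: the helper
names FenceWithMfence and FenceWithLockedAdd are hypothetical, and the
inline assembly assumes GCC or Clang. On non-x86 targets both helpers fall
back to a standard sequentially consistent fence so the sketch still
compiles.

```cpp
#include <atomic>

// Hypothetical helper: full barrier using the dedicated mfence instruction.
inline void FenceWithMfence() {
#if defined(__x86_64__) || defined(__i386__)
  __asm__ __volatile__("mfence" ::: "memory");
#else
  std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
}

// Hypothetical helper: full barrier using a lock-prefixed add of zero to
// the top of the stack. The add leaves memory unchanged; the lock prefix
// is what orders earlier stores before later loads on ordinary
// (write-back) memory.
inline void FenceWithLockedAdd() {
#if defined(__x86_64__)
  __asm__ __volatile__("lock; addl $0, (%%rsp)" ::: "memory", "cc");
#elif defined(__i386__)
  __asm__ __volatile__("lock; addl $0, (%%esp)" ::: "memory", "cc");
#else
  std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
}
```

Either helper could follow a Java volatile store to keep it from being
reordered with later loads; the locked add form touches a stack slot that
is very likely already in cache.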
Also add support in the optimizing compiler for barrier type
kNTStoreStore, which is used after non-temporal moves.
Signed-off-by: Mark Mendell <email@example.com>
9 files changed