[metrics] Add GC-work throughput metrics

Track GC throughput, measured as work done (bytes processed) per second.

Some other minor changes:
1) Reordered the ConcurrentCopying class members to make accesses to
them more cache-friendly. Counters accessed by the GC thread should
not share a cache line with counters meant for mutators if either
side modifies them.
2) Increased the max to 10'000 for the throughput histograms in case
the throughput reaches the GB/s range.
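
The cache-line separation in (1) can be illustrated with a minimal
sketch. This is not the actual ConcurrentCopying layout: the struct,
its member names, and the 64-byte line size are assumptions made for
illustration only.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Assumed cache-line size; the real value depends on the target CPU.
constexpr std::size_t kCacheLineSize = 64;

// Hypothetical counters, not the real ConcurrentCopying members.
struct GcCounters {
  // Written by the GC thread only.
  alignas(kCacheLineSize) std::atomic<uint64_t> gc_bytes_processed{0};
  // Written by mutator threads. The alignment forces this member onto
  // a new cache line, so mutator writes do not invalidate the line
  // holding the GC-thread counter (no false sharing between the two).
  alignas(kCacheLineSize) std::atomic<uint64_t> mutator_bytes_marked{0};
};

// Each counter occupies its own line, so the struct spans at least two.
static_assert(sizeof(GcCounters) >= 2 * kCacheLineSize,
              "counters must not share a cache line");

// Returns true if the two counters live on different cache lines.
inline bool OnSeparateCacheLines(const GcCounters& c) {
  auto a = reinterpret_cast<std::uintptr_t>(&c.gc_bytes_processed);
  auto b = reinterpret_cast<std::uintptr_t>(&c.mutator_bytes_marked);
  return a / kCacheLineSize != b / kCacheLineSize;
}
```

The cost of this layout is padding: each group of counters takes a
full cache line even if it holds a single 8-byte value.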

Bug: 191404436
Test: manual
Merged-In: Iefaf1106690b6bae670a3a917f61194b3fcacfe0
Change-Id: Iefaf1106690b6bae670a3a917f61194b3fcacfe0
(cherry picked from commit 7f0473851d9a8d5644fde8c483390a985c238433)
8 files changed