[scudo][standalone] Change the release loop for efficiency purposes

Summary:
On 32-bit, the release algorithm loops over the freelist of a size class
multiple times, which degrades performance when there are a lot of free
blocks.

This changes the release functions to loop only once over the freelist, at the
cost of using a little bit more memory for the release process: instead of
working on one region at a time, we pass the whole memory area covered by all
the regions for a given size class, and work on sub-areas of `RegionSize` in
this large area. For 64-bit, we just have a single sub-area encompassing the whole
region. Of course, not all the sub-areas within that large memory area will
belong to the class id we are working on, but those will just be left untouched
(which will not add to the RSS during the release process).
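
As a rough illustration of the single-pass idea described above (a simplified
sketch, not the code in this change): `findReleasablePages` and all of its
parameters other than `RegionSize` are hypothetical names. The sketch assumes
the block size divides the page size, so no block straddles a page, and that
every free block lies inside the area passed in.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not scudo's API. Walks the freelist exactly once and
// returns the start address of every page that is fully covered by free
// blocks. The area [Base, Base + NumberOfRegions * RegionSize) is split into
// sub-areas of RegionSize bytes, each with its own per-page counters.
std::vector<uintptr_t> findReleasablePages(
    const std::vector<uintptr_t> &FreeList, uintptr_t Base, size_t RegionSize,
    size_t NumberOfRegions, size_t BlockSize, size_t PageSize) {
  const size_t PagesPerRegion = RegionSize / PageSize;
  const size_t BlocksPerPage = PageSize / BlockSize;

  // One counter per page, grouped by sub-area. Sub-areas that belong to
  // another size class never see a free block, so their counters stay at 0.
  std::vector<std::vector<size_t>> Counters(
      NumberOfRegions, std::vector<size_t>(PagesPerRegion, 0));

  // Single pass over the freelist: attribute each block to its sub-area
  // and to its page within that sub-area.
  for (uintptr_t Block : FreeList) {
    const size_t Offset = Block - Base;
    const size_t Region = Offset / RegionSize;
    const size_t Page = (Offset % RegionSize) / PageSize;
    ++Counters[Region][Page];
  }

  // A page can be released only if all of its blocks are free.
  std::vector<uintptr_t> Pages;
  for (size_t R = 0; R < NumberOfRegions; ++R)
    for (size_t P = 0; P < PagesPerRegion; ++P)
      if (Counters[R][P] == BlocksPerPage)
        Pages.push_back(Base + R * RegionSize + P * PageSize);
  return Pages;
}
```

In this sketch the 64-bit case corresponds to `NumberOfRegions == 1`, and the
extra memory is the counter storage for sub-areas that turn out to hold no free
blocks of the class.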

Reviewers: pcc, cferris, hctim, eugenis

Subscribers: llvm-commits, #sanitizers

Tags: #sanitizers

Differential Revision: https://reviews.llvm.org/D83993

GitOrigin-RevId: 998334da2b1536e7c8f11c560770c8d4cfacb354
Change-Id: I30ab4d1e77201dada1daa67c41811ec513c87462
Merged-In: I30ab4d1e77201dada1daa67c41811ec513c87462
(cherry picked from commit 8a35495108db564b14fdfcc69406407643197a5d)
4 files changed