simpleperf: use 8-byte aligned stack size when resizing stack data.

When generating sample records for 32-bit ARM processes,
dyn_stack_size may not be 8-byte aligned. Because dyn_stack_size is
used to calculate the new stack size, the new stack size may not be
8-byte aligned either, which can cause an alignment error later.

So make sure the new stack size is 8-byte aligned.
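
A minimal sketch of the rounding, for illustration only. AlignUp and
NewStackSize are hypothetical stand-ins, not the simpleperf helpers; the
sketch assumes the alignment is a power of two and that stack_size_limit
is itself a multiple of 8, so rounding up never exceeds the limit.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for the Align() helper used in the patch.
// Rounds `value` up to the next multiple of `alignment`.
// Assumes `alignment` is a power of two.
inline uint64_t AlignUp(uint64_t value, uint64_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Mirrors the patched expression: clamp dyn_stack_size to the limit,
// then round the result up to an 8-byte boundary.
inline uint64_t NewStackSize(uint64_t dyn_stack_size, uint64_t stack_size_limit) {
  return AlignUp(std::min<uint64_t>(dyn_stack_size, stack_size_limit), 8);
}
```

With these definitions, a 4-byte aligned dyn_stack_size such as 4092 becomes
4096, while a value that is already 8-byte aligned is left unchanged.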

Bug: 208187192
Bug: 210384787
Test: run simpleperf_unit_test
Change-Id: Ibc2f99ba766419fdb491c49317b9fb5ae13138ef
(cherry picked from commit 9290fc7b9d3609b854d2791428b2385bba427906)
diff --git a/simpleperf/RecordReadThread.cpp b/simpleperf/RecordReadThread.cpp
index 16af929..a822930 100644
--- a/simpleperf/RecordReadThread.cpp
+++ b/simpleperf/RecordReadThread.cpp
@@ -540,7 +540,7 @@
       // space in each sample to store stack data. However, a thread may use less stack than 64K.
       // So not all the 64K stack data in a sample is valid, and we only need to keep valid stack
       // data, whose size is dyn_stack_size.
-      uint64_t new_stack_size = std::min<uint64_t>(dyn_stack_size, stack_size_limit);
+      uint64_t new_stack_size = Align(std::min<uint64_t>(dyn_stack_size, stack_size_limit), 8);
       if (stack_size > new_stack_size) {
         // Remove part of the stack data.
         perf_event_header new_header = header;