time_in_state: prevent corruption of active CPU counts

On devices using suspend-to-RAM, we can see back-to-back sched_switch
invocations with prev_pid==0 on the context switches immediately
before a CPU is disabled and after it is re-enabled. This breaks the
active CPU counting logic because the counts are incremented twice for
each affected CPU with no decrement in between.

Add a check that prev_pid is consistent with the next_pid seen on the
previous context switch on the same CPU. When old_last==0, skip the
check, since this must be the first invocation on this CPU after
attach and the pid in cpu_last_pid_map is either uninitialized (if
this is the initial attach after boot) or stale (if we're reattaching
after a system_server restart).
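
The check described above can be sketched in plain userspace C. This is
an illustration only, not the BPF program from this change: struct
cpu_state and on_sched_switch() are hypothetical names, the per-CPU maps
are modeled as fields of one struct, and the increment/decrement rules
and the early return on a mismatch are assumptions about how the
counting logic behaves.

```c
#include <stdint.h>

/* Hypothetical per-CPU state; stands in for the per-CPU BPF maps. */
struct cpu_state {
    uint64_t last_update_ns; /* 0 until the first switch after attach */
    int last_next_pid;       /* next_pid seen on the previous switch */
    int nr_active;           /* 1 while this CPU is counted as active */
};

/* Assumed counting rules: leaving idle (prev_pid == 0) increments,
 * entering idle (next_pid == 0) decrements. */
static void on_sched_switch(struct cpu_state *s, uint64_t now_ns,
                            int prev_pid, int next_pid)
{
    uint64_t old_last = s->last_update_ns;
    int expected_prev = s->last_next_pid;

    /* Always record the latest switch, even if it is ignored below. */
    s->last_update_ns = now_ns;
    s->last_next_pid = next_pid;

    /* Skip the consistency check on the first invocation after attach
     * (old_last == 0), because the stored pid is uninitialized or
     * stale. Otherwise, leave the counts alone when prev_pid does not
     * match the previous next_pid. */
    if (old_last != 0 && prev_pid != expected_prev)
        return;

    if (prev_pid == 0 || (old_last == 0 && next_pid != 0))
        s->nr_active++;
    if (next_pid == 0)
        s->nr_active--;
}
```

Feeding this sketch a switch with prev_pid==0 followed directly by
another switch with prev_pid==0 leaves nr_active unchanged on the
second call, so the later switch to idle brings the count back to zero
rather than leaving it stuck at one.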

Also, add a test case to time_in_state_test to ensure that this case
is handled correctly.

Bug: 182272121
Test: let bramble suspend, then dump values in
/sys/fs/bpf/map_time_in_state_nr_active_map and
/sys/fs/bpf/map_time_in_state_policy_nr_active_map and check logcat
for negative deltas
Test: libtimeinstate_test
Test: bpf-time-in-state-test passes; new test case fails without the
fix
Signed-off-by: Connor O'Brien <connoro@google.com>

Change-Id: I7cffd127c8470a92328992535b6d387a2ec3211c
2 files changed