ANDROID: sched/tune: fix boost_group spin_lock re-initialization

In:

  c5616f2f874   ANDROID: sched/tune: Initialize raw_spin_lock
                in boosted_groups

a spin_lock is initialized on each CPU every time a boost_group is
activated (i.e. a cgroup is created).

However, those spin_locks are already initialized at boot time by
schedtune_init_cgroups(), which also sets schedtune_initialized to true,
thus enabling the task accounting done by schedtune_{en,de}queue_task().
This means that an already initialized and in-use spin_lock is wrongly
re-initialized, potentially leading to a:
   BUG: spinlock already unlocked on CPU
in the (unlikely in AOSP) case of a race between schedtune cgroup
creation and task enqueue/dequeue.
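
For illustration, the race boils down to the following pattern
(simplified sketch; the fast-path locking calls match the ones used in
sched/tune.c, everything else is omitted):

  /* Fast path: schedtune_{en,de}queue_task(), per-CPU task accounting */
  raw_spin_lock_irqsave(&bg->lock, irq_flags);
  /* ... update bg->group[idx].tasks ... */
  raw_spin_unlock_irqrestore(&bg->lock, irq_flags);

  /* Slow path, before this fix: cgroup creation could concurrently
   * re-initialize the very same, possibly held, spinlock ... */
  raw_spin_lock_init(&bg->lock);

  /* ... so the unlock in the fast path then finds an already
   * released lock and triggers the BUG above. */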

This probably happened because the fix provided by c5616f2f874 was just
the wrong cure for a different issue: the missing one-time
initialization of the per-CPU spinlocks in schedtune_init_cgroups().
All these fixes happened on v4.4 and have been forward-ported to the
current v4.9 base.

Let's fix this properly by:
- removing the unnecessary spinlock re-initialization in:
  schedtune_boostgroup_init()
- adding a new "valid" flag to mark which boost_groups are currently
  used to track a valid cgroup.

This patch also adds better documentation of the data structures in use
and of the locking strategy adopted to synchronize the fast and slow paths.

Change-Id: I3c2a256693b12b317373bbc032ed46e620f79ee8
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Reported-by: Stanley Shih <stanley.shih@mstarsemi.com>

[
The first of the two fixes listed above has already been merged by:

   commit f6bec4e8c771 ("Revert "ANDROID: sched/tune:
                         Initialize raw_spin_lock in boosted_groups")

which, in conjunction with:

   commit 751e50939134 ("ANDROID: sched: tune:
                         Fix lacking spinlock initialization")

provides the correct initialization of the boostgroups spinlocks.

Let's keep the changelog as it is to keep track of the originally
intended fix, as well as to document the required locking strategy.
]
Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
diff --git a/kernel/sched/tune.c b/kernel/sched/tune.c
index 62f3085..fbdd00a 100644
--- a/kernel/sched/tune.c
+++ b/kernel/sched/tune.c
@@ -108,6 +108,64 @@
 
 /*
  * EAS scheduler tunables for task groups.
+ *
+ * When CGroup support is enabled, we have to synchronize two different
+ * paths:
+ *  - slow path: where CGroups are created/updated/removed
+ *  - fast path: where tasks in a CGroup are accounted
+ *
+ * The slow path tracks (a limited number of) CGroups and maps each one to a
+ * "boost_group" index. The fast path accounts the tasks currently RUNNABLE
+ * in each "boost_group".
+ *
+ * Once a new CGroup is created, a boost group idx is assigned and the
+ * corresponding "boost_group" is marked as valid on each CPU.
+ * Once a CGroup is released, the corresponding "boost_group" is marked as
+ * invalid on each CPU. The CPU boost value (boost_max) is aggregated by
+ * considering only valid boost_groups with a non-null tasks counter.
+ *
+ * .:: Locking strategy
+ *
+ * The fast path uses a spin lock for each CPU boost_group which protects the
+ * tasks counter.
+ *
+ * The "valid" and "boost" values of each CPU boost_group are instead
+ * protected by the RCU lock provided by the CGroups callbacks. Thus, only the
+ * slow path can access and modify the boost_group attributes of each CPU.
+ * The fast path will pick up the most recent values at the next scheduling
+ * event (i.e. enqueue/dequeue).
+ *
+ *                                                        |
+ *                                             SLOW PATH  |   FAST PATH
+ *                              CGroup add/update/remove  |   Scheduler enqueue/dequeue events
+ *                                                        |
+ *                                                        |
+ *                                                        |     DEFINE_PER_CPU(struct boost_groups)
+ *                                                        |     +--------------+----+---+----+----+
+ *                                                        |     |  idle        |    |   |    |    |
+ *                                                        |     |  boost_max   |    |   |    |    |
+ *                                                        |  +---->lock        |    |   |    |    |
+ *  struct schedtune                  allocated_groups    |  |  |  group[    ] |    |   |    |    |
+ *  +------------------------------+         +-------+    |  |  +--+---------+-+----+---+----+----+
+ *  | idx                          |         |       |    |  |     |  valid  |
+ *  | boost / prefer_idle          |         |       |    |  |     |  boost  |
+ *  | perf_{boost/constraints}_idx | <---------+(*)  |    |  |     |  tasks  | <------------+
+ *  | css                          |         +-------+    |  |     +---------+              |
+ *  +-+----------------------------+         |       |    |  |     |         |              |
+ *    ^                                      |       |    |  |     |         |              |
+ *    |                                      +-------+    |  |     +---------+              |
+ *    |                                      |       |    |  |     |         |              |
+ *    |                                      |       |    |  |     |         |              |
+ *    |                                      +-------+    |  |     +---------+              |
+ *    | zmalloc                              |       |    |  |     |         |              |
+ *    |                                      |       |    |  |     |         |              |
+ *    |                                      +-------+    |  |     +---------+              |
+ *    +                              BOOSTGROUPS_COUNT    |  |     BOOSTGROUPS_COUNT        |
+ *  schedtune_boostgroup_init()                           |  +                              |
+ *                                                        |  schedtune_{en,de}queue_task()  |
+ *                                                        |                                 +
+ *                                                        |          schedtune_tasks_update()
+ *                                                        |
  */
 
 /* SchdTune tunables for a group of tasks */
@@ -226,6 +284,8 @@
 	/* Maximum boost value for all RUNNABLE tasks on a CPU */
 	int boost_max;
 	struct {
+		/* True when this boost group maps an actual cgroup */
+		bool valid;
 		/* The boost for tasks on that boost group */
 		int boost;
 		/* Count of RUNNABLE tasks on that boost group */
@@ -250,6 +310,11 @@
 	/* The root boost group is always active */
 	boost_max = bg->group[0].boost;
 	for (idx = 1; idx < BOOSTGROUPS_COUNT; ++idx) {
+
+		/* Ignore boost groups not mapping an actual cgroup */
+		if (!bg->group[idx].valid)
+			continue;
+
 		/*
 		 * A boost group affects a CPU only if it has
 		 * RUNNABLE tasks on that CPU
@@ -259,6 +324,7 @@
 
 		boost_max = max(boost_max, bg->group[idx].boost);
 	}
+
 	/* Ensures boost_max is non-negative when all cgroup boost values
 	 * are neagtive. Avoids under-accounting of cpu capacity which may cause
 	 * task stacking and frequency spikes.*/
@@ -278,6 +344,9 @@
 	for_each_possible_cpu(cpu) {
 		bg = &per_cpu(cpu_boost_groups, cpu);
 
+		/* A boost group update always refers to an active cgroup */
+		BUG_ON(!bg->group[idx].valid);
+
 		/*
 		 * Keep track of current boost values to compute the per CPU
 		 * maximum only when it has been affected by the new value of
@@ -642,7 +711,7 @@
 	for_each_possible_cpu(cpu) {
 		bg = &per_cpu(cpu_boost_groups, cpu);
 		bg->group[idx].boost = 0;
-		bg->group[idx].tasks = 0;
+		bg->group[idx].valid = true;
 	}
 
 	/* Keep track of allocated boost groups */
@@ -697,6 +766,7 @@
 	/* Reset per CPUs boost group support */
 	for_each_possible_cpu(cpu) {
 		bg = &per_cpu(cpu_boost_groups, cpu);
+		bg->group[st->idx].valid = false;
 		bg->group[st->idx].boost = 0;
 	}
 
@@ -733,6 +803,7 @@
 	for_each_possible_cpu(cpu) {
 		bg = &per_cpu(cpu_boost_groups, cpu);
 		memset(bg, 0, sizeof(struct boost_groups));
+		bg->group[0].valid = true;
 		raw_spin_lock_init(&bg->lock);
 	}