android12-5.10 July 2022 release 6

Artifacts:
  https://ci.android.com/builds/submitted/9176214/kernel_aarch64/latest
UPSTREAM: psi: Fix psi state corruption when schedule() races with cgroup move

4117cebf1a9f ("psi: Optimize task switch inside shared cgroups")
introduced a race condition that corrupts internal psi state. This
manifests as kernel warnings, sometimes followed by bogusly high IO
pressure:

  psi: task underflow! cpu=1 t=2 tasks=[0 0 0 0] clear=c set=0
  (schedule() decreasing RUNNING and ONCPU, both of which are 0)

  psi: incosistent task state! task=2412744:systemd cpu=17 psi_flags=e clear=3 set=0
  (cgroup_move_task() clearing MEMSTALL and IOWAIT, but task is MEMSTALL | RUNNING | ONCPU)

What the offending commit does is batch the two psi callbacks in
schedule() to reduce the number of cgroup tree updates. When prev is
deactivated and removed from the runqueue, nothing is done in psi at
first; when the task switch completes, TSK_RUNNING and TSK_IOWAIT are
updated along with TSK_ONCPU.

However, the deactivation and the task switch inside schedule() aren't
atomic: pick_next_task() may drop the rq lock for load balancing. When
this happens, cgroup_move_task() can run after the task has been
physically dequeued, but the psi updates are still pending. Since it
looks at the task's scheduler state, it doesn't move everything to the
new cgroup that the task switch that follows is about to clear from
it. cgroup_move_task() will leak the TSK_RUNNING count in the old
cgroup, and psi_sched_switch() will underflow it in the new cgroup.

A similar thing can happen for iowait. TSK_IOWAIT is usually set when
a p->in_iowait task is dequeued, but again this update is deferred to
the switch. cgroup_move_task() can see an unqueued p->in_iowait task
and move a non-existent TSK_IOWAIT. This results in the inconsistent
task state warning, as well as a counter underflow that will result in
permanent IO ghost pressure being reported.

Fix this bug by making cgroup_move_task() use task->psi_flags instead
of looking at the potentially mismatching scheduler state.

[ We used the scheduler state historically in order to not rely on
  task->psi_flags for anything but debugging. But that ship has sailed
  anyway, and this is simpler and more robust.

  We previously already batched TSK_ONCPU clearing with the
  TSK_RUNNING update inside the deactivation call from schedule(). But
  that ordering was safe and didn't result in TSK_ONCPU corruption:
  unlike most places in the scheduler, cgroup_move_task() only checked
  task_current() and handled TSK_ONCPU if the task was still queued. ]

Bug: 253347377
Bug: 253519189
Fixes: 4117cebf1a9f ("psi: Optimize task switch inside shared cgroups")
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210503174917.38579-1-hannes@cmpxchg.org
(cherry picked from commit d583d360a620e6229422b3455d0be082b8255f5e)
Change-Id: Id0a292058d4bffb716d8e1496f72139e8d435410
1 file changed
tree: 1b1f754324c7874f3a96c66f01d33e16ce698a88
  1. android/
  2. arch/
  3. block/
  4. certs/
  5. crypto/
  6. Documentation/
  7. drivers/
  8. fs/
  9. include/
  10. init/
  11. ipc/
  12. kernel/
  13. lib/
  14. LICENSES/
  15. mm/
  16. net/
  17. samples/
  18. scripts/
  19. security/
  20. sound/
  21. tools/
  22. usr/
  23. virt/
  24. .clang-format
  25. .cocciconfig
  26. .get_maintainer.ignore
  27. .gitattributes
  28. .gitignore
  29. .mailmap
  30. build.config.aarch64
  31. build.config.allmodconfig
  32. build.config.allmodconfig.aarch64
  33. build.config.allmodconfig.arm
  34. build.config.allmodconfig.x86_64
  35. build.config.amlogic
  36. build.config.arm
  37. build.config.common
  38. build.config.db845c
  39. build.config.gki
  40. build.config.gki-debug.aarch64
  41. build.config.gki-debug.x86_64
  42. build.config.gki.aarch64
  43. build.config.gki.aarch64.fips140
  44. build.config.gki.aarch64.fips140_eval_testing
  45. build.config.gki.x86_64
  46. build.config.gki_kasan
  47. build.config.gki_kasan.aarch64
  48. build.config.gki_kasan.x86_64
  49. build.config.gki_kprobes
  50. build.config.gki_kprobes.aarch64
  51. build.config.gki_kprobes.x86_64
  52. build.config.hikey960
  53. build.config.khwasan
  54. build.config.rockchip
  55. build.config.x86_64
  56. COPYING
  57. CREDITS
  58. Kbuild
  59. Kconfig
  60. MAINTAINERS
  61. Makefile
  62. OWNERS
  63. README
  64. README.md
README.md

How do I submit patches to Android Common Kernels

  1. BEST: Make all of your changes to upstream Linux. If appropriate, backport to the stable releases. These patches will be merged automatically in the corresponding common kernels. If the patch is already in upstream Linux, post a backport of the patch that conforms to the patch requirements below.

    • Do not send patches upstream that contain only symbol exports. To be considered for upstream Linux, additions of EXPORT_SYMBOL_GPL() require an in-tree modular driver that uses the symbol -- so include the new driver or changes to an existing driver in the same patchset as the export.
    • When sending patches upstream, the commit message must contain a clear case for why the patch is needed and beneficial to the community. Enabling out-of-tree drivers or functionality is not not a persuasive case.
  2. LESS GOOD: Develop your patches out-of-tree (from an upstream Linux point-of-view). Unless these are fixing an Android-specific bug, these are very unlikely to be accepted unless they have been coordinated with kernel-team@android.com. If you want to proceed, post a patch that conforms to the patch requirements below.

Common Kernel patch requirements

  • All patches must conform to the Linux kernel coding standards and pass script/checkpatch.pl
  • Patches shall not break gki_defconfig or allmodconfig builds for arm, arm64, x86, x86_64 architectures (see https://source.android.com/setup/build/building-kernels)
  • If the patch is not merged from an upstream branch, the subject must be tagged with the type of patch: UPSTREAM:, BACKPORT:, FROMGIT:, FROMLIST:, or ANDROID:.
  • All patches must have a Change-Id: tag (see https://gerrit-review.googlesource.com/Documentation/user-changeid.html)
  • If an Android bug has been assigned, there must be a Bug: tag.
  • All patches must have a Signed-off-by: tag by the author and the submitter

Additional requirements are listed below based on patch type

Requirements for backports from mainline Linux: UPSTREAM:, BACKPORT:

  • If the patch is a cherry-pick from Linux mainline with no changes at all
    • tag the patch subject with UPSTREAM:.
    • add upstream commit information with a (cherry picked from commit ...) line
    • Example:
      • if the upstream commit message is
        important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>
  • then Joe Smith would upload the patch for the common kernel as
        UPSTREAM: important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>

        Bug: 135791357
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        (cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1)
        Signed-off-by: Joe Smith <joe.smith@foo.org>
  • If the patch requires any changes from the upstream version, tag the patch with BACKPORT: instead of UPSTREAM:.
    • use the same tags as UPSTREAM:
    • add comments about the changes under the (cherry picked from commit ...) line
    • Example:
        BACKPORT: important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>

        Bug: 135791357
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        (cherry picked from commit c31e73121f4c1ec41143423ac6ce3ce6dafdcec1)
        [joe: Resolved minor conflict in drivers/foo/bar.c ]
        Signed-off-by: Joe Smith <joe.smith@foo.org>

Requirements for other backports: FROMGIT:, FROMLIST:,

  • If the patch has been merged into an upstream maintainer tree, but has not yet been merged into Linux mainline
    • tag the patch subject with FROMGIT:
    • add info on where the patch came from as (cherry picked from commit <sha1> <repo> <branch>). This must be a stable maintainer branch (not rebased, so don't use linux-next for example).
    • if changes were required, use BACKPORT: FROMGIT:
    • Example:
      • if the commit message in the maintainer tree is
        important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>
  • then Joe Smith would upload the patch for the common kernel as
        FROMGIT: important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>

        Bug: 135791357
        (cherry picked from commit 878a2fd9de10b03d11d2f622250285c7e63deace
         https://git.kernel.org/pub/scm/linux/kernel/git/foo/bar.git test-branch)
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        Signed-off-by: Joe Smith <joe.smith@foo.org>
  • If the patch has been submitted to LKML, but not accepted into any maintainer tree
    • tag the patch subject with FROMLIST:
    • add a Link: tag with a link to the submittal on lore.kernel.org
    • add a Bug: tag with the Android bug (required for patches not accepted into a maintainer tree)
    • if changes were required, use BACKPORT: FROMLIST:
    • Example:
        FROMLIST: important patch from upstream

        This is the detailed description of the important patch

        Signed-off-by: Fred Jones <fred.jones@foo.org>

        Bug: 135791357
        Link: https://lore.kernel.org/lkml/20190619171517.GA17557@someone.com/
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        Signed-off-by: Joe Smith <joe.smith@foo.org>

Requirements for Android-specific patches: ANDROID:

  • If the patch is fixing a bug to Android-specific code
    • tag the patch subject with ANDROID:
    • add a Fixes: tag that cites the patch with the bug
    • Example:
        ANDROID: fix android-specific bug in foobar.c

        This is the detailed description of the important fix

        Fixes: 1234abcd2468 ("foobar: add cool feature")
        Change-Id: I4caaaa566ea080fa148c5e768bb1a0b6f7201c01
        Signed-off-by: Joe Smith <joe.smith@foo.org>
  • If the patch is a new feature
    • tag the patch subject with ANDROID:
    • add a Bug: tag with the Android bug (required for android-specific features)