Ensure all processes are killed and reaped before unmounting

During shutdown, init stops services by sending SIGTERM or SIGKILL to the process groups they lead. By default, child processes of a service belong to the service's process group, so stopping the service kills everything under it.

However, this assumption does not always hold. A process can create a new process group by calling setpgid(3). In that case, it can outlive its parent (the service) during shutdown. Such processes are killed only at the very last moment, when init issues `echo i > /proc/sysrq-trigger`. To make things worse, init doesn't reap those rogue processes, presumably because it is on a somewhat emergency path.

As a result, the following catastrophic sequence of actions can occur right before the kernel enters a reboot:

1. The rogue process has issued an I/O, and the kernel is in the process of handling it.
2. init sends SIGKILL (via `echo i > /proc/sysrq-trigger`), but because of #1 the process is not killed until the I/O is done.
3. init, without waiting for the I/O to complete, tries to unmount the partitions and fails (as expected).
4. init has no choice but to shut the underlying hardware down by issuing F2FS_IOC_SHUTDOWN, and then jumps to the kernel.
5. The kernel may still see #1 ongoing, e.g. a lock may be held, ...

This change tries to fix this by ensuring that all processes, even those in new process groups, are killed and reaped before the partitions are unmounted.

Bug: 420528003
Test: follow the repro steps in the bug
Flag: EXEMPT bug fix
(cherry picked from https://googleplex-android-review.googlesource.com/q/commit:ae8ff5b26e40802d63705402cab38c7b3c40d3ac)
Merged-In: I2988dd17844e25900dacbda1c293d4e3c269eb12
Change-Id: I2988dd17844e25900dacbda1c293d4e3c269eb12
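
The process-group escape described in this message can be reproduced outside of init. The following is a minimal standalone sketch, not code from init: `SpawnSleeper`, `TerminatedBy`, `DemoResult` and `RunProcessGroupDemo` are hypothetical names introduced for illustration. It assumes a POSIX system. One child acts as the group leader (the "service"), one stays in the leader's group, and one joins the group and then leaves it by calling `setpgid(0, 0)`. A SIGTERM sent to the group terminates the first two only; the third has to be killed and reaped separately.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that joins process group `join_pgid` (0 means "create my own group"),
// optionally escapes into a new group of its own, and then sleeps until it is signalled.
// Returns the child's pid, or -1 if the child could not be set up.
pid_t SpawnSleeper(pid_t join_pgid, bool escape) {
    int ready[2];
    if (pipe(ready) != 0) return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }
    if (pid == 0) {
        close(ready[0]);
        signal(SIGTERM, SIG_DFL);  // make sure SIGTERM terminates this child
        bool ok = setpgid(0, join_pgid) == 0;
        if (ok && escape) ok = setpgid(0, 0) == 0;  // leave the service's group
        char c = 'r';
        if (!ok || write(ready[1], &c, 1) != 1) _exit(1);
        close(ready[1]);
        for (;;) pause();
    }
    // Parent: wait until the child reports that its process group is set up.
    close(ready[1]);
    char c;
    bool ok = read(ready[0], &c, 1) == 1;
    close(ready[0]);
    if (!ok) {
        waitpid(pid, nullptr, 0);  // the child exited early; reap it
        return -1;
    }
    return pid;
}

// Blocks until `pid` exits and reports whether it was terminated by signal `sig`.
bool TerminatedBy(pid_t pid, int sig) {
    int status = 0;
    return waitpid(pid, &status, 0) == pid && WIFSIGNALED(status) && WTERMSIG(status) == sig;
}

struct DemoResult {
    bool leader_terminated = false;
    bool member_terminated = false;
    bool rogue_survived = false;
    bool rogue_reaped = false;
};

DemoResult RunProcessGroupDemo() {
    DemoResult r;
    pid_t leader = SpawnSleeper(0, /*escape=*/false);  // plays the role of the service
    if (leader < 0) return r;
    pid_t member = SpawnSleeper(leader, /*escape=*/false);  // stays in the service's group
    pid_t rogue = SpawnSleeper(leader, /*escape=*/true);    // joins, then escapes

    // Signal the whole process group, as is done when a service is stopped.
    kill(-leader, SIGTERM);
    r.leader_terminated = TerminatedBy(leader, SIGTERM);
    if (member > 0) r.member_terminated = TerminatedBy(member, SIGTERM);

    if (rogue > 0) {
        // The rogue child left the group, so the group-wide SIGTERM did not reach it.
        int status = 0;
        r.rogue_survived = waitpid(rogue, &status, WNOHANG) == 0;
        if (r.rogue_survived) {
            // Kill it explicitly and wait for it, so nothing is left running afterwards.
            kill(rogue, SIGKILL);
            r.rogue_reaped = TerminatedBy(rogue, SIGKILL);
        }
    }
    return r;
}
```

Calling `RunProcessGroupDemo()` from a small `main` and printing the four flags is enough to observe the behavior. The explicit kill followed by a blocking `waitpid` corresponds to the "kill and reap" step that this change performs before unmounting.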

commit: ff4908780c85345715639dbd4ee658032ee49d58
author: Jiyong Park <jiyong@google.com> Sun Jun 08 21:58:05 2025 +0900
committer: Android Build Coastguard Worker <android-build-coastguard-worker@google.com> Wed Jun 11 17:27:56 2025 -0700
tree: 02452bfa5a1a1b09cce31c313cf699ac430d2e51
parent: 5e42e92bef7562724b352846fe6ada9a624467c9
diff --git a/init/reboot.cpp b/init/reboot.cpp
index b3322f6..76787ea 100644
--- a/init/reboot.cpp
+++ b/init/reboot.cpp

@@ -346,18 +346,38 @@
     return UMOUNT_STAT_ERROR;
 }
 
-static UmountStat UmountPartitions(std::chrono::milliseconds timeout) {
-    // Terminate (SIGTERM) the services before unmounting partitions.
-    // If the processes block the signal, then partitions will eventually fail
-    // to unmount and then we fallback to SIGKILL the services.
-    //
-    // Hence, give the services a chance for a graceful shutdown before sending SIGKILL.
+static void KillAllProcesses(bool force) {
+    // SIGKILL on force == true. SIGTERM if not.
+    WriteStringToFile(force ? "i" : "e", PROC_SYSRQ);
+}
+
+static UmountStat UmountPartitions(std::chrono::milliseconds timeout, bool ota_update_in_progress) {
+    // If we have no time left, kill them all as fast as possible by sending SIGKILL. Otherwise
+    // SIGTERM so that they can gracefully exit.
+    bool immediate = timeout == 0ms;
+    // Terminate the services before unmounting partitions. If we have some time left, give them a
+    // chance for a graceful shutdown by sending SIGTERM. If not, kill immediately by sending
+    // SIGKILL.
     for (const auto& s : ServiceList::GetInstance()) {
         if (s->IsShutdownCritical()) {
             LOG(INFO) << "Shutdown service: " << s->name();
-            s->Terminate();
+            if (immediate) {
+                s->Timeout();
+            } else {
+                s->Terminate();
+            }
         }
     }
+    // Below is to ensure that all remaining processes (except init) are SIGKILL'ed or SIGTERM'ed.
+    // This is because some children of the services above might have created new process groups.
+    // Note that, each service by default is a process group leader, and we send a signal to the
+    // process group when killing the service. So, if some children created their own process group,
+    // they don't get killed. Below is to kill even such ones.
+    //
+    // However, if OTA update is in progress we NEVER send SIGKILL because snapuserd will be serving
+    // I/Os and therefore killing it will ruin the update. snapuserd ignores SIGTERM.
+    KillAllProcesses(immediate && !ota_update_in_progress);
+
     ReapAnyOutstandingChildren();
 
     Timer t;
@@ -366,12 +386,12 @@
      */
     while (true) {
         // force umount operation if timeout is not set
-        UmountStat stat = TryUmountPartitions(/*force=*/timeout == 0ms);
+        UmountStat stat = TryUmountPartitions(immediate);
         if (stat == UMOUNT_STAT_SUCCESS) {
             return UMOUNT_STAT_SUCCESS;
         }
 
-        if (stat == UMOUNT_STAT_NOT_AVAILABLE || timeout == 0ms) {
+        if (stat == UMOUNT_STAT_NOT_AVAILABLE || immediate) {
             return UMOUNT_STAT_ERROR;
         }
 
@@ -382,10 +402,6 @@
     }
 }
 
-static void KillAllProcesses() {
-    WriteStringToFile("i", PROC_SYSRQ);
-}
-
 // Reboot/shutdown monitor thread
 static void RebootMonitorThread(unsigned int cmd, const Timer& shutdown_timer) {
     // We want quite a long timeout here since the "sync" in the calling
@@ -521,7 +537,7 @@
             ota_update_in_progress = true;
         }
     }
-    UmountStat stat = UmountPartitions(timeout - t.duration());
+    UmountStat stat = UmountPartitions(timeout - t.duration(), ota_update_in_progress);
     if (stat != UMOUNT_STAT_SUCCESS) {
         // Do not delete: Critical log for reboot_fs_integrity_test.
         KLOG_INFO(LOG_TAG, "umount timeout, last resort, kill all and try");
@@ -542,7 +558,7 @@
             bool umount_dynamic_partitions = UmountDynamicPartitions(dynamic_partitions);
             LOG(INFO) << "Sending SIGTERM to all process";
             // Send SIGTERM to all processes except init
-            WriteStringToFile("e", PROC_SYSRQ);
+            KillAllProcesses(/* force */ false);
             // Wait for processes to terminate
             std::this_thread::sleep_for(1s);
             // Try one more attempt to umount other partitions which failed
@@ -552,9 +568,9 @@
             }
             return stat;
         }
-        KillAllProcesses();
+        KillAllProcesses(/* force */ true);
         // even if it succeeds, still it is timeout and do not run fsck with all processes killed
-        UmountStat st = UmountPartitions(0ms);
+        UmountStat st = UmountPartitions(0ms, ota_update_in_progress);
         if ((st != UMOUNT_STAT_SUCCESS) && DUMP_ON_UMOUNT_FAILURE) DumpUmountDebuggingInfo();
     }
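
The signal choice made by the new `UmountPartitions` reduces to a small truth table. The sketch below restates it as a pure function so the cases are easy to check; `SysrqCommandFor` is a hypothetical helper that does not exist in init. It only mirrors the expression `KillAllProcesses(immediate && !ota_update_in_progress)` and the `force ? "i" : "e"` mapping from the diff above.

```cpp
#include <string>

// Hypothetical helper, not part of init. Returns the string that would be written to
// /proc/sysrq-trigger: "i" sends SIGKILL to all processes except init, "e" sends SIGTERM.
std::string SysrqCommandFor(bool immediate, bool ota_update_in_progress) {
    // SIGKILL is used only when no time is left and no OTA update is in progress,
    // because killing snapuserd while it serves I/O would ruin the update.
    const bool force = immediate && !ota_update_in_progress;
    return force ? "i" : "e";
}
```

Only the combination of `immediate == true` and `ota_update_in_progress == false` yields `"i"`; the other three combinations fall back to `"e"`.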