 <html devsite>
   <head>
     <title>A/B (Seamless) System Updates</title>
     <meta name="project_path" value="/_project.yaml" />
     <meta name="book_path" value="/_book.yaml" />
   </head>
   <body>
   <!--
       Copyright 2018 The Android Open Source Project

       Licensed under the Apache License, Version 2.0 (the "License");
       you may not use this file except in compliance with the License.
       You may obtain a copy of the License at

           http://www.apache.org/licenses/LICENSE-2.0

       Unless required by applicable law or agreed to in writing, software
       distributed under the License is distributed on an "AS IS" BASIS,
       WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
       See the License for the specific language governing permissions and
       limitations under the License.
   -->

     <p>
       A/B system updates, also known as seamless updates, ensure a workable
       booting system remains on the disk during an <a href="/devices/tech/ota/index.html">
       over-the-air (OTA) update</a>. This approach reduces the likelihood of
       an inactive device after an update, which means fewer device
       replacements and device reflashes at repair and warranty centers. Other
       commercial-grade operating systems such as
       <a href="https://www.chromium.org/chromium-os">ChromeOS</a> also use A/B
       updates successfully.
     </p>

     <p>
       For more information about A/B system updates and how they work, see
       <a href="#slots">Partition selection (slots)</a>.
     </p>

     <p>A/B system updates provide the following benefits:</p>

     <ul>
       <li>
         OTA updates can occur while the system is running, without
         interrupting the user. Users can continue to use their devices during
         an OTA&mdash;the only downtime during an update is when the device
         reboots into the updated disk partition.
       </li>
       <li>
         After an update, rebooting takes no longer than a regular reboot.
       </li>
       <li>
         If an OTA fails to apply (for example, because of a bad flash), the
         user will not be affected. The user will continue to run the old OS,
         and the client is free to re-attempt the update.
       </li>
       <li>
         If an OTA update is applied but fails to boot, the device will reboot
         back into the old partition and remain usable. The client is free to
         re-attempt the update.
       </li>
       <li>
         Any errors (such as I/O errors) affect only the <strong>unused</strong>
         partition set and can be retried. Such errors also become less likely
         because the I/O load is deliberately low to avoid degrading the user
         experience.
       </li>
       <li>
         Updates can be streamed to A/B devices, removing the need to download
         the package before installing it. Streaming means it's not necessary
         for the user to have enough free space to store the update package on
         <code>/data</code> or <code>/cache</code>.
       </li>
       <li>
         The cache partition is no longer used to store OTA update packages, so
         there is no need to ensure that the cache partition is large enough for
         future updates.
       </li>
       <li>
         <a href="/security/verifiedboot/dm-verity.html">dm-verity</a>
         guarantees a device will boot an uncorrupted image. If a device
         doesn't boot due to a bad OTA or dm-verity issue, the device can
         reboot into an old image. (Android <a href="/security/verifiedboot/">
         Verified Boot</a> does not require A/B updates.)
       </li>
     </ul>

     <h2 id="overview">About A/B system updates</h2>

       <p>
         A/B updates require changes to both the client and the system. The OTA
         package server, however, should not require changes: update packages
         are still served over HTTPS. For devices using Google's OTA
         infrastructure, the system changes are all in AOSP, and the client code
         is provided by Google Play services. OEMs not using Google's OTA
         infrastructure will be able to reuse the AOSP system code but will
         need to supply their own client.
       </p>

       <p>
         For OEMs supplying their own client, the client needs to:
       </p>

       <ul>
         <li>
           Decide when to take an update. Because A/B updates happen in the
           background, they are no longer user-initiated. To avoid disrupting
           users, it is recommended that updates are scheduled when the device
           is in idle maintenance mode, such as overnight, and on Wi-Fi.
           However, your client can use any heuristics you want.
         </li>
         <li>
           Check in with your OTA package servers and determine whether an
           update is available. This should be mostly the same as your existing
           client code, except that you will want to signal that the device
           supports A/B. (Google's client also includes a
           <strong>Check now</strong> button for users to check for the latest
           update.)
         </li>
         <li>
           Call <code>update_engine</code> with the HTTPS URL for your update
           package, assuming one is available. <code>update_engine</code> will
           update the raw blocks on the currently unused partition as it streams
           the update package.
         </li>
         <li>
           Report installation successes or failures to your servers, based on
           the <code>update_engine</code> result code. If the update is applied
           successfully, <code>update_engine</code> will tell the bootloader to
           boot into the new OS on the next reboot. The bootloader will fall back
           to the old OS if the new OS fails to boot, so no work is required
           from the client. If the update fails, the client needs to decide when
           (and whether) to try again, based on the detailed error code. For
           example, a good client could recognize that a partial ("diff") OTA
           package fails and try a full OTA package instead.
         </li>
       </ul>
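       <p>
         The retry decision in the last item can be sketched as follows. This
         is a minimal illustration, not the Google client or the
         <code>update_engine</code> API: the <code>Result</code> codes, the
         <code>Offer</code> type, and the <code>apply_payload</code> callback
         are hypothetical names introduced here, and the real error codes
         differ.
       </p>

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Result(Enum):
    # Hypothetical result codes; the real update_engine error codes differ.
    SUCCESS = "success"
    PAYLOAD_MISMATCH = "payload_mismatch"  # e.g. a diff built for another build
    NETWORK_ERROR = "network_error"


@dataclass
class Offer:
    diff_url: Optional[str]  # partial ("diff") package, if the server has one
    full_url: str            # full package


def run_update(offer: Offer, apply_payload: Callable[[str], Result]) -> list:
    """Try the diff package first, then fall back to the full package.

    Returns the list of (url, result) attempts so the client can report
    them to its servers.
    """
    attempts = []
    urls = [u for u in (offer.diff_url, offer.full_url) if u]
    for url in urls:
        result = apply_payload(url)
        attempts.append((url, result))
        if result is Result.SUCCESS:
            break
        if result is Result.NETWORK_ERROR:
            # Transient failure: stop now and let the scheduler retry later.
            break
    return attempts
```

       <p>
         A failed attempt never touches the running slot, so the client is
         free to pick whatever retry policy suits it; the one above is only
         one possibility.
       </p>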

       <p>Optionally, the client can:</p>

       <ul>
         <li>
           Show a notification asking the user to reboot. If you want to
           implement a policy where the user is encouraged to routinely update,
           then this notification can be added to your client. If the client
           does not prompt users, then users will get the update next time they
           reboot anyway. (Google's client has a per-update configurable delay.)
         </li>
         <li>
           Show a notification telling users whether they booted into a new
           OS version or whether they were expected to do so but fell back to
           the old OS version. (Google's client typically does neither.)
         </li>
       </ul>

       <p>On the system side, A/B system updates affect the following:</p>

       <ul>
         <li>
           Partition selection (slots), the <code>update_engine</code> daemon,
           and bootloader interactions (described below)
         </li>
         <li>
           Build process and OTA update package generation (described in
           <a href="/devices/tech/ota/ab/ab_implement.html">Implementing A/B
           Updates</a>)
         </li>
       </ul>

       <aside class="note">
         <strong>Note:</strong> A/B system updates implemented through OTA are
         recommended for new devices only.
       </aside>

       <h3 id="slots">Partition selection (slots)</h3>

         <p>
           A/B system updates use two sets of partitions referred to as
           <em>slots</em> (normally slot A and slot B). The system runs from
           the <em>current</em> slot while the partitions in the <em>unused</em>
           slot are not accessed by the running system during normal operation.
           This approach makes updates fault resistant by keeping the unused
           slot as a fallback: If an error occurs during or immediately after
           an update, the system can roll back to the old slot and continue to
           have a working system. To achieve this goal, no partition used by
           the <em>current</em> slot should be updated as part of the OTA
           update (including partitions for which there is only one copy).
         </p>

         <p>
           Each slot has a <em>bootable</em> attribute that states whether the
           slot contains a correct system from which the device can boot. The
           current slot is bootable when the system is running, but the other
           slot may have an old (still correct) version of the system, a newer
           version, or invalid data. Regardless of what the <em>current</em>
           slot is, exactly one slot is the <em>active</em> slot (the one the
           bootloader will boot from on the next boot), also called the
           <em>preferred</em> slot.
         </p>

         <p>
           Each slot also has a <em>successful</em> attribute set by the user
           space, which is relevant only if the slot is also bootable. A
           successful slot should be able to boot, run, and update itself. A
           bootable slot that was not marked as successful (after several
           attempts were made to boot from it) should be marked as unbootable
           by the bootloader, including changing the active slot to another
           bootable slot (normally to the slot running immediately before the
           attempt to boot into the new, active one). The specific details of
           the interface are defined in
           <code><a href="https://android.googlesource.com/platform/hardware/libhardware/+/master/include/hardware/boot_control.h" class="external-link">
           boot_control.h</a></code>.
         </p>
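         <p>
           The fallback rule described above can be modeled in a few lines.
           This is a sketch under stated assumptions rather than real
           bootloader code: the <code>tries_remaining</code> counter and the
           <code>choose_boot_slot</code> function are hypothetical, and actual
           bootloaders store and evaluate this metadata in their own way.
         </p>

```python
from dataclasses import dataclass


@dataclass
class Slot:
    name: str
    bootable: bool
    successful: bool
    tries_remaining: int = 0  # assumption: a simple retry counter


def choose_boot_slot(slots: dict, active: str):
    """Return (slot_to_boot, new_active), or (None, active) if nothing can boot."""
    slot = slots[active]
    if slot.bootable and (slot.successful or slot.tries_remaining > 0):
        if not slot.successful:
            slot.tries_remaining -= 1  # one more attempt at the new slot
        return active, active
    # The active slot used up its attempts without being marked successful.
    slot.bootable = False
    for name, other in slots.items():
        if name != active and other.bootable:
            return name, name  # fall back and make that slot active
    return None, active
```

         <p>
           In this model, a freshly updated slot gets a fixed number of boot
           attempts. If user space never marks it successful, the slot becomes
           unbootable and the previously running slot becomes active again.
         </p>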

       <h3 id="update-engine">Update engine daemon</h3>

         <p>
           A/B system updates use a background daemon called
           <code>update_engine</code> to prepare the system to boot into a new,
           updated version. This daemon can perform the following actions:
         </p>

         <ul>
           <li>
             Read from the current slot A/B partitions and write any data to
             the unused slot A/B partitions as instructed by the OTA package.
           </li>
           <li>
             Call the <code>boot_control</code> interface in a pre-defined
             workflow.
           </li>
           <li>
             Run a <em>post-install</em> program from the <em>new</em>
             partition after writing all the unused slot partitions, as
             instructed by the OTA package. (For details, see
             <a href="#post-installation">Post-installation</a>.)
           </li>
         </ul>

         <p>
           As the <code>update_engine</code> daemon is not involved in the boot
           process itself, it is limited in what it can do during an update by
           the <a href="/security/selinux/">SELinux</a> policies and features
           in the <em>current</em> slot (such policies and features can't be
           updated until the system boots into a new version). To maintain a
           robust system, the update process <strong>should not</strong> modify
           the partition table, the contents of partitions in the current slot,
           or the contents of non-A/B partitions that can't be wiped with a
           factory reset.
         </p>

         <h4 id="update_engine_source">Update engine source</h4>

             <p>
               The <code>update_engine</code> source is located in
               <code><a href="https://android.googlesource.com/platform/system/update_engine/" class="external">system/update_engine</a></code>.
               The A/B OTA dexopt files are split between <code>installd</code> and
               a package manager:
             </p>

             <ul>
               <li>
                 <code><a href="https://android.googlesource.com/platform/frameworks/native/+/master/cmds/installd/" class="external-link">frameworks/native/cmds/installd/</a></code>ota*
                 includes the postinstall script, the binary for chroot, the
                 installd clone that calls dex2oat, the post-OTA move-artifacts
                 script, and the rc file for the move script.
               </li>
               <li>
                 <code><a href="https://android.googlesource.com/platform/frameworks/base/+/master/services/core/java/com/android/server/pm/OtaDexoptService.java" class="external-link">frameworks/base/services/core/java/com/android/server/pm/OtaDexoptService.java</a></code>
                 (plus <code><a href="https://android.googlesource.com/platform/frameworks/base/+/master/services/core/java/com/android/server/pm/OtaDexoptShellCommand.java" class="external-link">OtaDexoptShellCommand</a></code>)
                 is the package manager that prepares dex2oat commands for
                 applications.
               </li>
             </ul>

             <p>
               For a working example, refer to <code><a href="https://android.googlesource.com/device/google/marlin/+/nougat-dr1-release/device-common.mk" class="external-link">/device/google/marlin/device-common.mk</a></code>.
             </p>

         <h4 id="update_engine_logs">Update engine logs</h4>

           <p>
           For Android 8.x releases and earlier, the <code>update_engine</code>
           logs can be found in <code>logcat</code> and in the bug report. To
           make the <code>update_engine</code> logs available in the file system,
           patch the following changes into your build:
           </p>

           <ul>
             <li><a
                 href="https://android-review.googlesource.com/c/platform/system/update_engine/+/486618">
                 Change 486618</a></li>
             <li><a
                 href="https://android-review.googlesource.com/c/platform/system/core/+/529080">
                 Change 529080</a></li>
             <li><a
                 href="https://android-review.googlesource.com/c/platform/system/update_engine/+/529081">
                 Change 529081</a></li>
             <li><a
                 href="https://android-review.googlesource.com/c/platform/system/sepolicy/+/534660">
                 Change 534660</a></li>
             <li><a
                 href="https://android-review.googlesource.com/c/platform/system/update_engine/+/594637">
                 Change 594637</a></li>
           </ul>

           <p>These changes save a copy of the most recent
           <code>update_engine</code> log to
           <code>/data/misc/update_engine_log/update_engine.<var>YEAR</var>-<var>TIME</var></code>.
           In addition to the current log, the five most recent logs are saved
           under <code>/data/misc/update_engine_log/</code>. Users
           with the <strong>log</strong> group ID will be able to access the
           file system logs.</p>

       <h3 id="bootloader-interactions">Bootloader interactions</h3>

         <p>
           The <code>boot_control</code> HAL is used by
           <code>update_engine</code> (and possibly other daemons) to instruct
           the bootloader what to boot from. Common example scenarios and their
           associated states include the following:
         </p>

         <ul>
           <li>
             <strong>Normal case</strong>: The system is running from its
             current slot, either slot A or B. No updates have been applied so
             far. The system's current slot is bootable, successful, and the
             active slot.
           </li>
           <li>
             <strong>Update in progress</strong>: The system is running from
             slot B, so slot B is the bootable, successful, and active slot.
             Slot A was marked as unbootable since the contents of slot A are
             being updated but not yet completed. A reboot in this state should
             continue booting from slot B.
           </li>
           <li>
             <strong>Update applied, reboot pending</strong>: The system is
             running from slot B, slot B is bootable and successful, but slot A
             was marked as active (and therefore is marked as bootable). Slot A
             is not yet marked as successful and some number of attempts to
             boot from slot A should be made by the bootloader.
           </li>
           <li>
             <strong>System rebooted into new update</strong>: The system is
             running from slot A for the first time, slot B is still bootable
             and successful while slot A is only bootable, and still active but
             not successful. A user space daemon, <code>update_verifier</code>,
             should mark slot A as successful after some checks are made.
           </li>
         </ul>
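         <p>
           The four scenarios can be traced with a toy in-memory model. The
           class below is a hypothetical stand-in, not the
           <code>boot_control</code> HAL itself; its method names only mirror
           the HAL calls, and it assumes the other slot starts out unbootable.
         </p>

```python
class BootControlModel:
    """Toy in-memory stand-in for the boot_control HAL (not the real interface)."""

    def __init__(self, current: str):
        self.current = current
        self.active = current
        # Assumption: the other slot starts out with invalid data.
        self.bootable = {"a": False, "b": False}
        self.successful = {"a": False, "b": False}
        self.bootable[current] = True
        self.successful[current] = True

    def set_slot_as_unbootable(self, slot: str) -> None:
        self.bootable[slot] = False
        self.successful[slot] = False

    def set_active_boot_slot(self, slot: str) -> None:
        self.active = slot
        self.bootable[slot] = True     # active implies bootable
        self.successful[slot] = False  # must prove itself after the reboot

    def mark_boot_successful(self) -> None:
        self.successful[self.current] = True

    def state(self, slot: str):
        """Return (bootable, successful, active) for a slot."""
        return (self.bootable[slot], self.successful[slot], self.active == slot)


bc = BootControlModel(current="b")   # normal case, running from slot B
bc.set_slot_as_unbootable("a")       # update in progress
bc.set_active_boot_slot("a")         # update applied, reboot pending
bc.current = "a"                     # system rebooted into the new update
bc.mark_boot_successful()            # done by update_verifier after its checks
```

         <p>
           Slot B stays bootable and successful throughout the trace, which is
           what allows the bootloader to return to it if slot A never reaches
           the final step.
         </p>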

       <h3 id="streaming-updates">Streaming update support</h3>

         <p>
           User devices don't always have enough space on <code>/data</code> to
           download the update package. As neither OEMs nor users want to waste
           space on a <code>/cache</code> partition, some users go without
           updates because the device has nowhere to store the update package.
           To address this issue, Android 8.0 added support for streaming A/B
           updates that write blocks directly to the B partition as they are
           downloaded, without having to store the blocks on <code>/data</code>.
           Streaming A/B updates need almost no temporary storage and require
           just enough storage for roughly 100 KiB of metadata.
         </p>

         <p>To enable streaming updates in Android 7.1, cherry-pick the following
         patches:</p>

         <ul>
           <li>
             <a href="https://android-review.googlesource.com/333624" class="external">
             Allow to cancel a proxy resolution request</a>
           </li>
           <li>
             <a href="https://android-review.googlesource.com/333625" class="external">
             Fix terminating a transfer while resolving proxies</a>
           </li>
           <li>
             <a href="https://android-review.googlesource.com/333626" class="external">
             Add unit test for TerminateTransfer between ranges</a>
           </li>
           <li>
             <a href="https://android-review.googlesource.com/333627" class="external">
             Cleanup the RetryTimeoutCallback()</a>
           </li>
         </ul>

         <p>
           These patches are required to support streaming A/B updates in
           Android 7.1 and later whether using
           <a href="https://www.android.com/gms/">Google Mobile Services
           (GMS)</a> or any other update client.
         </p>

     <h2 id="life-of-an-a-b-update">Life of an A/B update</h2>

       <p>
         The update process starts when an OTA package (referred to in code as a
         <em>payload</em>) is available for downloading. Policies in the device
         may defer the payload download and application based on battery level,
         user activity, charging status, or other policies. In addition,
         because the update runs in the background, users might not know an
         update is in progress. All of this means the update process might be
         interrupted at any point due to policies, unexpected reboots, or user
         actions.
       </p>

       <p>
         Optionally, metadata in the OTA package itself indicates the update
         can be streamed; the same package can also be used for non-streaming
         installation. The server may use the metadata to tell the client it's
         streaming so the client will hand off the OTA to
         <code>update_engine</code> correctly. Device manufacturers with their
         own server and client can enable streaming updates by ensuring the
         server identifies the update is streaming (or assumes all updates are
         streaming) and the client makes the correct call to
         <code>update_engine</code> for streaming. Manufacturers can use the
         fact that the package is of the streaming variant to send a flag to
         the client to trigger hand off to the framework side as streaming.
       </p>

       <p>After a payload is available, the update process is as follows:</p>

       <table>
         <tr>
           <th>Step</th>
           <th>Activities</th>
         </tr>
         <tr>
           <td>1</td>
           <td>The current slot (or "source slot") is marked as successful (if
             not already marked) with <code>markBootSuccessful()</code>.</td>
         </tr>
         <tr>
           <td>2</td>
           <td>
             The unused slot (or "target slot") is marked as unbootable by
             calling the function <code>setSlotAsUnbootable()</code>. The
             current slot is always marked as successful at the beginning of
             the update to prevent the bootloader from falling back to the
             unused slot, which will soon have invalid data. If the system has
             reached the point where it can start applying an update, the
             current slot is marked as successful even if other major
             components are broken (such as the UI in a crash loop) as it is
             possible to push new software to fix these problems.
             <br /><br />
             The update payload is an opaque blob with the instructions to
             update to the new version. The update payload consists of the
             following:
             <ul>
               <li>
                 <em>Metadata</em>. A relatively small portion of the update
                 payload, the metadata contains a list of operations to produce
                 and verify the new version on the target slot. For example, an
                 operation could decompress a certain blob and write it to
                 specific blocks in a target partition, or read from a source
                 partition, apply a binary patch, and write to certain blocks
                 in a target partition.
               </li>
               <li>
                 <em>Extra data</em>. As the bulk of the update payload, the
                 extra data associated with the operations consists of the
                 compressed blob or binary patch in these examples.
               </li>
             </ul>
           </td>
         </tr>
         <tr>
           <td>3</td>
           <td>The payload metadata is downloaded.</td>
         </tr>
         <tr>
           <td>4</td>
           <td>
             For each operation defined in the metadata, in order, the
             associated data (if any) is downloaded to memory, the operation is
             applied, and the associated memory is discarded.
           </td>
         </tr>
         <tr>
           <td>5</td>
           <td>
             The whole partitions are re-read and verified against the expected
             hash.
           </td>
         </tr>
         <tr>
           <td>6</td>
           <td>
             The post-install step (if any) is run. In the case of an error
             during the execution of any step, the update fails and is
             re-attempted with possibly a different payload. If all the steps
             so far have succeeded, the update succeeds and the last step is
             executed.
           </td>
         </tr>
         <tr>
           <td>7</td>
           <td>
             The <em>unused slot</em> is marked as active by calling
             <code>setActiveBootSlot()</code>. Marking the unused slot as
             active doesn't mean it will finish booting. The bootloader (or
             system itself) can switch the active slot back if it doesn't read
             a successful state.
           </td>
         </tr>
         <tr>
           <td>8</td>
           <td>
             Post-installation (described below) involves running a program
             from the "new update" version while still running in the old
             version. If defined in the OTA package, this step is
             <strong>mandatory</strong> and the program must return with exit
             code <code>0</code>; otherwise, the update fails.
           </td>
         </tr>
         <tr>
           <td>9</td>
           <td>
             After the system successfully boots far enough into the new slot
             and finishes the post-reboot checks, the now current slot
             (formerly the "target slot") is marked as successful by calling
             <code>markBootSuccessful()</code>.
           </td>
         </tr>
       </table>

       <aside class="note">
         <strong>Note:</strong> Steps 3 and 4 take most of the update time as
         they involve writing and downloading large amounts of data, and are
         likely to be interrupted for reasons of policy or reboot.
       </aside>
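       <p>
         Steps 4 and 5 can be illustrated with a small model that applies a
         list of operations and then verifies the result. This is a sketch
         only: the operation names, the <code>Operation</code> fields, the
         4096-byte block size, and the use of zlib and SHA-256 are assumptions
         made for the example and do not describe the real payload format.
       </p>

```python
import hashlib
import zlib
from dataclasses import dataclass

BLOCK = 4096  # assumed block size for this sketch


@dataclass
class Operation:
    kind: str             # "REPLACE" or "SOURCE_COPY" (illustrative names)
    dst_block: int        # first block to write in the target partition
    num_blocks: int
    src_block: int = 0    # SOURCE_COPY only: first block to read from source
    data_offset: int = 0  # REPLACE only: where the blob starts in the extra data
    data_length: int = 0


def apply_payload(ops, fetch, source: bytes, target: bytearray,
                  expected_sha256: str) -> bool:
    """Apply each operation in order, then verify the whole target."""
    for op in ops:
        start = op.dst_block * BLOCK
        size = op.num_blocks * BLOCK
        if op.kind == "REPLACE":
            blob = fetch(op.data_offset, op.data_length)  # held in memory only
            data = zlib.decompress(blob)
            del blob  # discard the download before the next operation
        elif op.kind == "SOURCE_COPY":
            s = op.src_block * BLOCK
            data = source[s:s + size]
        else:
            raise ValueError(f"unknown operation {op.kind}")
        if len(data) != size:
            raise ValueError("operation data does not match the block range")
        target[start:start + size] = data
    # Step 5: re-read the whole partition and compare against the expected hash.
    return hashlib.sha256(target).hexdigest() == expected_sha256
```

       <p>
         Because only one operation's data is in memory at a time, the model
         needs no temporary storage for the package, which is the property
         that streaming updates rely on.
       </p>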

       <h3 id="post-installation">Post-installation</h3>

         <p>
           For every partition where a post-install step is defined,
           <code>update_engine</code> mounts the new partition into a specific
           location and executes the program specified in the OTA relative to
           the mounted partition. For example, if the post-install program is
           defined as <code>usr/bin/postinstall</code> in the system partition,
           this partition from the unused slot will be mounted in a fixed
           location (such as <code>/postinstall_mount</code>) and the
           <code>/postinstall_mount/usr/bin/postinstall</code> command is
           executed.
         </p>
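         <p>
           The path resolution in this example amounts to joining the mount
           point and the program path. The helper below is a hypothetical
           sketch; the two validity checks are assumptions added for the
           example and are not documented <code>update_engine</code> behavior.
         </p>

```python
import posixpath


def postinstall_command(mount_point: str, program: str) -> str:
    """Resolve the post-install program relative to the mounted new partition."""
    if posixpath.isabs(program):
        raise ValueError("post-install program must be relative to the partition root")
    resolved = posixpath.normpath(posixpath.join(mount_point, program))
    # Refuse paths such as "../bin/sh" that would escape the mount point.
    if not resolved.startswith(mount_point.rstrip("/") + "/"):
        raise ValueError("post-install program escapes the mounted partition")
    return resolved


command = postinstall_command("/postinstall_mount", "usr/bin/postinstall")
```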

         <p>
           For post-installation to succeed, the old kernel must be able to:
         </p>

         <ul>
           <li>
             <strong>Mount the new filesystem format</strong>. The filesystem
             type cannot change unless there's support for it in the old
             kernel, including details such as the compression algorithm used
             if using a compressed filesystem (such as SquashFS).
           </li>
           <li>
             <strong>Understand the new partition's post-install program format</strong>.
             If using an Executable and Linkable Format (ELF) binary, it should
             be compatible with the old kernel (for example, a new 64-bit
             program is a problem on an old 32-bit kernel if the architecture
             switched from 32- to 64-bit builds). Unless the loader
             (<code>ld</code>) is instructed to use other paths or the binary
             is built statically, libraries will be loaded from the old system
             image and not the new one.
           </li>
         </ul>

         <p>
           For example, you could use a shell script as a post-install program
           (interpreted by the old system's shell binary with a <code>#!</code>
           marker at the top), then set up library paths from the new
           environment for executing a more complex binary post-install
           program. Alternatively, you could run the post-install step from a
           dedicated smaller partition to enable the filesystem format in the
           main system partition to be updated without incurring backward
           compatibility issues or stepping-stone updates; this would allow
           users to update directly to the latest version from a factory image.
         </p>

         <p>
           The new post-install program is limited by the SELinux policies
           defined in the old system. As such, the post-install step is
           suitable for performing tasks required by design on a given device
           or other best-effort tasks (such as updating the A/B-capable
           firmware or bootloader, or preparing copies of databases for the
           new version). The post-install step is <strong>not suitable</strong> for
           one-off bug fixes before reboot that require unforeseen permissions.
         </p>

         <p>
           The selected post-install program runs in the
           <code>postinstall</code> SELinux context. All the files in the new
           mounted partition will be tagged with <code>postinstall_file</code>,
           regardless of what their attributes are after rebooting into that
           new system. Changes to the SELinux attributes in the new system
           won't impact the post-install step. If the post-install program
           needs extra permissions, those must be added to the post-install
           context.
         </p>

       <h3 id="after_reboot">After reboot</h3>

         <p>
           After rebooting, <code>update_verifier</code> triggers the integrity
           check using dm-verity. This check starts before zygote to avoid Java
           services making any irreversible changes that would prevent a safe
           rollback. During this process, the bootloader and kernel may also
           trigger a reboot if Verified Boot or dm-verity detects any
           corruption. After the check completes, <code>update_verifier</code>
           marks the boot successful.
         </p>

         <p>
           <code>update_verifier</code> will read only the blocks listed in
           <code>/data/ota_package/care_map.txt</code>, which is included in an
           A/B OTA package when using the AOSP code. The Java system update
           client, such as GmsCore, extracts <code>care_map.txt</code>, sets up
           the access permission before rebooting the device, and deletes the
           extracted file after the system successfully boots into the new
           version.
         </p>

   </body>
 </html>