Bug: 195428035

Basic Definitions for Patching

Binary: Executable image and data. Binaries may persist in an archive (e.g., chrome.7z), and need to be periodically updated. Formats for binaries include {PE files EXE / DLL, ELF, DEX}. Architectures for binaries include {x86, x64, ARM, AArch64, Dalvik}. A binary is also referred to as an executable or an image file.

Patching: Sending a “new” file to clients who have an “old” file by computing and transmitting a “patch” that can be used to transform “old” into “new”. Patches are compressed for transmission. A key performance metric is patch size, which refers to the size of the compressed patch file. For our experiments we use 7z.

Patch generation: Computation of a “patch” from “old” and “new”. This can be expensive (e.g., ~15-20 min for Chrome, using 1 GB of RAM), but since patch generation is a run-once step on the server-side when releasing “new” binaries, the expense is not too critical.

Patch application: Transformation from “old” binaries to “new”, using a (downloaded) “patch”. This is executed on the client side on updates, so resource constraints (e.g., time, RAM, disk space) are more stringent. Fault tolerance is also important. An update system usually achieves this with a fallback method of directly downloading “new” in case of patching failure.

Offset: Position relative to the start of a file.

Local offset: An offset relative to the start of a region of a file.

Element: A region in a file with associated executable type, represented by the tuple (exe_type, offset, length). Every Element in new file is associated with an Element in old file and patched independently.

Reference: A directed connection between two offsets in a binary. For example, consider jump instructions in x86:

00401000: E9 3D 00 00 00     jmp         00401042

Here, the 4 bytes [3D 00 00 00] starting at address 00401001 point to address 00401042 in memory. This forms a reference from offset(00401001) (length 4) to offset(00401042), where offset(addr) indicates the disk offset corresponding to addr. A reference has a location, length (implicitly determined by reference type), body, and target.

Location: The starting offset of bytes that store a reference. In the preceding example, offset(00401001) is a location. Each location is the beginning of a reference body.

Body: The span of bytes that encodes reference data, i.e., [location, location + length) = [location, location + 1, ..., location + length - 1]. In the preceding example, length = 4, so the reference body is [00401001, 00401001 + 4) = [00401001, 00401002, 00401003, 00401004]. Reference bodies in an image must not overlap, and region boundaries are often required to not straddle a reference body.

Target: The offset that's the destination of a reference. In the preceding example, offset(00401042) is the target. Different references can share common targets. For example, in

00401000: E9 3D 00 00 00     jmp         00401042
00401005: EB 3B              jmp         00401042

we have two references with different locations and bodies, but same target of 00401042.
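
As a minimal sketch of how the two jumps above resolve to one target, the following decodes the displacement stored in each reference body. The function name `rel_jump_target` is hypothetical, and only the two opcodes shown in the example are handled.

```python
def rel_jump_target(address: int, code: bytes) -> int:
    """Return the target address of an x86 relative jmp located at `address`."""
    opcode = code[0]
    if opcode == 0xE9:      # jmp rel32: 1 opcode byte + 4-byte displacement
        disp_size = 4
    elif opcode == 0xEB:    # jmp rel8: 1 opcode byte + 1-byte displacement
        disp_size = 1
    else:
        raise ValueError(f"unsupported opcode {opcode:#04x}")
    # The displacement is a signed little-endian integer, measured from the
    # address of the next instruction.
    disp = int.from_bytes(code[1:1 + disp_size], "little", signed=True)
    return address + 1 + disp_size + disp


near = rel_jump_target(0x00401000, bytes.fromhex("E93D000000"))
short = rel_jump_target(0x00401005, bytes.fromhex("EB3B"))
print(hex(near), hex(short))  # 0x401042 0x401042
```

The two bodies hold different bytes (3D 00 00 00 versus 3B) even though the target is identical, which illustrates why the encoded bytes depend on the location as well as the target.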

Because the bytes that encode a reference depend on its target, and potentially on its location, they are more likely to get modified from an old version of a binary to a newer version. This is why “naive” patching does not do well on binaries.

Target Key: An alternative representation of a Target for a fixed pool, as its index in the sorted list of Target offsets. Keys are useful since:

  • Their numerical indices are smaller than offsets, allowing more efficient storage of target correction data in the patch.
  • They simplify association from Targets to Labels.
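
A minimal sketch of the mapping between targets and keys, assuming a pool is held as a sorted list of unique offsets. The helper names `target_to_key` and `key_to_target` are hypothetical and used only for illustration.

```python
from bisect import bisect_left


def target_to_key(sorted_targets: list[int], target: int) -> int:
    """Return the index of `target` in the pool's sorted target list."""
    key = bisect_left(sorted_targets, target)
    if key == len(sorted_targets) or sorted_targets[key] != target:
        raise KeyError(f"{target:#x} is not a target in this pool")
    return key


def key_to_target(sorted_targets: list[int], key: int) -> int:
    """Return the target offset stored under `key`."""
    return sorted_targets[key]


pool = sorted({0x5555, 0x1111, 0x7777, 0x3333})
key = target_to_key(pool, 0x5555)
print(key, hex(key_to_target(pool, key)))  # 2 0x5555
```

Keys range over [0, number of targets), so they typically need fewer bits than the offsets they stand for.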

Disassembler: Architecture specific data and operations, used to extract and correct references in a binary.

Type of reference: The type of a reference determines the binary representation used to encode its target. This affects how references are parsed and written by a disassembler. There can be many types of references in the same binary.

A reference is represented by the tuple (disassembler, location, target, type). This tuple contains sufficient information to write the reference in a binary.

Pool of targets: Collection of targets that is assumed to have some semantic relationship. Each reference type belongs to exactly one reference pool. Targets for references in the same pool are shared.

For example, the following describes two pools defined for Dalvik Executable format (DEX). Both pools spawn multiple types of references.

  1. Index in string table.
     • From bytecode to string index using 16 bits.
     • From bytecode to string index using 32 bits.
     • From field item to string index using 32 bits.
  2. Address in code.
     • Relative 16 bits pointer.
     • Relative 32 bits pointer.

Boundaries between different pools can be ambiguous. Having all targets belong to the same pool can reduce redundancy, but will use more memory and might cause larger corrections to happen, so this is a trade-off that can be resolved with benchmarks.

Abs32 references: References whose targets are adjusted by the OS during program load. In an image, a relocation table typically provides locations of abs32 references. At each abs32 location, the stored bytes then encode semantic information about the target (e.g., as RVA).

Rel32 references: References embedded within machine code, in which targets are encoded as some delta relative to the reference's location. Typical examples of rel32 references are branching instructions and instruction pointer-relative memory access.

Equivalence: A (src_offset, dst_offset, length) tuple describing a region of “old” binary, at an offset of |src_offset|, that is similar to a region of “new” binary, at an offset of |dst_offset|.

Raw delta unit: Describes a raw modification to apply on the new image, as a pair (copy_offset, diff), where copy_offset describes the position in new file as an offset in the data that was copied from the old file, and diff is the bytewise difference to apply.
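
The following is a rough sketch of how equivalences and raw delta units could combine during patch application. It rests on assumptions that the definitions here do not spell out: copy_offset is taken to count bytes across all copied regions in list order, diff is taken to be added to the copied byte modulo 256, and bytes not covered by any equivalence are simply left as zero. The function name `apply_units` is hypothetical.

```python
def apply_units(old: bytes, new_size: int,
                equivalences: list[tuple[int, int, int]],
                raw_deltas: list[tuple[int, int]]) -> bytearray:
    """Copy equivalent regions from old to new, then apply bytewise diffs."""
    new = bytearray(new_size)
    copied_positions = []  # position in new of each copied byte, in copy order
    for src_offset, dst_offset, length in equivalences:
        new[dst_offset:dst_offset + length] = old[src_offset:src_offset + length]
        copied_positions.extend(range(dst_offset, dst_offset + length))
    for copy_offset, diff in raw_deltas:
        # Assumption: copy_offset indexes into the stream of copied bytes.
        pos = copied_positions[copy_offset]
        new[pos] = (new[pos] + diff) % 256
    return new


old = b"ABCDEFGH"
patched = apply_units(old, 7, [(0, 0, 3), (5, 4, 3)], [(4, 1)])
print(bytes(patched))  # b'ABC\x00FHH'
```

Addressing deltas by copy offset rather than by absolute position keeps the stored numbers small when most of the new file is covered by equivalences.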

Associated Targets: A target in “old” binary is associated with a target in “new” binary if both targets:

  1. are part of similar regions from the same equivalence, and
  2. have the same local offset (relative to respective start regions), and
  3. are not part of any larger region from a different equivalence.

Not all targets are necessarily associated with another target.

Target Affinity: Level of confidence in the association between two targets. The affinity between targets that are potentially associated is measured based on surrounding content, as well as reference type.

Label: An integer assigned for each Target in “old” and “new” binary as part of generating a patch, and used to alias targets when searching for similar regions that will form equivalences. Labels are assigned such that associated targets in old and new binaries share the same Label. Unmatched Targets have a Label of 0. For example, given

  • “Old” targets = [0x1111, 0x3333, 0x5555, 0x7777],
  • “New” targets = [0x2222, 0x4444, 0x6666, 0x8888],

to represent matchings 0x1111 <=> 0x6666, 0x3333 <=> 0x2222, we'd assign

  • Label 1 to 0x1111 (in “old”) and 0x6666 (in “new”),
  • Label 2 to 0x3333 (in “old”) and 0x2222 (in “new”).

Represented as arrays indexed over Target Keys, we'd have:

  • “Old” labels = [1, 2, 0, 0],
  • “New” labels = [2, 0, 1, 0].
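
The label arrays in this example can be reproduced with a short sketch. The function `assign_labels` is hypothetical, and it assumes the matchings are already known; how matchings are found is not covered here.

```python
def assign_labels(old_targets: list[int], new_targets: list[int],
                  matchings: list[tuple[int, int]]) -> tuple[list[int], list[int]]:
    """Build label arrays indexed by Target Key; unmatched targets keep label 0."""
    old_key = {target: key for key, target in enumerate(sorted(old_targets))}
    new_key = {target: key for key, target in enumerate(sorted(new_targets))}
    old_labels = [0] * len(old_key)
    new_labels = [0] * len(new_key)
    # Labels start at 1 because 0 is reserved for unmatched targets.
    for label, (old_target, new_target) in enumerate(matchings, start=1):
        old_labels[old_key[old_target]] = label
        new_labels[new_key[new_target]] = label
    return old_labels, new_labels


old_labels, new_labels = assign_labels(
    [0x1111, 0x3333, 0x5555, 0x7777],
    [0x2222, 0x4444, 0x6666, 0x8888],
    [(0x1111, 0x6666), (0x3333, 0x2222)],
)
print(old_labels, new_labels)  # [1, 2, 0, 0] [2, 0, 1, 0]
```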

Encoded Image: The result of projecting the content of an image to scalar values that describe content on a higher level of abstraction, masking away undesirable noise in raw content. Notably, the projection encodes references based on their associated label.

Interfaces

zucchini_lib: Core Zucchini library that operates on buffers to generate and apply patches.

zucchini_io: Wrapper on zucchini_lib that handles file I/O, using memory-mapped I/O to interface with zucchini_lib.

zucchini: Stand-alone executable that parses command-line arguments, and passes the results to zucchini_io. Also implements various helper flows.

Zucchini Ensemble Patch Format

Types

int8: 8-bit signed int.

uint32: 32-bit unsigned int, little-endian.

int32: 32-bit signed int, little-endian.

Varints: This is a generic variable-length encoding for integer quantities that strips away leading (most-significant) null bytes. The Varints format is borrowed from protocol-buffers, see documentation for more info.

varuint32: A uint32 encoded using Varints format.

varint32: An int32 encoded using Varints format.
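
As an illustration of the Varints idea for unsigned values, here is a minimal sketch of the base-128 encoding used by protocol buffers: seven payload bits per byte, least-significant group first, with the high bit marking continuation. The function names are hypothetical, and the signed varint32 mapping is deliberately left out because it is not specified here.

```python
def encode_varuint32(value: int) -> bytes:
    """Encode a uint32 as a protobuf-style base-128 varint."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in uint32")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more groups follow
        else:
            out.append(byte)         # final group, high bit clear
            return bytes(out)


def decode_varuint32(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode a varint starting at `pos`; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


encoded = encode_varuint32(300)
print(encoded.hex(), decode_varuint32(encoded))  # ac02 (300, 2)
```

Small values occupy a single byte, which suits delta-encoded offsets that are mostly small.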

File Layout

| Name | Format | Description |
|---|---|---|
| header | PatchHeader | The header. |
| elements_count | uint32 | Number of patch units. |
| elements | PatchElement[elements_count] | List of all patch elements. |

Position of elements in new file is ascending.

Structures

PatchHeader

| Name | Format | Description |
|---|---|---|
| magic | uint32 = kMagic | Magic value. |
| major_version | uint16 | Major version number indicating breaking changes. |
| minor_version | uint16 | Minor version number indicating possibly breaking changes. |
| old_size | uint32 | Size of old file in bytes. |
| old_crc | uint32 | CRC32 of old file. |
| new_size | uint32 | Size of new file in bytes. |
| new_crc | uint32 | CRC32 of new file. |

kMagic == 'Z' | ('u' << 8) | ('c' << 16) | ('c' << 24)
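
A minimal sketch of serializing a PatchHeader, under several assumptions that are not stated above: the fields are packed in table order with no padding, the uint16 fields are little-endian like uint32, and the CRC32 is the standard one computed by `zlib.crc32`. The name `pack_header` and the default version 1.0 are illustrative only.

```python
import struct
import zlib

# kMagic as defined above; in little-endian byte order it reads b"Zucc".
K_MAGIC = ord("Z") | (ord("u") << 8) | (ord("c") << 16) | (ord("c") << 24)

# magic, major_version, minor_version, old_size, old_crc, new_size, new_crc
HEADER = struct.Struct("<IHHIIII")  # assumption: packed, little-endian


def pack_header(old: bytes, new: bytes, major: int = 1, minor: int = 0) -> bytes:
    """Serialize a PatchHeader describing the old and new files."""
    return HEADER.pack(K_MAGIC, major, minor,
                       len(old), zlib.crc32(old),
                       len(new), zlib.crc32(new))


header = pack_header(b"old contents", b"new contents!")
print(HEADER.size, header[:4])  # 24 b'Zucc'
```

Storing both sizes and checksums lets a client reject a patch that does not match its old file before doing any work, and verify the result afterwards.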

PatchElement

Contains all the information required to produce a single element in new file.

| Name | Format | Description |
|---|---|---|
| header | PatchElementHeader | The header. |
| equivalences | EquivalenceList | List of equivalences. |
| raw_deltas | RawDeltaList | List of raw deltas. |
| reference_deltas | ReferenceDeltaList | List of reference deltas. |
| pool_count | uint32 | Number of pools. |
| extra_targets | ExtraTargetList[pool_count] | Lists of extra targets. |

PatchElementHeader

Describes a correspondence between an element in old and in new files. Some redundancy arises from storing |new_offset|, but it is necessary to make PatchElement self-contained.

| Name | Format | Description |
|---|---|---|
| old_offset | uint32 | Starting offset of the element in old file. |
| old_length | uint32 | Length of the element in old file. |
| new_offset | uint32 | Starting offset of the element in new file. |
| new_length | uint32 | Length of the element in new file. |
| exe_type | uint32 | Executable type for this unit, see enum ExecutableType. |
| version | uint16 | Version specific to the executable type for this unit. |

EquivalenceList

Encodes a list of equivalences, where dst offsets (in new image) are ascending.

| Name | Format | Description |
|---|---|---|
| src_skip | Buffer | Src offset for each equivalence, delta encoded. |
| dst_skip | Buffer | Dst offset for each equivalence, delta encoded. |
| copy_count | Buffer | Length for each equivalence. |

RawDeltaList

Encodes a list of raw delta units, with ascending copy offsets.

| Name | Format | Description |
|---|---|---|
| raw_delta_skip | Buffer | Copy offset for each delta unit, delta encoded and biased by -1. |
| raw_delta_diff | Buffer | Bytewise difference for each delta unit. |

ReferenceDeltaList

Encodes a list of reference deltas, in the order they appear in the new image file. A reference delta is a signed integer representing a jump through a list of targets.

| Name | Format | Description |
|---|---|---|
| reference_delta | Buffer | Vector of reference deltas. |

ExtraTargetList

Encodes a list of additional targets in the new image file, in ascending order.

| Name | Format | Description |
|---|---|---|
| pool_tag | uint8_t | Unique identifier for this pool of targets. |
| extra_targets | Buffer | Additional targets, delta encoded and biased by -1. |

Buffer

A generic vector of data.

| Name | Format | Description |
|---|---|---|
| size | uint32 | Size of content in bytes. |
| content | T[] | List of integers. |

Format Changelog

All breaking changes to zucchini patch format will be documented in this section.

The format is based on Keep a Changelog.

[Unreleased]

[1.0] - 2021-10-27

Added

  • Major/Minor version is encoded in PatchHeader.
  • Disassembler version associated with an element version is encoded in PatchElementHeader.