|  | ======================================================= | 
|  | Hardware-assisted AddressSanitizer Design Documentation | 
|  | ======================================================= | 
|  |  | 
|  | This page is a design document for | 
|  | **hardware-assisted AddressSanitizer** (or **HWASAN**), | 
|  | a tool similar to :doc:`AddressSanitizer`, | 
|  | but based on partial hardware assistance. | 
|  |  | 
|  |  | 
|  | Introduction | 
|  | ============ | 
|  |  | 
|  | :doc:`AddressSanitizer` | 
|  | tags every 8 bytes of the application memory with a 1-byte tag (using *shadow memory*), | 
|  | uses *redzones* to find buffer overflows and | 
|  | *quarantine* to find use-after-free. | 
|  | The redzones, the quarantine, and, to a lesser extent, the shadow, are the | 
|  | sources of AddressSanitizer's memory overhead. | 
|  | See the `AddressSanitizer paper`_ for details. | 
|  |  | 
|  | AArch64 has `Address Tagging`_ (or top-byte-ignore, TBI), a hardware feature that allows | 
|  | software to use the 8 most significant bits of a 64-bit pointer as | 
|  | a tag. HWASAN uses `Address Tagging`_ | 
|  | to implement a memory safety tool, similar to :doc:`AddressSanitizer`, | 
|  | but with smaller memory overhead and slightly different (mostly better) | 
|  | accuracy guarantees. | 
|  |  | 
|  | Algorithm | 
|  | ========= | 
|  | * Every heap/stack/global memory object is forcibly aligned to `TG` bytes | 
|  | (`TG` is e.g. 16 or 64). We call `TG` the **tagging granularity**. | 
|  | * For every such object a random `TS`-bit tag `T` is chosen (`TS`, or tag size, is e.g. 4 or 8). | 
|  | * The pointer to the object is tagged with `T`. | 
|  | * The memory for the object is also tagged with `T` (using a `TG=>1` shadow memory, | 
|  | i.e. one shadow byte for every `TG` bytes of application memory). | 
|  | * Every load and store is instrumented to read the memory tag and compare it | 
|  | with the pointer tag; an exception is raised on a tag mismatch. | 
|  |  | 
|  | For a more detailed discussion of this approach see https://arxiv.org/pdf/1802.09517.pdf | 
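
The steps listed above can be modeled in a few lines of C. This is only a sketch under stated assumptions, not the HWASAN runtime: pointers are plain 64-bit integers into a toy address space, `TG` is 16, `TS` is 8, and the names `tag_object` and `access_ok` are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

#define TG 16u                      /* tagging granularity in bytes (assumed) */
#define TAG_SHIFT 56                /* tag lives in the top byte, so TS = 8 */
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)
#define MODEL_BYTES 4096u           /* size of the toy address space */

/* TG=>1 shadow: one tag byte for every TG bytes of the toy address space. */
static uint8_t shadow[MODEL_BYTES / TG];

/* Tag the memory of an object and return the tagged pointer to it.
   addr must be TG-aligned; the size is rounded up to whole granules. */
static uint64_t tag_object(uint64_t addr, size_t size, uint8_t tag) {
    for (uint64_t g = addr / TG; g < (addr + size + TG - 1) / TG; g++)
        shadow[g] = tag;
    return addr | ((uint64_t)tag << TAG_SHIFT);
}

/* The comparison placed before every load and store:
   returns 1 if the pointer tag equals the memory tag, 0 on a mismatch. */
static int access_ok(uint64_t tagged_ptr) {
    uint8_t ptr_tag = (uint8_t)(tagged_ptr >> TAG_SHIFT);
    uint8_t mem_tag = shadow[(tagged_ptr & ADDR_MASK) / TG];
    return ptr_tag == mem_tag;
}
```

In this model, two adjacent objects tagged with different values make an access through the first pointer into the second object's granules fail `access_ok`, which is where the real tool would raise the exception.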
|  |  | 
|  | Instrumentation | 
|  | =============== | 
|  |  | 
|  | Memory Accesses | 
|  | --------------- | 
|  | All memory accesses are prefixed with an inline instruction sequence that | 
|  | verifies the tags. Currently, the following sequence is used: | 
|  |  | 
|  |  | 
|  | .. code-block:: asm | 
|  |  | 
|  | // int foo(int *a) { return *a; } | 
|  | // clang -O2 --target=aarch64-linux -fsanitize=hwaddress -c load.c | 
|  | foo: | 
|  | 0:	08 dc 44 d3 	ubfx	x8, x0, #4, #52  // shadow address | 
|  | 4:	08 01 40 39 	ldrb	w8, [x8]         // load shadow | 
|  | 8:	09 fc 78 d3 	lsr	x9, x0, #56      // address tag | 
|  | c:	3f 01 08 6b 	cmp	w9, w8           // compare tags | 
|  | 10:	61 00 00 54 	b.ne	#12              // jump on mismatch | 
|  | 14:	00 00 40 b9 	ldr	w0, [x0]         // original load | 
|  | 18:	c0 03 5f d6 	ret | 
|  | 1c:	40 20 21 d4 	brk	#0x902           // trap | 
|  |  | 
|  |  | 
|  | Alternatively, memory accesses are prefixed with a function call. | 
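
For readers less familiar with AArch64 bitfield instructions, the arithmetic of the inline sequence can be written out in C. This is an illustrative sketch, not the runtime's actual check function: the name `hypothetical_check_access` is made up, and the shadow base is passed in explicitly, whereas the listing above loads directly from the computed address.

```c
#include <stdint.h>

/* ubfx x8, x0, #4, #52: bits [55:4] of the pointer, i.e. the address
   with the top-byte tag dropped, divided by 16. */
static uint64_t shadow_offset(uint64_t ptr) {
    return (ptr >> 4) & ((UINT64_C(1) << 52) - 1);
}

/* lsr x9, x0, #56: the tag stored in the top byte of the pointer. */
static uint8_t pointer_tag(uint64_t ptr) {
    return (uint8_t)(ptr >> 56);
}

/* Out-of-line form of the check (hypothetical name). `shadow` stands in
   for the shadow memory. Returns 1 on a match and 0 in the case where
   the inline sequence would branch to the trap. */
static int hypothetical_check_access(const uint8_t *shadow, uint64_t ptr) {
    return shadow[shadow_offset(ptr)] == pointer_tag(ptr);
}
```

The shift by 4 corresponds to a tagging granularity of 16 bytes, and the 52-bit field width is what strips the 8 tag bits.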
|  |  | 
|  | Heap | 
|  | ---- | 
|  |  | 
|  | Tagging the heap memory/pointers is done by `malloc`. | 
|  | This can be based on any malloc that forces all objects to be `TG`-aligned. | 
|  | `free` tags the memory with a different tag. | 
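
A toy bump allocator illustrates why retagging on `free` exposes use-after-free. Everything here is an assumption made for illustration: the names `toy_malloc` and `toy_free`, the counter that stands in for a random tag source, the choice to reserve tag 0, and the explicit size argument to `toy_free`.

```c
#include <stdint.h>
#include <stddef.h>

#define TG 16u
#define TAG_SHIFT 56
#define ADDR_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)
#define HEAP_BYTES 1024u

static uint8_t heap_shadow[HEAP_BYTES / TG];  /* one tag per granule */
static uint64_t heap_top;                     /* bump-allocator cursor */
static uint8_t next_tag = 1;                  /* stand-in for a random source */

static uint8_t fresh_tag(void) {
    if (next_tag == 0) next_tag = 1;          /* assumption: 0 means untagged */
    return next_tag++;
}

static void set_tags(uint64_t addr, size_t size, uint8_t tag) {
    for (uint64_t g = addr / TG; g < (addr + size) / TG; g++)
        heap_shadow[g] = tag;
}

/* Allocate a TG-aligned block, tag its memory, return a tagged pointer.
   Returns 0 when the toy heap is exhausted. */
static uint64_t toy_malloc(size_t size) {
    size_t rounded = (size + TG - 1) / TG * TG;
    if (rounded > HEAP_BYTES - heap_top) return 0;
    uint64_t addr = heap_top;
    uint8_t tag = fresh_tag();
    heap_top += rounded;
    set_tags(addr, rounded, tag);
    return addr | ((uint64_t)tag << TAG_SHIFT);
}

/* Retag the block so that stale pointers no longer match its memory. */
static void toy_free(uint64_t ptr, size_t size) {
    size_t rounded = (size + TG - 1) / TG * TG;
    set_tags(ptr & ADDR_MASK, rounded, fresh_tag());
}

static int tags_match(uint64_t ptr) {
    return heap_shadow[(ptr & ADDR_MASK) / TG] == (uint8_t)(ptr >> TAG_SHIFT);
}
```

After `toy_free(p, n)`, the old pointer `p` still carries its original tag while the memory carries a new one, so `tags_match(p)` fails without any quarantine.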
|  |  | 
|  | Stack | 
|  | ----- | 
|  |  | 
|  | Stack frames are instrumented by aligning all non-promotable allocas | 
|  | to `TG` and tagging stack memory in the function prologue and epilogue. | 
|  |  | 
|  | Tags for different allocas in one function are **not** generated | 
|  | independently; doing that in a function with `M` allocas would require | 
|  | maintaining `M` live stack pointers, significantly increasing register | 
|  | pressure. Instead we generate a single base tag value in the prologue, | 
|  | and build the tag for alloca number `N` as `ReTag(BaseTag, N)`, where | 
|  | ReTag can be as simple as exclusive-or with the constant `N`. | 
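
A minimal sketch of the exclusive-or variant of ReTag, assuming 8-bit tags. The function name `retag` and its signature are hypothetical.

```c
#include <stdint.h>

/* Hypothetical ReTag: derive the tag of one alloca from the per-frame
   base tag by XOR with the alloca's index, truncated to 8 bits. */
static uint8_t retag(uint8_t base_tag, unsigned index) {
    return (uint8_t)(base_tag ^ index);
}
```

Because XOR with a constant is a bijection, distinct indices below 256 yield distinct tags within one frame, and only the base tag has to stay live across the function.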
|  |  | 
|  | Stack instrumentation is expected to be a major source of overhead, | 
|  | but could be optional. | 
|  |  | 
|  | Globals | 
|  | ------- | 
|  |  | 
|  | TODO: details. | 
|  |  | 
|  | Error reporting | 
|  | --------------- | 
|  |  | 
|  | Errors are generated by the `BRK` instruction (see the trap at the end of the sequence under Memory Accesses) and are handled by a signal handler. | 
|  |  | 
|  | Attribute | 
|  | --------- | 
|  |  | 
|  | HWASAN uses its own LLVM IR Attribute `sanitize_hwaddress` and a matching | 
|  | C function attribute. An alternative would be to re-use ASAN's attribute | 
|  | `sanitize_address`. The reasons to use a separate attribute are: | 
|  |  | 
|  | * Users may need to disable ASAN but not HWASAN, or vice versa, | 
|  | because the tools have different trade-offs and compatibility issues. | 
|  | * LLVM (ideally) does not use flags to decide which pass is being used; | 
|  | ASAN or HWASAN is applied based on the function attributes. | 
|  |  | 
|  | This does mean that users of HWASAN may need to add the new attribute | 
|  | to the code that already uses the old attribute. | 
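
As an illustration of what adding the new attribute can look like in C source, the sketch below opts a function out of both tools. It assumes Clang's generic `no_sanitize` function attribute with the sanitizer names `"address"` and `"hwaddress"`; the function `read_unchecked` is a made-up example.

```c
/* Opt a function out of both tools. Each tool is named separately,
   so either line can be dropped without affecting the other. */
__attribute__((no_sanitize("address")))
__attribute__((no_sanitize("hwaddress")))
static int read_unchecked(const int *p) {
    return *p;  /* this load is not instrumented by either tool */
}
```

Code that previously carried only the ASAN line would need the second line added to get the same behavior under HWASAN.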
|  |  | 
|  |  | 
|  | Comparison with AddressSanitizer | 
|  | ================================ | 
|  |  | 
|  | HWASAN: | 
|  | * Is less portable than :doc:`AddressSanitizer` | 
|  | as it relies on hardware `Address Tagging`_ (AArch64). | 
|  | Address Tagging can be emulated with compiler instrumentation, | 
|  | but it will require the instrumentation to remove the tags before | 
|  | any load or store, which is infeasible in any realistic environment | 
|  | that contains non-instrumented code. | 
|  | * May have compatibility problems if the target code uses higher | 
|  | pointer bits for other purposes. | 
|  | * May require changes in the OS kernels (e.g. Linux seems to dislike | 
|  | tagged pointers passed from user space: | 
|  | https://www.kernel.org/doc/Documentation/arm64/tagged-pointers.txt). | 
|  | * **Does not require redzones to detect buffer overflows**, | 
|  | but the buffer overflow detection is probabilistic, with roughly | 
|  | `(2**TS-1)/(2**TS)` probability of catching a bug. | 
|  | * **Does not require quarantine to detect heap-use-after-free, | 
|  | or stack-use-after-return**. | 
|  | The detection is similarly probabilistic. | 
|  |  | 
|  | The memory overhead of HWASAN is expected to be much smaller | 
|  | than that of AddressSanitizer: | 
|  | `1/TG` extra memory for the shadow | 
|  | and some overhead due to `TG`-aligning all objects. | 
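
The two figures quoted above are easy to evaluate for concrete parameters. The helpers below are a hypothetical sketch: they assume `TS` is below 64 and one shadow byte per `TG` bytes, and they ignore the alignment overhead.

```c
/* Probability that a uniformly random TS-bit tag differs from the
   expected one: (2**TS - 1) / (2**TS). */
static double detection_probability(unsigned ts) {
    double tags = (double)(1ULL << ts);
    return (tags - 1.0) / tags;
}

/* Shadow memory as a fraction of application memory: 1/TG. */
static double shadow_overhead(unsigned tg) {
    return 1.0 / (double)tg;
}
```

With `TS` = 8 the detection probability is 255/256, with `TS` = 4 it is 15/16, and with `TG` = 16 the shadow costs 1/16 of the application memory.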
|  |  | 
|  | Supported architectures | 
|  | ======================= | 
|  | HWASAN relies on `Address Tagging`_ which is only available on AArch64. | 
|  | For other 64-bit architectures it is possible to remove the address tags | 
|  | before every load and store by compiler instrumentation, but this variant | 
|  | will have limited deployability since not all of the code is | 
|  | typically instrumented. | 
|  |  | 
|  | HWASAN's approach is not applicable to 32-bit architectures. | 
|  |  | 
|  |  | 
|  | Related Work | 
|  | ============ | 
|  | * `SPARC ADI`_ implements a similar tool mostly in hardware. | 
|  | * `Effective and Efficient Memory Protection Using Dynamic Tainting`_ discusses | 
|  | similar approaches ("lock & key"). | 
|  | * `Watchdog`_ discusses a heavier, but still somewhat similar | 
|  | "lock & key" approach. | 
|  | * *TODO: add more "related work" links. Suggestions are welcome.* | 
|  |  | 
|  |  | 
|  | .. _Watchdog: http://www.cis.upenn.edu/acg/papers/isca12_watchdog.pdf | 
|  | .. _Effective and Efficient Memory Protection Using Dynamic Tainting: https://www.cc.gatech.edu/~orso/papers/clause.doudalis.orso.prvulovic.pdf | 
|  | .. _SPARC ADI: https://lazytyped.blogspot.com/2017/09/getting-started-with-adi.html | 
|  | .. _AddressSanitizer paper: https://www.usenix.org/system/files/conference/atc12/atc12-final39.pdf | 
|  | .. _Address Tagging: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.den0024a/ch12s05s01.html | 
|  |  |