 # Fuzzing binary-only programs with afl++

   afl++, libfuzzer and others are great if you have the source code, as this
   allows for very fast and coverage-guided fuzzing.

   However, if there is only the binary program and no source code available,
   then standard `afl-fuzz -n` (non-instrumented mode) is not effective.

   The following is a description of how these binaries can be fuzzed with afl++.


 ## TL;DR:

   qemu_mode in persistent mode is the fastest - if the stability is
   high enough. Otherwise try retrowrite or afl-dyninst, and if these
   fail too then try standard qemu_mode with AFL_ENTRYPOINT set to where you need it.

   If your target is a library use utils/afl_frida/.

   If your target is non-Linux then use unicorn_mode/.
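
   The recommendations above can be read as a small decision procedure. The
   following is only a sketch of that reading: the function name, its boolean
   parameters and the order in which the library and non-Linux cases are
   checked are assumptions made for illustration, not part of afl++.

```python
def pick_mode(is_library: bool, is_linux: bool,
              persistent_stable: bool, rewriter_works: bool) -> str:
    """Return the first option the TL;DR would suggest for a target.

    All parameters are hypothetical inputs for this sketch:
    - persistent_stable: qemu_mode persistent mode reaches good stability
    - rewriter_works: retrowrite or afl-dyninst produced a working binary
    """
    if is_library:
        return "utils/afl_frida"
    if not is_linux:
        return "unicorn_mode"
    if persistent_stable:
        return "qemu_mode (persistent)"
    if rewriter_works:
        return "retrowrite or afl-dyninst"
    # Last resort named in the TL;DR.
    return "qemu_mode with AFL_ENTRYPOINT"


print(pick_mode(is_library=False, is_linux=True,
                persistent_stable=False, rewriter_works=False))
```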


 ## QEMU

   QEMU is the "native" solution to the problem.
   It is available in the ./qemu_mode/ directory and, once compiled, it can
   be accessed by the afl-fuzz -Q command line option.
   It is the easiest alternative to use and even works for cross-platform binaries.

   The speed decrease is about 50%.
   However, various options exist to increase the speed:
    - using AFL_ENTRYPOINT to move the forkserver entry to a later basic block in
      the binary (+5-10% speed)
    - using persistent mode, see [qemu_mode/README.persistent.md](../qemu_mode/README.persistent.md);
      this results in a 150-300% overall speed increase - so 3-8x the original
      qemu_mode speed!
    - using AFL_CODE_START/AFL_CODE_END to only instrument specific parts
   Note that there is also honggfuzz: [https://github.com/google/honggfuzz](https://github.com/google/honggfuzz)
   which now has a qemu_mode, but its performance is just 1.5% ...

   As qemu_mode is included in afl++, it needs no URL.

   If you would like to code a customized fuzzer without much work, we highly
   recommend checking out our sister project LibAFL, which will support QEMU
   too:
   [https://github.com/AFLplusplus/LibAFL](https://github.com/AFLplusplus/LibAFL)


 ## AFL FRIDA

   In frida_mode you can fuzz binary-only targets as easily as with QEMU,
   with the advantage that frida_mode also works on macOS (both Intel and M1).

   If you want to fuzz a binary-only library, you can fuzz it with
   frida-gum via utils/afl_frida/. You will have to write a harness to
   call the target function in the library; use afl-frida.c as a template.

   Both come with afl++ so this needs no URL.

   You can also perform remote fuzzing with Frida, e.g. if you want to fuzz
   on iPhone or Android devices. For this you can use
   [https://github.com/ttdennis/fpicker/](https://github.com/ttdennis/fpicker/)
   as an intermediary that uses afl++ for fuzzing.

   If you would like to code a customized fuzzer without much work, we highly
   recommend checking out our sister project LibAFL, which supports Frida too:
   [https://github.com/AFLplusplus/LibAFL](https://github.com/AFLplusplus/LibAFL)
   Working examples already exist :-)


 ## WINE+QEMU

   Wine mode can run Win32 PE binaries with the QEMU instrumentation.
   It needs Wine, python3 and the pefile python package installed.
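
   Before pointing Wine mode at a file, it can be useful to confirm that the
   file really is a PE image. The sketch below does this with the Python
   standard library only, by checking the `MZ` and `PE\0\0` signatures and
   reading the machine field of the COFF header. It does not use the pefile
   package and is not part of afl++; the function name and the small machine
   table are assumptions made for this example.

```python
import struct

# Machine values from the PE/COFF header; only two common ones are listed.
PE_MACHINES = {0x014C: "i386", 0x8664: "x86_64"}


def pe_machine(data: bytes):
    """Return the machine name of a PE image, or None if data is not a PE file."""
    # The DOS header is 0x40 bytes long and starts with "MZ".
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # Offset 0x3C holds the file offset of the PE signature (little endian).
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < pe_offset + 6:
        return None
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        return None
    # The 16-bit machine field follows the signature directly.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return PE_MACHINES.get(machine, hex(machine))
```

   A typical use would be `pe_machine(open(path, "rb").read(4096))`, treating a
   `None` result as "not a Windows binary".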

   As it is included in afl++ this needs no URL.


 ## UNICORN

   Unicorn is a fork of QEMU. The instrumentation is, therefore, very similar.
   In contrast to QEMU, Unicorn does not offer a full system or even userland
   emulation. Runtime environment and/or loaders have to be written from scratch,
   if needed. On top of that, block chaining has been removed. This means the speed
   boost introduced in the patched QEMU mode of afl++ cannot simply be ported over to
   Unicorn. For further information, check out [unicorn_mode/README.md](../unicorn_mode/README.md).

   As it is included in afl++ this needs no URL.


 ## AFL UNTRACER

   If you want to fuzz a binary-only shared library, you can fuzz it with
   utils/afl_untracer/; use afl-untracer.c as a template.
   It is slower than AFL FRIDA (see above).


 ## DYNINST

   Dyninst is a binary instrumentation framework similar to Pintool and
   Dynamorio (see far below). However, whereas Pintool and Dynamorio work at
   runtime, Dyninst instruments the target at load time and then lets it run -
   or saves the binary with the changes.
   This is great for some things, e.g. fuzzing, and not so effective for others,
   e.g. malware analysis.

   So what we can do with Dyninst is take every basic block and put afl's
   instrumentation code in there - and then save the binary.
   Afterwards we can just fuzz the newly saved target binary with afl-fuzz.
   Sounds great? It is. The issue though - it is a non-trivial problem to
   insert instructions, which changes addresses in the process space, so that
   everything is still working afterwards. Hence more often than not binaries
   crash when they are run.
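
   To make "afl's instrumentation code" more concrete, here is a small Python
   simulation of the classic AFL edge-coverage update that is conceptually
   placed into every basic block. This is a sketch of the well-known scheme
   from the original AFL design and not the code that afl-dyninst emits; the
   block IDs are hypothetical and the function name is made up for this
   example.

```python
MAP_SIZE = 1 << 16  # 64 KiB coverage map, as in classic AFL


def trace(block_ids, map_size=MAP_SIZE):
    """Replay a sequence of basic-block IDs and return the hit-count map."""
    coverage = bytearray(map_size)
    prev = 0
    for cur in block_ids:
        # Each (previous block, current block) pair selects one map entry.
        idx = (cur ^ prev) % map_size
        # Hit counters are single bytes and wrap around at 256.
        coverage[idx] = (coverage[idx] + 1) & 0xFF
        # The shift makes the edge A->B land on a different entry than B->A.
        prev = cur >> 1
    return coverage


a_then_b = trace([0x1234, 0x4321])
b_then_a = trace([0x4321, 0x1234])
print(a_then_b != b_then_a)
```

   The fuzzer only needs this map to notice that an input reached a new edge,
   which is why inserting such a short snippet into each block is sufficient.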

   The speed decrease is about 15-35%, depending on the optimization options
   used with afl-dyninst.

   So if Dyninst works, it is the best option available. Otherwise it just
   doesn't work well.

   [https://github.com/vanhauser-thc/afl-dyninst](https://github.com/vanhauser-thc/afl-dyninst)


 ## RETROWRITE, ZAFL, ... other binary rewriters

   If you have an x86/x86_64 binary that still has its symbols, is compiled
   with position-independent code (PIC/PIE) and does not use most of the C++
   features, then the retrowrite solution might be for you.
   It decompiles to ASM files which can then be instrumented with afl-gcc.

   It is at about 80-85% performance.
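
   This document quotes some figures as a "speed decrease" and others as
   remaining "performance". The two can be converted into each other, as the
   short sketch below shows. The baseline of 1000 executions per second is a
   hypothetical number chosen for illustration, and the sketch assumes that
   all percentages refer to the same baseline.

```python
def remaining_performance(decrease_percent: float) -> float:
    """Convert a speed decrease in percent into remaining performance in percent."""
    return 100.0 - decrease_percent


def execs_per_second(baseline: float, performance_percent: float) -> float:
    """Scale a baseline execution rate by a performance percentage."""
    return baseline * performance_percent / 100.0


BASELINE = 1000.0  # hypothetical executions per second

# Figures as quoted in this document.
print("qemu_mode:", execs_per_second(BASELINE, remaining_performance(50)))
print("afl-dyninst:", execs_per_second(BASELINE, remaining_performance(35)),
      "to", execs_per_second(BASELINE, remaining_performance(15)))
print("retrowrite:", execs_per_second(BASELINE, 80), "to",
      execs_per_second(BASELINE, 85))
```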

   * [https://git.zephyr-software.com/opensrc/zafl](https://git.zephyr-software.com/opensrc/zafl)
   * [https://github.com/HexHive/retrowrite](https://github.com/HexHive/retrowrite)


 ## MCSEMA

   Theoretically you can also decompile to LLVM IR with mcsema, and then
   use llvm_mode to instrument the binary.
   Good luck with that.

   [https://github.com/lifting-bits/mcsema](https://github.com/lifting-bits/mcsema)


 ## INTEL-PT

   If you have a newer Intel CPU, you can make use of Intel's Processor Trace (PT).
   The big issue with Intel's PT is the small buffer size and the complex
   encoding of the debug information collected through PT.
   This makes the decoding very CPU intensive and hence slow.
   As a result, the overall speed decrease is about 70-90% (depending on
   the implementation and other factors).

   There are two afl intel-pt implementations:

   1. [https://github.com/junxzm1990/afl-pt](https://github.com/junxzm1990/afl-pt)
      => this needs Ubuntu 14.04.05 without any updates and the 4.4 kernel.

   2. [https://github.com/hunter-ht-2018/ptfuzzer](https://github.com/hunter-ht-2018/ptfuzzer)
      => this needs a 4.14 or 4.15 kernel. The "nopti" kernel boot option must
         be used. This one is faster than the other.

   Note that there is also honggfuzz: [https://github.com/google/honggfuzz](https://github.com/google/honggfuzz)
   But its IPT performance is just 6%!


 ## CORESIGHT

   Coresight is ARM's answer to Intel's PT.
   There is no implementation so far which handles Coresight, and getting
   it working on an ARM Linux is very difficult because building custom kernels
   on embedded systems is difficult. Finding a system that has Coresight in
   the ARM chip is difficult too.
   My guess is that it is slower than QEMU, but faster than Intel PT.

   If anyone finds any coresight implementation for afl please ping me: vh@thc.org


 ## PIN & DYNAMORIO

   Pintool and Dynamorio are dynamic instrumentation engines, and they can be
   used for getting basic block information at runtime.
   Pintool is only available for Intel x86/x64 on Linux, Mac OS and Windows,
   whereas Dynamorio is additionally available for ARM and AARCH64.
   Dynamorio is also 10x faster than Pintool.

   The big issue with Dynamorio (and therefore Pintool too) is speed.
   Dynamorio has a speed decrease of 98-99%.
   Pintool has a speed decrease of 99.5%.

   Hence Dynamorio is the option to go for if everything else fails, and Pintool
   only if Dynamorio fails too.

   Dynamorio solutions:
   * [https://github.com/vanhauser-thc/afl-dynamorio](https://github.com/vanhauser-thc/afl-dynamorio)
   * [https://github.com/mxmssh/drAFL](https://github.com/mxmssh/drAFL)
   * [https://github.com/googleprojectzero/winafl/](https://github.com/googleprojectzero/winafl/) <= very good but Windows only

   Pintool solutions:
   * [https://github.com/vanhauser-thc/afl-pin](https://github.com/vanhauser-thc/afl-pin)
   * [https://github.com/mothran/aflpin](https://github.com/mothran/aflpin)
   * [https://github.com/spinpx/afl_pin_mode](https://github.com/spinpx/afl_pin_mode) <= only old Pintool version supported


 ## Non-AFL solutions

   There are many binary-only fuzzing frameworks.
   Some are great for CTFs but don't work with large binaries, others are very
   slow but have good path discovery, and some are very hard to set up ...

   * QSYM: [https://github.com/sslab-gatech/qsym](https://github.com/sslab-gatech/qsym)
   * Manticore: [https://github.com/trailofbits/manticore](https://github.com/trailofbits/manticore)
   * S2E: [https://github.com/S2E](https://github.com/S2E)
   * Tinyinst: [https://github.com/googleprojectzero/TinyInst](https://github.com/googleprojectzero/TinyInst) (Mac/Windows only)
   * Jackalope: [https://github.com/googleprojectzero/Jackalope](https://github.com/googleprojectzero/Jackalope)
   * ... please send me any good ones that are missing


 ## Closing words

   That's it! News, corrections, updates? Send an email to vh@thc.org