| ============================================ |
| Fast LLVM-based instrumentation for afl-fuzz |
| ============================================ |
| |
| (See ../docs/README for the general instruction manual.) |
| (See ../gcc_plugin/README.gcc for the GCC-based instrumentation.) |
| |
| 1) Introduction |
| --------------- |
| |
| ! llvm_mode works with llvm version 3.8.1 up to 8.x ! |
| ! llvm version 9 does not work yet ! |
| |
| The code in this directory allows you to instrument programs for AFL using |
| true compiler-level instrumentation, instead of the more crude |
| assembly-level rewriting approach taken by afl-gcc and afl-clang. This has |
| several interesting properties: |
| |
| - The compiler can make many optimizations that are hard to pull off when |
| manually inserting assembly. As a result, some slow, CPU-bound programs will |
| run up to around 2x faster. |
| |
| The gains are less pronounced for fast binaries, where the speed is limited |
| chiefly by the cost of creating new processes. In such cases, the gain will |
| probably stay within 10%. |
| |
| - The instrumentation is CPU-independent. At least in principle, you should |
| be able to rely on it to fuzz programs on non-x86 architectures (after |
| building afl-fuzz with AFL_NO_X86=1). |
| |
| - The instrumentation can cope a bit better with multi-threaded targets. |
| |
| - Because the feature relies on the internals of LLVM, it is clang-specific |
| and will *not* work with GCC (see ../gcc_plugin/ for an alternative). |
| |
| Once this implementation is shown to be sufficiently robust and portable, it |
| will probably replace afl-clang. For now, it can be built separately and |
| co-exists with the original code. |
| |
| The idea and much of the implementation come from Laszlo Szekeres. |
| |
| 2) How to use this |
| ------------------ |
| |
| In order to leverage this mechanism, you need to have clang installed on your |
| system. You should also make sure that the llvm-config tool is in your path |
| (or pointed to via LLVM_CONFIG in the environment). |
| |
| Unfortunately, some systems that do have clang come without llvm-config or the |
| LLVM development headers; one example of this is FreeBSD. FreeBSD users will |
| also run into problems with clang being built statically and not being able to |
| load modules (you'll see "Service unavailable" when loading afl-llvm-pass.so). |
| |
| To solve all your problems, you can grab pre-built binaries for your OS from: |
| |
| http://llvm.org/releases/download.html |
| |
| ...and then put the bin/ directory from the tarball at the beginning of your |
| $PATH when compiling the feature and building packages later on. You don't need |
| to be root for that. |
| |
| To build the instrumentation itself, type 'make'. This will generate binaries |
| called afl-clang-fast and afl-clang-fast++ in the parent directory. Once this |
| is done, you can instrument third-party code in a way similar to the standard |
| operating mode of AFL, e.g.: |
| |
| CC=/path/to/afl/afl-clang-fast ./configure [...options...] |
| make |
| |
| Be sure to also include CXX set to afl-clang-fast++ for C++ code. |
| |
| The tool honors roughly the same environment variables as afl-gcc (see |
| ../docs/env_variables.txt). This includes AFL_USE_ASAN, AFL_HARDEN, and |
| AFL_DONT_OPTIMIZE. However, AFL_INST_RATIO is not honored, as it serves no |
| good purpose alongside the more effective instrim CFG analysis. |
| |
| Note: if you want the LLVM helper to be installed on your system for all |
| users, you need to build it before issuing 'make install' in the parent |
| directory. |
| |
| 3) Options |
| ---------- |
| |
| Several options are present to make llvm_mode faster or help it rearrange |
| the code to make afl-fuzz path discovery easier. |
| |
| If you only need to instrument specific parts of the code, you can whitelist |
| which C/C++ files to actually instrument. See README.whitelist. |
| |
| For splitting memcmp, strncmp, etc., please see README.laf-intel. |
| |
| Then there is an optimized instrumentation strategy that uses CFGs and |
| markers to instrument only what is needed. This increases speed by 20-25% |
| but results in lower path discovery. |
| If you want to use this, set AFL_LLVM_INSTRIM=1. |
| See README.instrim. |
| |
| Finally, if your llvm version is 8 or lower, you can activate a mode that |
| prevents a counter overflow from resulting in a 0 value. This is good for |
| path discovery, but the llvm implementation of this functionality for intel |
| is not optimal and was only fixed in llvm 9. |
| You can enable this with AFL_LLVM_NOT_ZERO=1. |
| See README.neverzero. |
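| |
| As a rough illustration of the counter behavior described above, the sketch |
| below contrasts a plain 8-bit hit counter with a never-zero variant. The |
| function names are hypothetical and this is not the actual llvm_mode pass; |
| it only shows the arithmetic involved. |
| |
```c
#include <stdint.h>

/* Plain 8-bit hit counter: 255 + 1 wraps around to 0, which is
   indistinguishable from "this edge was never executed". */
uint8_t bump_wrapping(uint8_t counter) {
  return (uint8_t)(counter + 1);
}

/* Never-zero variant (illustrative only): add the carry back in,
   so that 255 + 1 yields 1 instead of 0. */
uint8_t bump_never_zero(uint8_t counter) {
  uint8_t next  = (uint8_t)(counter + 1);
  uint8_t carry = (uint8_t)(next == 0);  /* 1 only when the add wrapped */
  return (uint8_t)(next + carry);
}
```
| |
| With the plain counter, an edge hit exactly 256 times in one run would look |
| as if it had never been executed; the never-zero variant avoids that. |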
| |
| |
| 4) Gotchas, feedback, bugs |
| -------------------------- |
| |
| This is an early-stage mechanism, so field reports are welcome. You can send bug |
| reports to <afl-users@googlegroups.com>. |
| |
| 5) Bonus feature #1: deferred initialization |
| -------------------------------------------- |
| |
| AFL tries to optimize performance by executing the targeted binary just once, |
| stopping it just before main(), and then cloning this "master" process to get |
| a steady supply of targets to fuzz. |
| |
| Although this approach eliminates much of the OS-, linker- and libc-level |
| costs of executing the program, it does not always help with binaries that |
| perform other time-consuming initialization steps - say, parsing a large config |
| file before getting to the fuzzed data. |
| |
| In such cases, it's beneficial to initialize the forkserver a bit later, once |
| most of the initialization work is already done, but before the binary attempts |
| to read the fuzzed input and parse it; in some cases, this can offer a 10x+ |
| performance gain. You can implement delayed initialization in LLVM mode in a |
| fairly simple way. |
| |
| First, find a suitable location in the code where the delayed cloning can |
| take place. This needs to be done with *extreme* care to avoid breaking the |
| binary. In particular, the program will probably malfunction if you select |
| a location after: |
| |
| - The creation of any vital threads or child processes - since the forkserver |
| can't clone them easily. |
| |
| - The initialization of timers via setitimer() or equivalent calls. |
| |
| - The creation of temporary files, network sockets, offset-sensitive file |
| descriptors, and similar shared-state resources - but only provided that |
| their state meaningfully influences the behavior of the program later on. |
| |
| - Any access to the fuzzed input, including reading the metadata about its |
| size. |
| |
| With the location selected, add this code in the appropriate spot: |
| |
| #ifdef __AFL_HAVE_MANUAL_CONTROL |
| __AFL_INIT(); |
| #endif |
| |
| You don't need the #ifdef guards, but including them ensures that the program |
| will keep working normally when compiled with a tool other than afl-clang-fast. |
| |
| Finally, recompile the program with afl-clang-fast (afl-gcc or afl-clang will |
| *not* generate a deferred-initialization binary) - and you should be all set! |
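| |
| To make the placement rules above more concrete, here is a minimal sketch of |
| a target that defers the forkserver until after its slow setup. The names |
| load_config(), parse_input() and run_target() are hypothetical stand-ins, not |
| part of AFL; only __AFL_INIT() and __AFL_HAVE_MANUAL_CONTROL come from the |
| text above. |
| |
```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for slow start-up work, such as parsing a large
   config file. It never touches the fuzzed input. */
static char g_config[64];

void load_config(void) {
  strcpy(g_config, "mode=strict");
}

/* Hypothetical code under test: returns 1 if the input starts with "AFL". */
int parse_input(const char *buf, size_t len) {
  return len >= 3 && memcmp(buf, "AFL", 3) == 0;
}

/* Entry point of the sketch; call it from main() as run_target(stdin). */
int run_target(FILE *in) {
  char buf[4096];
  size_t len;

  load_config();            /* expensive, input-independent setup */

#ifdef __AFL_HAVE_MANUAL_CONTROL
  __AFL_INIT();             /* forkserver starts here, after the setup */
#endif

  /* First access to the fuzzed data happens only after __AFL_INIT(). */
  len = fread(buf, 1, sizeof(buf), in);
  return parse_input(buf, len) ? 0 : 1;
}
```
| |
| Note that no threads, timers, or input reads occur before the __AFL_INIT() |
| call in this sketch; a real target needs the same check done by hand. |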
| |
| 6) Bonus feature #2: persistent mode |
| ------------------------------------ |
| |
| Some libraries provide APIs that are stateless, or whose state can be reset in |
| between processing different input files. When such a reset is performed, a |
| single long-lived process can be reused to try out multiple test cases, |
| eliminating the need for repeated fork() calls and the associated OS overhead. |
| |
| The basic structure of the program that does this would be: |
| |
| while (__AFL_LOOP(1000)) { |
| |
|   /* Read input data. */ |
|   /* Call library code to be fuzzed. */ |
|   /* Reset state. */ |
| |
| } |
| |
| /* Exit normally */ |
| |
| The numerical value specified within the loop controls the maximum number |
| of iterations before AFL will restart the process from scratch. This minimizes |
| the impact of memory leaks and similar glitches; 1000 is a good starting point, |
| and going much higher increases the likelihood of hiccups without giving you |
| any real performance benefits. |
| |
| A more detailed template is shown in ../experimental/persistent_demo/. |
| Similarly to the previous mode, the feature works only with afl-clang-fast; |
| #ifdef guards can be used to suppress it when using other compilers. |
| |
| Note that as with the previous mode, the feature is easy to misuse; if you |
| do not fully reset the critical state, you may end up with false positives or |
| waste a whole lot of CPU power doing nothing useful at all. Be particularly |
| wary of memory leaks and of the state of file descriptors. |
| |
| PS. Because there are task switches still involved, the mode isn't as fast as |
| "pure" in-process fuzzing offered, say, by LLVM's libFuzzer; but it is a lot |
| faster than the normal fork() model, and compared to in-process fuzzing, |
| should be a lot more robust. |
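| |
| The loop structure described in this section could be fleshed out roughly |
| as follows. This is only a sketch under stated assumptions: struct |
| parser_state, parser_reset(), parser_feed(), fuzz_loop() and the FUZZ_LOOP |
| fallback macro are hypothetical names made up for illustration; only |
| __AFL_LOOP() comes from AFL. |
| |
```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Fall back to a single pass when not built with afl-clang-fast. */
#ifdef __AFL_HAVE_MANUAL_CONTROL
#  define FUZZ_LOOP(n) __AFL_LOOP(n)
#else
static int fuzz_once = 1;
#  define FUZZ_LOOP(n) (fuzz_once-- > 0)
#endif

/* Hypothetical library state that must not leak between test cases. */
struct parser_state {
  size_t bytes_seen;
  int    saw_magic;
};

void parser_reset(struct parser_state *st) {
  memset(st, 0, sizeof(*st));
}

void parser_feed(struct parser_state *st, const char *buf, size_t len) {
  st->bytes_seen += len;
  if (len >= 4 && memcmp(buf, "FUZZ", 4) == 0) st->saw_magic = 1;
}

/* Call from main() as fuzz_loop(0) to read test cases from stdin. */
int fuzz_loop(int fd) {
  struct parser_state st;
  char buf[4096];

  while (FUZZ_LOOP(1000)) {
    ssize_t len;

    parser_reset(&st);                    /* reset state */
    len = read(fd, buf, sizeof(buf));     /* read input data */
    if (len < 0) len = 0;
    parser_feed(&st, buf, (size_t)len);   /* call library code */
  }

  return 0;                               /* exit normally */
}
```
| |
| The sketch uses read() rather than buffered stdio so that no stream state is |
| carried over from one iteration to the next. |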
| |
| 7) Bonus feature #3: new 'trace-pc-guard' mode |
| ---------------------------------------------- |
| |
| Recent versions of LLVM are shipping with a built-in execution tracing feature |
| that provides AFL with the necessary tracing data without the need to |
| post-process the assembly or install any compiler plugins. See: |
| |
| http://clang.llvm.org/docs/SanitizerCoverage.html#tracing-pcs-with-guards |
| |
| If you have a sufficiently recent compiler and want to give it a try, build |
| afl-clang-fast this way: |
| |
| AFL_TRACE_PC=1 make clean all |
| |
| Note that this mode is currently about 20% slower than "vanilla" afl-clang-fast, |
| and about 5-10% slower than afl-clang. This is likely because the |
| instrumentation is not inlined, and instead involves a function call. On systems |
| that support it, compiling your target with -flto should help. |
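| |
| To illustrate why this mode involves a function call per edge, here is a |
| minimal sketch of the two callbacks documented on the SanitizerCoverage page |
| linked above. It is not AFL's actual runtime: MAP_SIZE, coverage_map and the |
| guard numbering scheme are assumptions made for this example. |
| |
```c
#include <stdint.h>

#define MAP_SIZE 65536             /* assumed map size for this sketch */

uint8_t coverage_map[MAP_SIZE];    /* stand-in for a shared coverage map */
static uint32_t next_guard_id;

/* Called once per module; [start, stop) holds one guard per edge. */
void __sanitizer_cov_trace_pc_guard_init(uint32_t *start, uint32_t *stop) {
  uint32_t *g;

  if (start == stop || *start) return;   /* nothing to do, or already set up */

  /* Assign IDs in 1..MAP_SIZE-1; a guard value of 0 means "disabled". */
  for (g = start; g < stop; g++)
    *g = 1 + (next_guard_id++ % (MAP_SIZE - 1));
}

/* Called on every instrumented edge: a real call, not inlined code. */
void __sanitizer_cov_trace_pc_guard(uint32_t *guard) {
  if (!*guard) return;
  coverage_map[*guard]++;
}
```
| |
| When the target is built with clang's -fsanitize-coverage=trace-pc-guard, the |
| compiler inserts the calls to these functions; the sketch can also be |
| exercised by calling them directly on a local guard array. |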
| |
| |