| ================================================== |
| Adding custom mutators to AFL using Python modules |
| ================================================== |
| |
| This file describes how you can utilize the external Python API to write |
| your own custom mutation routines. |
| |
| Note: This feature is highly experimental. Use at your own risk. |
| |
| Implemented by Christian Holler (:decoder) <choller@mozilla.com>. |
| |
| NOTE: This is for Python 2.7 ! |
| Anyone who wants to add Python 3.7 support is happily welcome :) |
| |
| For an example and a template see ../python_mutators/ |
| |
| |
| 1) Description and purpose |
| -------------------------- |
| |
| While AFLFuzz comes with a good selection of generic deterministic and |
| non-deterministic mutation operations, it sometimes might make sense to extend |
| these to implement strategies more specific to the target you are fuzzing. |
| |
| For simplicity and in order to allow people without C knowledge to extend |
| AFLFuzz, I implemented a "Python" stage that can make use of an external |
| module (written in Python) that implements a custom mutation stage. |
| |
| The main motivation behind this is to lower the barrier for people |
| experimenting with this tool. Hopefully, someone will be able to do useful |
| things with this extension. |
| |
| If you find it useful, have questions or need additional features added to the |
| interface, feel free to send a mail to <choller@mozilla.com>. |
| |
| See the following information to get a better pictures: |
| https://www.agarri.fr/docs/XML_Fuzzing-NullCon2017-PUBLIC.pdf |
| https://bugs.chromium.org/p/chromium/issues/detail?id=930663 |
| |
| |
| 2) How the Python module looks like |
| ----------------------------------- |
| |
| You can find a simple example in pymodules/example.py including documentation |
| explaining each function. In the same directory, you can find another simple |
| module that performs simple mutations. |
| |
| Right now, "init" is called at program startup and can be used to perform any |
| kinds of one-time initializations while "fuzz" is called each time a mutation |
| is requested. |
| |
| There is also optional support for a trimming API, see the section below for |
| further information about this feature. |
| |
| |
| 3) How to compile AFLFuzz with Python support |
| --------------------------------------------- |
| |
| You must install the python 2.7 development package of your Linux distribution |
| before this will work. On Debian/Ubuntu/Kali this can be done with: |
| apt install python2.7-dev |
| |
| A prerequisite for using this mode is to compile AFLFuzz with Python support. |
| |
| The afl Makefile performs some magic and detects Python 2.7 if it is in the |
| default path and compiles afl-fuzz with the feature if available (which is |
| /usr/include/python2.7 for the Python.h include and /usr/lib/x86_64-linux-gnu |
| for the libpython2.7.a library) |
| |
| In case your setup is different set the necessary variables like this: |
| PYTHON_INCLUDE=/path/to/python2.7/include LDFLAGS=-L/path/to/python2.7/lib make |
| |
| |
| 4) How to run AFLFuzz with your custom module |
| --------------------------------------------- |
| |
| You must pass the module name inside the env variable AFL_PYTHON_MODULE. |
| |
| In addition, if you are trying to load the module from the local directory, |
| you must adjust your PYTHONPATH to reflect this circumstance. The following |
| command should work if you are inside the aflfuzz directory: |
| |
| $ AFL_PYTHON_MODULE="pymodules.test" PYTHONPATH=. ./afl-fuzz |
| |
| Optionally, the following environment variables are supported: |
| |
| AFL_PYTHON_ONLY - Disable all other mutation stages. This can prevent broken |
| testcases (those that your Python module can't work with |
| anymore) to fill up your queue. Best combined with a custom |
| trimming routine (see below) because trimming can cause the |
| same test breakage like havoc and splice. |
| |
| AFL_DEBUG - When combined with AFL_NO_UI, this causes the C trimming code |
| to emit additional messages about the performance and actions |
| of your custom Python trimmer. Use this to see if it works :) |
| |
| |
| 5) Order and statistics |
| ----------------------- |
| |
| The Python stage is set to be the first non-deterministic stage (right before |
| the havoc stage). In the statistics however, it shows up as the third number |
| under "havoc". That's because I'm lazy and I didn't want to mess with the UI |
| too much ;) |
| |
| |
| 6) Trimming support |
| ------------------- |
| |
| The generic trimming routines implemented in AFLFuzz can easily destroy the |
| structure of complex formats, possibly leading to a point where you have a lot |
| of testcases in the queue that your Python module cannot process anymore but |
| your target application still accepts. This is especially the case when your |
| target can process a part of the input (causing coverage) and then errors out |
| on the remaining input. |
| |
| In such cases, it makes sense to implement a custom trimming routine in Python. |
| The API consists of multiple methods because after each trimming step, we have |
| to go back into the C code to check if the coverage bitmap is still the same |
| for the trimmed input. Here's a quick API description: |
| |
| init_trim: This method is called at the start of each trimming operation |
| and receives the initial buffer. It should return the amount |
| of iteration steps possible on this input (e.g. if your input |
| has n elements and you want to remove them one by one, return n, |
| if you do a binary search, return log(n), and so on...). |
| |
| If your trimming algorithm doesn't allow you to determine the |
| amount of (remaining) steps easily (esp. while running), then you |
| can alternatively return 1 here and always return 0 in post_trim |
| until you are finished and no steps remain. In that case, |
| returning 1 in post_trim will end the trimming routine. The whole |
| current index/max iterations stuff is only used to show progress. |
| |
| trim: This method is called for each trimming operation. It doesn't |
| have any arguments because we already have the initial buffer |
| from init_trim and we can memorize the current state in global |
| variables. This can also save reparsing steps for each iteration. |
| It should return the trimmed input buffer, where the returned data |
| must not exceed the initial input data in length. Returning anything |
| that is larger than the original data (passed to init_trim) will |
| result in a fatal abort of AFLFuzz. |
| |
| post_trim: This method is called after each trim operation to inform you |
| if your trimming step was successful or not (in terms of coverage). |
| If you receive a failure here, you should reset your input to the |
| last known good state. |
| In any case, this method must return the next trim iteration index |
| (from 0 to the maximum amount of steps you returned in init_trim). |
| |
| Omitting any of the methods will cause Python trimming to be disabled and |
| trigger a fallback to the builtin default trimming routine. |