syzkaller uses declarative description of syscall interfaces to manipulate programs (sequences of syscalls). Below you can see (hopefully self-explanatory) excerpt from the description:
open(file filename, flags flags[open_flags], mode flags[open_mode]) fd read(fd fd, buf buffer[out], count len[buf]) close(fd fd) open_mode = S_IRUSR, S_IWUSR, S_IXUSR, S_IRGRP, S_IWGRP, S_IXGRP, S_IROTH, S_IWOTH, S_IXOTH
The description is contained in
sys/OS/*.txt files. For example see the sys/linux/dev_snd_midi.txt file for descriptions of the Linux MIDI interfaces.
A more formal description of the description syntax can be found here.
The translated descriptions are then used to generate, mutate, execute, minimize, serialize and deserialize programs. A program is a sequences of syscalls with concrete values for arguments. Here is an example (of a textual representation) of a program:
r0 = open(&(0x7f0000000000)="./file0", 0x3, 0x9) read(r0, &(0x7f0000000000), 42) close(r0)
For actual manipulations
syzkaller uses in-memory AST-like representation consisting of
Arg values defined in prog/prog.go. That representation is used to analyze, generate, mutate, minimize, validate, etc programs.
The in-memory representation can be transformed to/from textual form to store in on-disk corpus, show to humans, etc.
There is also another binary representation of the programs (called
exec), that is much simpler, does not contains rich type information (irreversible) and is used for actual execution (interpretation) of programs by executor.
This section describes how to extend syzkaller to allow fuzz testing of a new system call; this is particularly useful for kernel developers who are proposing new system calls.
Syscall interfaces are manually-written. There is an open issue to provide some aid for this process and some ongoing work, but we are yet there. There is also headerparser utility that can auto-generate some parts of descriptions from header files.
First, add a declarative description of the new system call to the appropriate file:
sys/linux/<subsystem>.txtfiles hold system calls for particular kernel subsystems, for example
The description of the syntax can be found here.
After adding/changing descriptions run:
make extract TARGETOS=linux SOURCEDIR=$KSRC make generate make
make extract generates/updates the
$KSRC should point to the latest kernel checkout.
make extract overwrites
make generate updates generated code and
make rebuilds binaries.
make generate does not require any kernel sources, native compilers, etc and is pure text processing.
Note: all generated files (
*.h) are checked-in with the
*.txt changes in the same commit.
make extract extracts constants for all architectures which requires installed cross-compilers. If you get errors about missing compilers/libraries, try
sudo make install_prerequisites or install equivalent package for your distro.
If you want to fuzz the new subsystem that you described locally, you may find the
enable_syscalls configuration parameter useful to specifically target the new system calls.
When updating existing syzkaller descriptions, note, that unless there's a drastic change in descriptions for a particular syscall, the programs that are already in the corpus will be kept there, unless you manually clear them out (for example by removing the
The process of compiling the textual syscall descriptions into machine-usable form used by
syzkaller to actually generate programs consists of 2 steps.
The first step is extraction of values of symbolic constants from kernel sources using syz-extract utility.
syz-extract generates a small C program that includes kernel headers referenced by
include directives, defines macros as specified by
define directives and prints values of symbolic constants. Results are stored in
.const files, one per arch. For example, sys/linux/dev_ptmx.txt is translated into sys/linux/dev_ptmx_amd64.const.
The second step is translation of descriptions into Go code using syz-sysgen utility (the actual compiler code lives in pkg/ast and pkg/compiler). This step uses syscall descriptions and the const files generated during the first step and produces instantiations of
Type types defined in prog/types.go. Here is an example of the compiler output for Akaros. This step also generates some minimal syscall metadata for C++ code in executor/syscalls.h.
make extract extracts constants for all
*.txt files and for all supported architectures. This may not work for subsystems that are not present in mainline kernel or if you have problems with native kernel compilers, etc. In such cases the
syz-extract utility used by
make extract can be run manually for single file/arch as:
make bin/syz-extract bin/syz-extract -os linux -arch $ARCH -sourcedir $KSRC -builddir $LINUXBLD <new>.txt
$ARCH is one of
ppc64le. If the subsystem is supported on several architectures, then run
syz-extract for each arch.
$LINUX should point to kernel source checkout, which is configured for the corresponding arch (i.e. you need to run
make someconfig && make there first). If the kernel was built into a separate directory (with
make O=...) then also set
$LINUXBLD to the location of the build directory.