| .TH trace 8 "2016-02-18" "USER COMMANDS" |
| .SH NAME |
| trace \- Trace a function and print its arguments or return value, optionally evaluating a filter. Uses Linux eBPF/bcc. |
| .SH SYNOPSIS |
| .B trace [-h] [-b BUFFER_PAGES] [-p PID] [-L TID] [--uid UID] [-v] [-Z STRING_SIZE] [-S] [-s SYM_FILE_LIST] |
| [-M MAX_EVENTS] [-t] [-u] [-T] [-C] [-c CGROUP_PATH] [-n NAME] [-f MSG_FILTER] [-B] [-K] [-U] [-a] [-I header] [-A] |
| probe [probe ...] |
| .SH DESCRIPTION |
| trace probes functions you specify and displays trace messages if a particular |
| condition is met. You can control the message format to display function |
| arguments and return values. |
| |
| Since this uses BPF, only the root user can use this tool. |
| .SH REQUIREMENTS |
| CONFIG_BPF and bcc. |
| .SH OPTIONS |
| .TP |
| \-h |
| Print usage message. |
| .TP |
| \-b BUFFER_PAGES |
| Number of pages to use for the perf_events ring buffer. |
| .TP |
| \-p PID |
| Trace only functions in the process PID. |
| .TP |
| \-L TID |
| Trace only functions in the thread TID. |
| .TP |
| \-\-uid UID |
| Trace only functions from user UID. |
| .TP |
| \-v |
| Display the generated BPF program, for debugging purposes. |
| .TP |
| \-Z STRING_SIZE |
| When collecting string arguments (of type char*), collect up to STRING_SIZE |
| characters. Longer strings will be truncated. |
| .TP |
| \-s SYM_FILE_LIST |
| When collecting stack traces in build ID format, use the comma-separated list |
| of symbol files in SYM_FILE_LIST for symbol resolution. |
| .TP |
| \-S |
| If set, also trace messages from trace's own process. By default, this is off |
| to avoid tracing storms. For example, if you trace the write system call, |
| trace's own writes to the standard output would themselves be traced. |
| .TP |
| \-M MAX_EVENTS |
| Print up to MAX_EVENTS trace messages and then exit. |
| .TP |
| \-t |
| Print times relative to the beginning of the trace (offsets), in seconds. |
| .TP |
| \-u |
| Print UNIX timestamps instead of offsets from trace beginning, requires -t. |
| .TP |
| \-T |
| Print the time column. |
| .TP |
| \-C |
| Print CPU id. |
| .TP |
| \-c CGROUP_PATH |
| Trace only functions in processes under CGROUP_PATH hierarchy. |
| .TP |
| \-n NAME |
| Only print events from processes whose name contains NAME. |
| .TP |
| \-f MSG_FILTER |
| Only print events whose message contains MSG_FILTER. |
| .TP |
| \-B |
| Treat the argument of the STRCMP helper as a binary value. |
| .TP |
| \-K |
| Print the kernel stack for each event. |
| .TP |
| \-U |
| Print the user stack for each event. |
| .TP |
| \-a |
| Print virtual addresses in kernel and user stacks. |
| .TP |
| \-I header |
| Additional header files to include in the BPF program. This is needed if your |
| filter or print expressions use types or data structures that are not available |
| in the standard headers. For example: 'linux/mm.h' |
| .TP |
| \-A |
| Print the aggregated count of each trace. This option should be used together with -M/--max-events. |
| .TP |
| probe [probe ...] |
| One or more probes that attach to functions, filter conditions, and print |
| information. See PROBE SYNTAX below. |
| .SH PROBE SYNTAX |
| The general probe syntax is as follows: |
| |
| .B [{p,r}]:[library]:function[+offset][(signature)] [(predicate)] ["format string"[, arguments]] |
| |
| .B {t:category:event,u:library:probe} [(predicate)] ["format string"[, arguments]] |
| .TP |
| .B {[{p,r}],t,u} |
| Probe type \- "p" for function entry, "r" for function return, "t" for kernel |
| tracepoint, "u" for USDT probe. The default probe type is "p". |
| .TP |
| .B [library] |
| Library containing the probe. |
| Specify the full path to the .so or executable file where the function to probe |
| resides. Alternatively, you can specify just the lib name: for example, "c" |
| refers to libc. If no library name is specified, the kernel is assumed. Also, |
| you can specify an executable name (without a full path) if it is in the PATH. |
| For example, "bash". |
| .TP |
| .B category |
| The tracepoint category. For example, "sched" or "irq". |
| .TP |
| .B function |
| The function to probe. |
| .TP |
| .B offset |
| The offset after the address of the function where the probe should be injected. |
| For example, "kfree_skb+56" in decimal or "kfree_skb+0x38" in hexadecimal. |
| Only works with kprobes and uprobes. Zero if omitted. |
| .TP |
| .B signature |
| The optional signature of the function to probe. This can make it easier to |
| access the function's arguments, instead of using the "arg1", "arg2" etc. |
| argument specifiers. For example, "(struct timespec *ts)" in the signature |
| position lets you use "ts" in the filter or print expressions. |
| .TP |
| .B event |
| The tracepoint event. For example, "block_rq_complete". |
| .TP |
| .B probe |
| The USDT probe name. For example, "pthread_create". |
| .TP |
| .B [(predicate)] |
| The filter applied to the captured data. The trace message is printed only if |
| the filter evaluates to true. The filter can use any valid C expression |
| that refers to the argument values: arg1, arg2, etc., or to the return value |
| retval in a return probe. If necessary, use C cast operators to coerce the |
| arguments to the desired type. For example, if arg1 is of type int, use the |
| expression ((int)arg1 < 0) to trace only invocations where arg1 is negative. |
| Note that only arg1-arg6 are supported, and only if the function is using the |
| standard x86_64 convention where the first six arguments are in the RDI, RSI, |
| RDX, RCX, R8, R9 registers. If no predicate is specified, all function |
| invocations are traced. |
| |
| The predicate expression may also use the STRCMP pseudo-function to compare |
| a predefined string to a string argument. For example: STRCMP("test", arg1). |
| The order of arguments is important: the first argument MUST be a quoted |
| literal string, and the second argument can be a runtime string, most typically |
| an argument. |
| .TP |
| .B ["format string"[, arguments]] |
| A printf-style format string that will be used for the trace message. You can |
| use the following format specifiers: %s, %d, %u, %lld, %llu, %hd, %hu, %c, |
| %x, %llx -- with the same semantics as printf's. Make sure to pass exactly as |
| many arguments as there are placeholders in the format string. The |
| format specifier replacements may be any C expressions, and may refer to the |
| same special keywords as in the predicate (arg1, arg2, etc.). |
| |
| In addition to the above format specifiers, you can also use %K and %U when |
| the expression is an address that potentially points to executable code (i.e., |
| a symbol). trace will resolve %K specifiers to a kernel symbol, such as |
| vfs_read, and will resolve %U specifiers to a user-space symbol in that |
| process, such as sprintf. |
| |
| In tracepoints, both the predicate and the arguments may refer to the tracepoint |
| format structure, which is stored in the special "args" variable. For example, the |
| block:block_rq_complete tracepoint can print or filter by args->nr_sector. To |
| discover the format of your tracepoint, use the tplist tool. |
| |
| In USDT probes, the arg1, ..., argN variables refer to the probe's arguments. |
| To determine which arguments your probe has, use the tplist tool. |
| |
| The predicate expression and the format specifier replacements for printing |
| may also use the following special keywords: $pid, $tgid to refer to the |
| current process' pid and tgid; $uid, $gid to refer to the current user's |
| uid and gid; $cpu to refer to the current processor number. |
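| .PP |
| As an illustration of how these pieces combine (a reading of the syntax above, not additional syntax), consider the probe specifier |
| .PP |
| .B p:c:write (arg1 == 1) """writing %d bytes to STDOUT"", arg3 |
| .PP |
| Here "p" selects a function entry probe, "c" names libc as the library, and "write" is the function to probe. The predicate (arg1 == 1) keeps only calls whose first argument is 1, and the format string prints arg3 through its single %d placeholder. This assumes that libc on the target system exports a write function with the usual (fd, buf, count) argument order. |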
| .SH EXAMPLES |
| .TP |
| Trace all invocations of the open system call with the name of the file (from userspace) being opened: |
| # |
| .B trace '::do_sys_open """%s"", arg2@user' |
| .TP |
| Trace all invocations of the read system call where the number of bytes requested is greater than 20,000: |
| # |
| .B trace '::sys_read (arg3 > 20000) """read %d bytes"", arg3' |
| .TP |
| Trace all malloc calls and print the size of the requested allocation: |
| # |
| .B trace ':c:malloc """size = %d"", arg1' |
| .TP |
| Trace returns from the readline function in bash and print the return value as a string: |
| # |
| .B trace 'r:bash:readline """%s"", retval' |
| .TP |
| Trace the block:block_rq_complete tracepoint and print the number of sectors completed: |
| # |
| .B trace 't:block:block_rq_complete """%d sectors"", args->nr_sector' |
| .TP |
| Trace the pthread_create USDT probe from the pthread library and print the address of the thread's start function: |
| # |
| .B trace 'u:pthread:pthread_create """start addr = %llx"", arg3' |
| .TP |
| Trace the nanosleep system call and print the sleep duration in nanoseconds: |
| # |
| .B trace 'p::SyS_nanosleep(struct timespec *ts) """sleep for %lld ns"", ts->tv_nsec' |
| .TP |
| Trace the inet_pton function in libc using the build ID mechanism and print the user stack: |
| # |
| .B trace -s /lib/x86_64-linux-gnu/libc.so.6,/bin/ping 'p:c:inet_pton' -U |
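| .PP |
| The following additional invocations are sketches that exercise the predicate, STRCMP, special keyword, and %U features described in PROBE SYNTAX. Treat them as illustrative assumptions rather than verified recipes: the libc and kernel functions named here may be absent, inlined, or named differently on a given system. |
| .TP |
| Trace calls to open in libc whose path argument is "test.txt", passing the literal string as the first STRCMP argument: |
| # |
| .B trace 'p:c:open (STRCMP("test.txt", arg1)) """opening %s"", arg1' |
| .TP |
| Trace returns from the kernel function __kmalloc where the return value is a null pointer: |
| # |
| .B trace 'r::__kmalloc (retval == 0) """kmalloc failed!""' |
| .TP |
| Trace calls to write in libc made by uid 0, printing the pid and the requested byte count: |
| # |
| .B trace 'p:c:write ($uid == 0) """pid %u wrote %d bytes"", $pid, arg3' |
| .TP |
| Trace the pthread_create USDT probe and resolve the thread's start function address to a user-space symbol: |
| # |
| .B trace 'u:pthread:pthread_create """%U"", arg3' |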
| .SH SOURCE |
| This is from bcc. |
| .IP |
| https://github.com/iovisor/bcc |
| .PP |
| Also look in the bcc distribution for a companion _examples.txt file containing |
| example usage, output, and commentary for this tool. |
| .SH OS |
| Linux |
| .SH STABILITY |
| Unstable - in development. |
| .SH AUTHOR |
| Sasha Goldshtein |