btf loader: Support raw BTF as available in /sys/kernel/btf/vmlinux
Be it automatically when no -F option is passed and
/sys/kernel/btf/vmlinux is available, or when /sys/kernel/btf/vmlinux is
passed as the filename to the tool, i.e.:
$ pahole -C list_head
struct list_head {
struct list_head * next; /* 0 8 */
struct list_head * prev; /* 8 8 */
/* size: 16, cachelines: 1, members: 2 */
/* last cacheline: 16 bytes */
};
$ strace -e openat pahole -C list_head |& grep /sys/kernel/btf/
openat(AT_FDCWD, "/sys/kernel/btf/vmlinux", O_RDONLY) = 3
$
$ pahole -C list_head /sys/kernel/btf/vmlinux
struct list_head {
struct list_head * next; /* 0 8 */
struct list_head * prev; /* 8 8 */
/* size: 16, cachelines: 1, members: 2 */
/* last cacheline: 16 bytes */
};
$
If one wants to grab the matching vmlinux to use its DWARF info instead,
which is useful to compare the results with what we have from BTF, for
instance, its just a matter of using '-F dwarf'.
This in turn shows something that at first came as a surprise, but then
has a simple explanation:
For very common data structures, that will probably appear in all of the
DWARF CUs (Compilation Units), like 'struct list_head', using '-F dwarf'
is faster:
[acme@quaco pahole]$ perf stat -e cycles pahole -F btf -C list_head > /dev/null
Performance counter stats for 'pahole -F btf -C list_head':
45,722,518 cycles:u
0.023717300 seconds time elapsed
0.016474000 seconds user
0.007212000 seconds sys
[acme@quaco pahole]$ perf stat -e cycles pahole -F dwarf -C list_head > /dev/null
Performance counter stats for 'pahole -F dwarf -C list_head':
14,170,321 cycles:u
0.006668904 seconds time elapsed
0.005562000 seconds user
0.001109000 seconds sys
[acme@quaco pahole]$
But for something that is more specific to a subsystem, the DWARF loader
will have to process way more stuff till it gets to that struct:
$ perf stat -e cycles pahole -F dwarf -C tcp_sock > /dev/null
Performance counter stats for 'pahole -F dwarf -C tcp_sock':
31,579,795,238 cycles:u
8.332272930 seconds time elapsed
8.032124000 seconds user
0.286537000 seconds sys
$
While using the BTF loader the time should be constant, as it loads
everything from /sys/kernel/btf/vmlinux:
$ perf stat -e cycles pahole -F btf -C tcp_sock > /dev/null
Performance counter stats for 'pahole -F btf -C tcp_sock':
48,823,488 cycles:u
0.024102760 seconds time elapsed
0.012035000 seconds user
0.012046000 seconds sys
$
Above I used '-F btf' just to show that it can be used, but its not
really needed, i.e. those are equivalent:
$ strace -e openat pahole -F btf -C list_head |& grep /sys/kernel/btf/vmlinux
openat(AT_FDCWD, "/sys/kernel/btf/vmlinux", O_RDONLY) = 3
$ strace -e openat pahole -C list_head |& grep /sys/kernel/btf/vmlinux
openat(AT_FDCWD, "/sys/kernel/btf/vmlinux", O_RDONLY) = 3
$
The btf_raw__load() function that ends up being grafted into the
preexisting btf_elf routines was based on libbpf's btf_load_raw().
Acked-by: Alexei Starovoitov <ast@fb.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
3 files changed