blob: 4cadc76a97c774dc51b7ece2032e43fefd75523e [file] [log] [blame]
Demonstrations of btrfsdist, the Linux eBPF/bcc version.
btrfsdist traces btrfs reads, writes, opens, and fsyncs, and summarizes their
latency as a power-of-2 histogram. For example:
# ./btrfsdist
Tracing btrfs operation latency... Hit Ctrl-C to end.
^C
operation = 'read'
usecs : count distribution
0 -> 1 : 15 | |
2 -> 3 : 1308 |******* |
4 -> 7 : 198 |* |
8 -> 15 : 0 | |
16 -> 31 : 11 | |
32 -> 63 : 361 |* |
64 -> 127 : 55 | |
128 -> 255 : 104 | |
256 -> 511 : 7312 |****************************************|
512 -> 1023 : 387 |** |
1024 -> 2047 : 10 | |
2048 -> 4095 : 4 | |
operation = 'write'
usecs : count distribution
0 -> 1 : 0 | |
2 -> 3 : 0 | |
4 -> 7 : 0 | |
8 -> 15 : 4 |****************************************|
operation = 'open'
usecs : count distribution
0 -> 1 : 1 |********** |
2 -> 3 : 4 |****************************************|
This output shows a bi-modal distribution for read latency, with a faster
mode of 1,308 reads that took between 2 and 3 microseconds, and a slower
mode of over 7,312 reads that took between 256 and 511 microseconds. It's
likely that the faster mode was a hit from the in-memory file system cache,
and the slower mode is a read from a storage device (disk).
This "latency" is measured from when the operation was issued from the VFS
interface to the file system, to when it completed. This spans everything:
block device I/O (disk I/O), file system CPU cycles, file system locks, run
queue latency, etc. This is a better measure of the latency suffered by
applications reading from the file system than measuring this down at the
block device interface.
Note that this only traces the common file system operations previously
listed: other file system operations (eg, inode operations including
getattr()) are not traced.
An optional interval and a count can be provided, as well as -m to show the
distributions in milliseconds. For example, two second summaries, five times:
# ./btrfsdist 2 5
Tracing btrfs operation latency... Hit Ctrl-C to end.
03:40:49:
operation = 'read'
usecs : count distribution
0 -> 1 : 15 | |
2 -> 3 : 833 |******** |
4 -> 7 : 127 |* |
8 -> 15 : 0 | |
16 -> 31 : 8 | |
32 -> 63 : 907 |******** |
64 -> 127 : 91 | |
128 -> 255 : 246 |** |
256 -> 511 : 4164 |****************************************|
512 -> 1023 : 193 |* |
1024 -> 2047 : 4 | |
2048 -> 4095 : 6 | |
4096 -> 8191 : 2 | |
03:40:51:
operation = 'read'
usecs : count distribution
0 -> 1 : 25 | |
2 -> 3 : 1491 |*************** |
4 -> 7 : 218 |** |
8 -> 15 : 0 | |
16 -> 31 : 16 | |
32 -> 63 : 1527 |*************** |
64 -> 127 : 319 |*** |
128 -> 255 : 429 |**** |
256 -> 511 : 3841 |****************************************|
512 -> 1023 : 232 |** |
1024 -> 2047 : 3 | |
2048 -> 4095 : 6 | |
4096 -> 8191 : 1 | |
8192 -> 16383 : 1 | |
03:40:53:
operation = 'read'
usecs : count distribution
0 -> 1 : 27 | |
2 -> 3 : 2999 |********************************* |
4 -> 7 : 407 |**** |
8 -> 15 : 0 | |
16 -> 31 : 46 | |
32 -> 63 : 3538 |****************************************|
64 -> 127 : 595 |****** |
128 -> 255 : 621 |******* |
256 -> 511 : 3532 |*************************************** |
512 -> 1023 : 212 |** |
1024 -> 2047 : 1 | |
2048 -> 4095 : 0 | |
4096 -> 8191 : 0 | |
8192 -> 16383 : 0 | |
16384 -> 32767 : 1 | |
03:40:55:
operation = 'read'
usecs : count distribution
0 -> 1 : 221 | |
2 -> 3 : 12580 |****************************************|
4 -> 7 : 1366 |**** |
8 -> 15 : 0 | |
16 -> 31 : 289 | |
32 -> 63 : 10782 |********************************** |
64 -> 127 : 1232 |*** |
128 -> 255 : 807 |** |
256 -> 511 : 2299 |******* |
512 -> 1023 : 135 | |
1024 -> 2047 : 5 | |
2048 -> 4095 : 2 | |
03:40:57:
operation = 'read'
usecs : count distribution
0 -> 1 : 73951 |************************* |
2 -> 3 : 117639 |****************************************|
4 -> 7 : 7943 |** |
8 -> 15 : 1841 | |
16 -> 31 : 1143 | |
32 -> 63 : 5006 |* |
64 -> 127 : 483 | |
128 -> 255 : 242 | |
256 -> 511 : 253 | |
512 -> 1023 : 84 | |
1024 -> 2047 : 23 | |
This shows a read workload that begins bimodal, and eventually the second
mode disappears. The reason for this is that the workload cached during
tracing. Note that the rate also increased, with over 200k reads for the
final two second sample.
USAGE message:
# ./btrfsdist -h
usage: btrfsdist [-h] [-T] [-m] [-p PID] [interval] [count]
Summarize btrfs operation latency
positional arguments:
interval output interval, in seconds
count number of outputs
optional arguments:
-h, --help show this help message and exit
-T, --notimestamp don't include timestamp on interval output
-m, --milliseconds output in milliseconds
-p PID, --pid PID trace this PID only
examples:
./btrfsdist # show operation latency as a histogram
./btrfsdist -p 181 # trace PID 181 only
./btrfsdist 1 10 # print 1 second summaries, 10 times
./btrfsdist -m 5 # 5s summaries, milliseconds