tree c521a9c061e9cc31281cc613cb9ed7b11e517286
parent 00967e84f742f87603e769529628e32076ade188
author Jesper Dangaard Brouer <brouer@redhat.com> 1555081652 +0200
committer Alexei Starovoitov <ast@kernel.org> 1555553364 -0700

bpf: cpumap use ptr_ring_consume_batched

Move the ptr_ring dequeue outside the loop that allocates SKBs and calls
the network stack, as these operations can take some time. The ptr_ring is
a communication channel between CPUs, where we want to reduce/limit any
cacheline bouncing.

Do a concentrated bulk dequeue via ptr_ring_consume_batched, to shorten the
period during which, and reduce the number of times, the remote cacheline
in ptr_ring is read.

Batch size 8 is chosen both to (1) limit the BH-disable period, and (2)
consume one cacheline on 64-bit archs (8 pointers of 8 bytes each, assuming
a 64-byte cacheline). After reducing the BH-disable section further, we can
consider changing this, while still keeping the L1 cacheline size in mind.
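
As an illustration only, the two-phase pattern can be sketched in userspace
C roughly as below. This is not the kernel code: the sketch_* names, the
ring layout and SKETCH_RING_SIZE are hypothetical, and the sketch is
single-threaded with no locking or memory barriers, unlike the real
ptr_ring. It only shows the shape of "dequeue a batch first, then do the
slow work on the local copy".

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_RING_SIZE 64	/* hypothetical capacity */
#define SKETCH_BATCH	 8	/* 8 pointers * 8 bytes = 64 bytes on 64-bit */

/* Hypothetical single-threaded ring; an empty slot is NULL. */
struct sketch_ring {
	void *slot[SKETCH_RING_SIZE];
	size_t head;	/* next slot to consume */
	size_t tail;	/* next slot to produce into */
};

/* Enqueue one non-NULL pointer; returns 0 on success, -1 if full. */
static int sketch_produce(struct sketch_ring *r, void *p)
{
	assert(r && p);
	if (r->slot[r->tail])
		return -1;
	r->slot[r->tail] = p;
	r->tail = (r->tail + 1) % SKETCH_RING_SIZE;
	return 0;
}

/* Dequeue up to n pointers in one pass; returns how many were taken. */
static size_t sketch_consume_batched(struct sketch_ring *r, void **out,
				     size_t n)
{
	size_t i;

	assert(r && out);
	for (i = 0; i < n; i++) {
		void *p = r->slot[r->head];

		if (!p)
			break;		/* ring is empty */
		r->slot[r->head] = NULL;
		r->head = (r->head + 1) % SKETCH_RING_SIZE;
		out[i] = p;
	}
	return i;
}

/*
 * Phase 1 touches the shared ring once, in a short burst.
 * Phase 2 runs the slow per-item work on the local batch only.
 */
static size_t sketch_drain_once(struct sketch_ring *r,
				void (*process)(void *))
{
	void *batch[SKETCH_BATCH];
	size_t i, n;

	n = sketch_consume_batched(r, batch, SKETCH_BATCH);
	for (i = 0; i < n; i++)
		process(batch[i]);
	return n;
}
```

With ten items queued, successive calls to sketch_drain_once() in this
sketch would handle 8, then 2, then 0 items.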

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
