[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAJ8uoz3xr=SRjvKKhxuoRDSvQ_s4DYPHYT5V0ZOZ7zGWV95=SA@mail.gmail.com>
Date: Thu, 11 Feb 2021 09:03:46 +0100
From: Magnus Karlsson <magnus.karlsson@...il.com>
To: Ciara Loftus <ciara.loftus@...el.com>
Cc: Network Development <netdev@...r.kernel.org>,
bpf <bpf@...r.kernel.org>,
"Karlsson, Magnus" <magnus.karlsson@...el.com>,
Björn Töpel <bjorn@...nel.org>,
Weqaar Janjua <weqaar.a.janjua@...el.com>,
Daniel Borkmann <daniel@...earbox.net>, song@...nel.org
Subject: Re: [PATCH bpf-next v5 0/6] AF_XDP Packet Drop Tracing
On Mon, Feb 8, 2021 at 10:39 AM Ciara Loftus <ciara.loftus@...el.com> wrote:
>
> This series introduces tracing infrastructure for AF_XDP sockets (xsk).
> A trace event 'xsk_packet_drop' is created which can be enabled by toggling
>
> /sys/kernel/debug/tracing/events/xsk/xsk_packet_drop/enable
>
> When enabled and packets or empty packet buffers are dropped in the kernel,
> traces are generated which describe the reason for the packet drop, the netdev
> and qid information of the xsk which encountered the drop, and some more
> information depending on what type of drop was encountered that will tell
> the user why the packet was dropped. This information should help a user
> troubleshoot packet drops by providing an extra level of detail which is not
> available through use of simple counters
>
> Example traces:
> xsk_packet_drop: netdev: ve3213 qid 0 reason: packet too big: len 3000 max 2048 not_used 0
> xsk_packet_drop: netdev: ve3213 qid 0 reason: invalid fill addr: addr 520192 not_used 0 not_used 0
> xsk_packet_drop: netdev: ve9266 qid 0 reason: invalid tx desc: addr 0 len 4097 options 0
>
> It was decided to use a single event 'xsk_packet_drop' to capture these three
> drop types. This means that for some of them, there is some redundant information
> in the trace marked as 'not_used'. An alternative to this would be to introduce 3
> separate event types under xsk, each with their own appropriate trace format.
> Suggestions are welcome on which approach would be better to take.
>
> The event can be monitored using perf:
> perf stat -a -e xsk:xsk_packet_drop
>
> A selftest is added for each drop type. These tests provide the conditions to
> trigger the traces and ensure that the appropriate traces are generated.
So what you have done now is to remove all the trace points that
provided no added information on top of the stats counters. The ones
you have left, provide extra information. The two XSK_TRACE_INVALID_*
points provide the reason why the descriptor was dropped and
XSK_TRACE_DROP_PKT_TOO_BIG provides the size of the packet that was
dropped.
However, the XSK_TRACE_INVALID checks could be performed from user
space and the same data could be printed out from there. A developer
could add this, or we could have a verification mode in the ring
access functions in libbpf that could be turned on by the developer if
he sees the error counters in the kernel being increased. That only
leaves us with XSK_TRACE_DROP_PKT_TOO_BIG which cannot be tested by
user space since the ingress packet is being dropped by the kernel.
But this is unfortunately not enough to warrant this trace
infrastructure on its own, so I think we should drop this patch set.
But I would really like to salvage all your tests, because they are
needed. Instead of verifying the trace, could you please verify the
stats counters that are already there? A lot of your code will be
applicable for that case too. So my suggestion is that you drop this
patch set and produce a new one that only focuses on selftests for
the XDP_STATISTICS getsockopt. You can add some of the tests you had
in your v2. What do you think?
Thank you.
> v4->v5:
> * Removed whitespace and renamed struct name in if_xdp.h as suggested by Song.
>
> v3->v4:
> * Fixed selftest commits with correct logs
> * Fixed warnings reported by W=1 build: trace argument types and print formatting
>
> v2->v3:
> * Removed some traces which traced events which were not technically drops eg.
> when the rxq is full.
> * Introduced traces for descriptor validation on RX and TX and selftests for both
>
> v1->v2:
> * Rebase on top of Björn's cleanup series.
> * Fixed packet count for trace tests which don't need EOT frame.
>
> This series applies on commit 23a2d70c7a2f28eb1a8f6bc19d68d23968cad0ce
>
> Ciara Loftus (6):
> xsk: add tracepoints for packet drops
> selftests/bpf: restructure setting the packet count
> selftests/bpf: add framework for xsk selftests
> selftests/bpf: XSK_TRACE_DROP_PKT_TOO_BIG test
> selftests/bpf: XSK_TRACE_INVALID_FILLADDR test
> selftests/bpf: XSK_TRACE_INVALID_DESC_TX test
>
> MAINTAINERS | 1 +
> include/linux/bpf_trace.h | 1 +
> include/trace/events/xsk.h | 71 +++++++
> include/uapi/linux/if_xdp.h | 6 +
> kernel/bpf/core.c | 1 +
> net/xdp/xsk.c | 7 +-
> net/xdp/xsk_buff_pool.c | 3 +
> net/xdp/xsk_queue.h | 4 +
> tools/include/uapi/linux/if_xdp.h | 6 +
> tools/testing/selftests/bpf/test_xsk.sh | 90 ++++++++-
> tools/testing/selftests/bpf/xdpxceiver.c | 206 +++++++++++++++++++--
> tools/testing/selftests/bpf/xdpxceiver.h | 9 +
> tools/testing/selftests/bpf/xsk_prereqs.sh | 3 +-
> 13 files changed, 379 insertions(+), 29 deletions(-)
> create mode 100644 include/trace/events/xsk.h
>
> --
> 2.17.1
>
Powered by blists - more mailing lists