[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <0cd88199e7f74230bef42e9a469c256f@intel.com>
Date: Thu, 11 Feb 2021 08:18:33 +0000
From: "Loftus, Ciara" <ciara.loftus@...el.com>
To: Magnus Karlsson <magnus.karlsson@...il.com>
CC: Network Development <netdev@...r.kernel.org>,
bpf <bpf@...r.kernel.org>,
"Karlsson, Magnus" <magnus.karlsson@...el.com>,
Björn Töpel <bjorn@...nel.org>,
"Janjua, Weqaar A" <weqaar.a.janjua@...el.com>,
Daniel Borkmann <daniel@...earbox.net>,
"song@...nel.org" <song@...nel.org>
Subject: RE: [PATCH bpf-next v5 0/6] AF_XDP Packet Drop Tracing
> >
> > This series introduces tracing infrastructure for AF_XDP sockets (xsk).
> > A trace event 'xsk_packet_drop' is created which can be enabled by
> toggling
> >
> > /sys/kernel/debug/tracing/events/xsk/xsk_packet_drop/enable
> >
> > When enabled and packets or empty packet buffers are dropped in the
> kernel,
> > traces are generated which describe the reason for the packet drop, the
> netdev
> > and qid information of the xsk which encountered the drop, and some
> more
> > information depending on what type of drop was encountered that will tell
> > the user why the packet was dropped. This information should help a user
> > troubleshoot packet drops by providing an extra level of detail which is not
> > available through use of simple counters
> >
> > Example traces:
> > xsk_packet_drop: netdev: ve3213 qid 0 reason: packet too big: len 3000
> max 2048 not_used 0
> > xsk_packet_drop: netdev: ve3213 qid 0 reason: invalid fill addr: addr 520192
> not_used 0 not_used 0
> > xsk_packet_drop: netdev: ve9266 qid 0 reason: invalid tx desc: addr 0 len
> 4097 options 0
> >
> > It was decided to use a single event 'xsk_packet_drop' to capture these
> three
> > drop types. This means that for some of them, there is some redundant
> information
> > in the trace marked as 'not_used'. An alternative to this would be to
> introduce 3
> > separate event types under xsk, each with their own appropriate trace
> format.
> > Suggestions are welcome on which approach would be better to take.
> >
> > The event can be monitored using perf:
> > perf stat -a -e xsk:xsk_packet_drop
> >
> > A selftest is added for each drop type. These tests provide the conditions
> to
> > trigger the traces and ensure that the appropriate traces are generated.
>
> So what you have done now is to remove all the trace points that
> provided no added information on top of the stats counters. The ones
> you have left, provide extra information. The two XSK_TRACE_INVALID_*
> points provide the reason why the descriptor was dropped and
> XSK_TRACE_DROP_PKT_TOO_BIG provides the size of the packet that was
> dropped.
>
> However, the XSK_TRACE_INVALID checks could be performed from user
> space and the same data could be printed out from there. A developer
> could add this, or we could have a verification mode in the ring
> access functions in libbpf that could be turned on by the developer if
> he sees the error counters in the kernel being increased. That only
> leaves us with XSK_TRACE_DROP_PKT_TOO_BIG which cannot be tested by
> user space since the ingress packet is being dropped by the kernel.
> But this is unfortunately not enough to warrant this trace
> infrastructure on its own, so I think we should drop this patch set.
>
> But I would really like to salvage all your tests, because they are
> needed. Instead of verifying the trace, could you please verify the
> stats counters that are already there? A lot of your code will be
> applicable for that case too. So my suggestion is that you drop this
> patch set and produce a new one that only focuses on selftests for
> the XDP_STATISTICS getsockopt. You can add some of the tests you had
> in your v2. What do you think?
Thanks for your feedback Magnus. I see your point. If it's possible to do the invalid desc checking in userspace then this probably doesn't belong in the kernel, and keeping the entire tracing infrastructure just for one trace would be too heavy.
I like your suggestion re stats selftests. I'll submit something along those lines.
Thanks,
Ciara
>
> Thank you.
>
> > v4->v5:
> > * Removed whitespace and renamed struct name in if_xdp.h as suggested
> by Song.
> >
> > v3->v4:
> > * Fixed selftest commits with correct logs
> > * Fixed warnings reported by W=1 build: trace argument types and print
> formatting
> >
> > v2->v3:
> > * Removed some traces which traced events which were not technically
> drops eg.
> > when the rxq is full.
> > * Introduced traces for descriptor validation on RX and TX and selftests for
> both
> >
> > v1->v2:
> > * Rebase on top of Björn's cleanup series.
> > * Fixed packet count for trace tests which don't need EOT frame.
> >
> > This series applies on commit
> 23a2d70c7a2f28eb1a8f6bc19d68d23968cad0ce
> >
> > Ciara Loftus (6):
> > xsk: add tracepoints for packet drops
> > selftests/bpf: restructure setting the packet count
> > selftests/bpf: add framework for xsk selftests
> > selftests/bpf: XSK_TRACE_DROP_PKT_TOO_BIG test
> > selftests/bpf: XSK_TRACE_INVALID_FILLADDR test
> > selftests/bpf: XSK_TRACE_INVALID_DESC_TX test
> >
> > MAINTAINERS | 1 +
> > include/linux/bpf_trace.h | 1 +
> > include/trace/events/xsk.h | 71 +++++++
> > include/uapi/linux/if_xdp.h | 6 +
> > kernel/bpf/core.c | 1 +
> > net/xdp/xsk.c | 7 +-
> > net/xdp/xsk_buff_pool.c | 3 +
> > net/xdp/xsk_queue.h | 4 +
> > tools/include/uapi/linux/if_xdp.h | 6 +
> > tools/testing/selftests/bpf/test_xsk.sh | 90 ++++++++-
> > tools/testing/selftests/bpf/xdpxceiver.c | 206 +++++++++++++++++++--
> > tools/testing/selftests/bpf/xdpxceiver.h | 9 +
> > tools/testing/selftests/bpf/xsk_prereqs.sh | 3 +-
> > 13 files changed, 379 insertions(+), 29 deletions(-)
> > create mode 100644 include/trace/events/xsk.h
> >
> > --
> > 2.17.1
> >
Powered by blists - more mailing lists