Message-ID: <CACAyw9998vkRPX3vZxf8cC6ivVfTFDJPY11Cz08ZUSTLf_s7=A@mail.gmail.com>
Date: Wed, 24 Nov 2021 16:34:08 +0000
From: Lorenz Bauer <lmb@...udflare.com>
To: Daniel Borkmann <daniel@...earbox.net>
Cc: Alexander Lobakin <alexandr.lobakin@...el.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jonathan Corbet <corbet@....net>,
Shay Agroskin <shayagr@...zon.com>,
Arthur Kiyanovski <akiyano@...zon.com>,
David Arinzon <darinzon@...zon.com>,
Noam Dagan <ndagan@...zon.com>,
Saeed Bishara <saeedb@...zon.com>,
Ioana Ciornei <ioana.ciornei@....com>,
Claudiu Manoil <claudiu.manoil@....com>,
Tony Nguyen <anthony.l.nguyen@...el.com>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
Marcin Wojtas <mw@...ihalf.com>,
Russell King <linux@...linux.org.uk>,
Saeed Mahameed <saeedm@...dia.com>,
Leon Romanovsky <leon@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Toke Høiland-Jørgensen <toke@...hat.com>,
John Fastabend <john.fastabend@...il.com>,
Edward Cree <ecree.xilinx@...il.com>,
Martin Habets <habetsm.xilinx@...il.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
KP Singh <kpsingh@...nel.org>,
Lorenzo Bianconi <lorenzo@...nel.org>,
Yajun Deng <yajun.deng@...ux.dev>,
Sergey Ryazanov <ryazanov.s.a@...il.com>,
David Ahern <dsahern@...nel.org>,
Andrei Vagin <avagin@...il.com>,
Johannes Berg <johannes.berg@...el.com>,
Vladimir Oltean <vladimir.oltean@....com>,
Cong Wang <cong.wang@...edance.com>,
Networking <netdev@...r.kernel.org>, linux-doc@...r.kernel.org,
LKML <linux-kernel@...r.kernel.org>, linux-rdma@...r.kernel.org,
bpf <bpf@...r.kernel.org>,
virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH v2 net-next 21/26] ice: add XDP and XSK generic
per-channel statistics
Daniel asked me to share my opinion, as Cloudflare has an XDP load
balancer as well.
On Wed, 24 Nov 2021 at 00:53, Daniel Borkmann <daniel@...earbox.net> wrote:
> I'm just taking our XDP L4LB in Cilium as an example: there we already count errors and
> export them via per-cpu map that eventually lead to XDP_DROP cases including the /reason/
> which caused the XDP_DROP (e.g. Prometheus can then scrape these insights from all the
> nodes in the cluster). Given the different action codes are very often application specific,
> there's not much debugging that you can do when /only/ looking at `ip link xdpstats` to
> gather insight on *why* some of these actions were triggered (e.g. fib lookup failure, etc).
Agreed. For our purpose we often want to know whether a specific
program has been invoked. Per-channel or per-device stats don't help
us much since we have a chain of programs (not using libxdp though).
My colleague Arthur has written xdpcap [1], which gives per-action,
per-program counters. This way we can correlate an action with a
packet and a program.
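
To make the "per-action, per-program" idea concrete, here is a rough
userspace C model of such a counter table. It is only a sketch and not
how xdpcap is actually implemented: NUM_PROGS, NUM_CPUS, the struct and
the function names are all made up for illustration. The only detail
taken from the kernel is the numeric order of the XDP action codes. The
per-CPU slots imitate what a per-CPU BPF map would give you, so the
reader sums over CPUs when it wants a total.

```c
#include <stdint.h>

/* Same numeric order as the kernel's XDP action codes (XDP_ABORTED = 0 ...). */
enum xdp_action_model {
    ACT_ABORTED = 0,
    ACT_DROP,
    ACT_PASS,
    ACT_TX,
    ACT_REDIRECT,
    ACT_MAX
};

#define NUM_PROGS 3 /* hypothetical: number of programs in the chain */
#define NUM_CPUS  4 /* hypothetical: number of CPUs */

/* One counter per (cpu, program, action), so updates need no locking. */
struct action_counters {
    uint64_t packets[NUM_CPUS][NUM_PROGS][ACT_MAX];
};

/* Record that program `prog` returned `action` on CPU `cpu`.
 * Out-of-range indices are ignored rather than counted. */
void count_action(struct action_counters *c, unsigned cpu,
                  unsigned prog, unsigned action)
{
    if (cpu >= NUM_CPUS || prog >= NUM_PROGS || action >= ACT_MAX)
        return;
    c->packets[cpu][prog][action]++;
}

/* Sum the per-CPU slots for one (program, action) pair. */
uint64_t total_for(const struct action_counters *c,
                   unsigned prog, unsigned action)
{
    uint64_t sum = 0;

    if (prog >= NUM_PROGS || action >= ACT_MAX)
        return 0;
    for (unsigned cpu = 0; cpu < NUM_CPUS; cpu++)
        sum += c->packets[cpu][prog][action];
    return sum;
}
```

With a zero-initialised struct action_counters, each program in the
chain would call count_action() with its own index and verdict, and a
scraper would call total_for(&c, prog, ACT_DROP) to see which program
is responsible for the drops. Per-device stats cannot answer that
question because they fold the whole chain into one number.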
> If really of interest, then maybe libxdp could have such per-action counters as opt-in in
> its call chain..
We could also make it part of BPF_ENABLE_STATS, although that is kind of
coarse-grained.
> In the case of ice_run_xdp() today, we already bump total_rx_bytes/total_rx_pkts under
> XDP and update ice_update_rx_ring_stats(). I do see the case for XDP_TX and XDP_REDIRECT
> where we run into driver-specific errors that are /outside of the reach/ of the BPF prog.
> For example, we've been running into errors from XDP_TX in ice_xmit_xdp_ring() in the
> past during testing, and were able to pinpoint the location as xdp_ring->tx_stats.tx_busy
> was increasing. These things are useful and would make sense to standardize for XDP context.
I'd like to see more tracepoints like trace_xdp_exception, personally.
We can use things like bpftrace for exploration and ebpf_exporter [2]
to generate alerts much more easily than something wired into
iproute2.
Best,
Lorenz
1: https://github.com/cloudflare/xdpcap
2: https://github.com/cloudflare/ebpf_exporter
--
Lorenz Bauer | Systems Engineer
6th Floor, County Hall/The Riverside Building, SE1 7PB, UK
www.cloudflare.com