[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20211130155612.594688-1-alexandr.lobakin@intel.com>
Date: Tue, 30 Nov 2021 16:56:12 +0100
From: Alexander Lobakin <alexandr.lobakin@...el.com>
To: "David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
Cc: Alexander Lobakin <alexandr.lobakin@...el.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jonathan Corbet <corbet@....net>,
Shay Agroskin <shayagr@...zon.com>,
Arthur Kiyanovski <akiyano@...zon.com>,
David Arinzon <darinzon@...zon.com>,
Noam Dagan <ndagan@...zon.com>,
Saeed Bishara <saeedb@...zon.com>,
Ioana Ciornei <ioana.ciornei@....com>,
Claudiu Manoil <claudiu.manoil@....com>,
Tony Nguyen <anthony.l.nguyen@...el.com>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
Marcin Wojtas <mw@...ihalf.com>,
Russell King <linux@...linux.org.uk>,
Saeed Mahameed <saeedm@...dia.com>,
Leon Romanovsky <leon@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
Toke Høiland-Jørgensen <toke@...hat.com>,
John Fastabend <john.fastabend@...il.com>,
Edward Cree <ecree.xilinx@...il.com>,
Martin Habets <habetsm.xilinx@...il.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
KP Singh <kpsingh@...nel.org>,
Lorenzo Bianconi <lorenzo@...nel.org>,
Yajun Deng <yajun.deng@...ux.dev>,
Sergey Ryazanov <ryazanov.s.a@...il.com>,
David Ahern <dsahern@...nel.org>,
Andrei Vagin <avagin@...il.com>,
Johannes Berg <johannes.berg@...el.com>,
Vladimir Oltean <vladimir.oltean@....com>,
Cong Wang <cong.wang@...edance.com>, netdev@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-rdma@...r.kernel.org, bpf@...r.kernel.org,
virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH v2 net-next 00/26] net: introduce and use generic XDP stats
From: Alexander Lobakin <alexandr.lobakin@...el.com>
Date: Tue, 23 Nov 2021 17:39:29 +0100
Ok, open questions:
1. Channels vs queues vs global.
Jakub: no per-channel.
David (Ahern): it's worth it to separate as Rx/Tx.
Toke is fine with globals at the end I think?
My point was that for most of the systems we have 1:1 Rx:Tx
(usually num_online_cpus()), so asking drivers separately for
the number of RQs and then SQs would end up asking for the same
number twice.
But the main reason TBH was that most of the drivers store stats
on a per-channel basis and I didn't want them to regress in
functionality. I'm fine with reporting only netdev-wide if
everyone are.
In case if we keep per-channel: report per-channel only by request
and cumulative globals by default to not flood the output?
2. Count all errors as "drops" vs separately.
Daniel: account everything as drops, plus errors should be
reported as exceptions for tracing sub.
Jesper: we shouldn't mix drops and errors.
My point: we shouldn't, that's why there are patches for 2 drivers
to give errors a separate counter.
I provided an option either to report all errors together ('errors'
in stats structure) or to provide individual counters for each of
them (sonamed ctrs), but personally prefer detailed errors. However,
they might "go detailed" under trace_xdp_exception() only, sound
fine (OTOH in RTNL stats we have both "general" errors and detailed
error counters).
3. XDP and XSK ctrs separately or not.
My PoV is that those are two quite different worlds.
However, stats for actions on XSK really make a little sense since
99% of time we have xskmap redirect. So I think it'd be fine to just
expand stats structure with xsk_{rx,tx}_{packets,bytes} and count
the rest (actions, errors) together with XDP.
Rest:
- don't create a separate `ip` command and report under `-s`;
- save some RTNL skb space by skipping zeroed counters.
Also, regarding that I count all on the stack and then add to the
storage once in a polling cycle -- most drivers don't do that and
just increment the values in the storage directly, but this can be
less performant for frequently updated stats (or it's just my
embedded past).
Re u64 vs u64_stats_t -- the latter is more universal and
architecture-friendly, the former is used directly in most of the
drivers primarily because those drivers and the corresponding HW
are being run on 64-bit systems in the vast majority of cases, and
Ethtools stats themselves are not so critical to guard them with
anti-tearing. Anyways, local64_t is cheap on ARM64/x86_64 I guess?
Thanks,
Al
Powered by blists - more mailing lists