[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <871r2x8vor.fsf@toke.dk>
Date: Tue, 30 Nov 2021 17:17:24 +0100
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Alexander Lobakin <alexandr.lobakin@...el.com>,
"David S. Miller" <davem@...emloft.net>,
Jakub Kicinski <kuba@...nel.org>
Cc: Alexander Lobakin <alexandr.lobakin@...el.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Michal Swiatkowski <michal.swiatkowski@...ux.intel.com>,
Maciej Fijalkowski <maciej.fijalkowski@...el.com>,
Jonathan Corbet <corbet@....net>,
Shay Agroskin <shayagr@...zon.com>,
Arthur Kiyanovski <akiyano@...zon.com>,
David Arinzon <darinzon@...zon.com>,
Noam Dagan <ndagan@...zon.com>,
Saeed Bishara <saeedb@...zon.com>,
Ioana Ciornei <ioana.ciornei@....com>,
Claudiu Manoil <claudiu.manoil@....com>,
Tony Nguyen <anthony.l.nguyen@...el.com>,
Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
Marcin Wojtas <mw@...ihalf.com>,
Russell King <linux@...linux.org.uk>,
Saeed Mahameed <saeedm@...dia.com>,
Leon Romanovsky <leon@...nel.org>,
Alexei Starovoitov <ast@...nel.org>,
Daniel Borkmann <daniel@...earbox.net>,
Jesper Dangaard Brouer <hawk@...nel.org>,
John Fastabend <john.fastabend@...il.com>,
Edward Cree <ecree.xilinx@...il.com>,
Martin Habets <habetsm.xilinx@...il.com>,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
Andrii Nakryiko <andrii@...nel.org>,
Martin KaFai Lau <kafai@...com>,
Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>,
KP Singh <kpsingh@...nel.org>,
Lorenzo Bianconi <lorenzo@...nel.org>,
Yajun Deng <yajun.deng@...ux.dev>,
Sergey Ryazanov <ryazanov.s.a@...il.com>,
David Ahern <dsahern@...nel.org>,
Andrei Vagin <avagin@...il.com>,
Johannes Berg <johannes.berg@...el.com>,
Vladimir Oltean <vladimir.oltean@....com>,
Cong Wang <cong.wang@...edance.com>, netdev@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-rdma@...r.kernel.org, bpf@...r.kernel.org,
virtualization@...ts.linux-foundation.org
Subject: Re: [PATCH v2 net-next 00/26] net: introduce and use generic XDP stats
Alexander Lobakin <alexandr.lobakin@...el.com> writes:
> From: Alexander Lobakin <alexandr.lobakin@...el.com>
> Date: Tue, 23 Nov 2021 17:39:29 +0100
>
> Ok, open questions:
>
> 1. Channels vs queues vs global.
>
> Jakub: no per-channel.
> David (Ahern): it's worth it to separate as Rx/Tx.
> Toke is fine with globals at the end I think?
Well, I don't like throwing data away, so in that sense I do like
per-queue stats, but it's not a very strong preference (i.e., I can live
with either)...
> My point was that for most of the systems we have 1:1 Rx:Tx
> (usually num_online_cpus()), so asking drivers separately for
> the number of RQs and then SQs would end up asking for the same
> number twice.
> But the main reason TBH was that most of the drivers store stats
> on a per-channel basis and I didn't want them to regress in
> functionality. I'm fine with reporting only netdev-wide if
> everyone are.
>
> In case if we keep per-channel: report per-channel only by request
> and cumulative globals by default to not flood the output?
... however if we do go with per-channel stats I do agree that they
shouldn't be in the default output. I guess netlink could still split
them out and iproute2 could just sum them before display?
> 2. Count all errors as "drops" vs separately.
>
> Daniel: account everything as drops, plus errors should be
> reported as exceptions for tracing sub.
> Jesper: we shouldn't mix drops and errors.
>
> My point: we shouldn't, that's why there are patches for 2 drivers
> to give errors a separate counter.
> I provided an option either to report all errors together ('errors'
> in stats structure) or to provide individual counters for each of
> them (sonamed ctrs), but personally prefer detailed errors. However,
> they might "go detailed" under trace_xdp_exception() only, sound
> fine (OTOH in RTNL stats we have both "general" errors and detailed
> error counters).
I agree it would be nice to have a separate error counter, but a single
counter is enough when combined with the tracepoints.
> 3. XDP and XSK ctrs separately or not.
>
> My PoV is that those are two quite different worlds.
> However, stats for actions on XSK really make a little sense since
> 99% of time we have xskmap redirect. So I think it'd be fine to just
> expand stats structure with xsk_{rx,tx}_{packets,bytes} and count
> the rest (actions, errors) together with XDP.
A whole set of separate counters for XSK is certainly overkill. No
strong preference as to whether they need a separate counter at all...
> Rest:
> - don't create a separate `ip` command and report under `-s`;
> - save some RTNL skb space by skipping zeroed counters.
>
> Also, regarding that I count all on the stack and then add to the
> storage once in a polling cycle -- most drivers don't do that and
> just increment the values in the storage directly, but this can be
> less performant for frequently updated stats (or it's just my
> embedded past).
> Re u64 vs u64_stats_t -- the latter is more universal and
> architecture-friendly, the former is used directly in most of the
> drivers primarily because those drivers and the corresponding HW
> are being run on 64-bit systems in the vast majority of cases, and
> Ethtools stats themselves are not so critical to guard them with
> anti-tearing. Anyways, local64_t is cheap on ARM64/x86_64 I guess?
I'm generally a fan of correctness first, so since you're touching all
the drivers anyway why I'd say go for u64_stats_t :)
-Toke
Powered by blists - more mailing lists