[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <1710154125.7529383-1-xuanzhuo@linux.alibaba.com>
Date: Mon, 11 Mar 2024 18:48:45 +0800
From: Xuan Zhuo <xuanzhuo@...ux.alibaba.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org,
"Michael S. Tsirkin" <mst@...hat.com>,
Jason Wang <jasowang@...hat.com>,
"David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Paolo Abeni <pabeni@...hat.com>,
virtualization@...ts.linux.dev,
Willem de Bruijn <willemdebruijn.kernel@...il.com>,
Tariq Toukan <tariqt@...dia.com>,
Michael Chan <michael.chan@...adcom.com>,
Jesse Brandeburg <jesse.brandeburg@...el.com>,
Alexander Lobakin <aleksander.lobakin@...el.com>,
Shannon Nelson <shannon.nelson@....com>
Subject: Re: [PATCH net-next v3 3/6] virtio_net: support device stats
On Thu, 7 Mar 2024 08:50:21 -0800, Jakub Kicinski <kuba@...nel.org> wrote:
> CC: Willem and some driver folks for more input, context: extending
> https://lore.kernel.org/all/20240306195509.1502746-1-kuba@kernel.org/
> to cover virtio stats.
>
> On Tue, 27 Feb 2024 16:03:00 +0800 Xuan Zhuo wrote:
> > +static const struct virtnet_stat_desc virtnet_stats_rx_basic_desc[] = {
> > + VIRTNET_STATS_DESC(rx, basic, packets),
> > + VIRTNET_STATS_DESC(rx, basic, bytes),
>
> Covered.
About "packets" and "bytes", here is coming from the hw device.
Actually the driver also count "packets" and "bytes" in SW.
So there are HW and SW versions. Do we need to distinguish them?
>
> > + VIRTNET_STATS_DESC(rx, basic, notifications),
> > + VIRTNET_STATS_DESC(rx, basic, interrupts),
>
> I haven't seen HW devices count interrupts coming from a specific
> queue (there's usually a lot more queues than IRQs these days),
> let's keep these in ethtool -S for now, unless someone has a HW use
> case.
OK.
>
> > + VIRTNET_STATS_DESC(rx, basic, drops),
> > + VIRTNET_STATS_DESC(rx, basic, drop_overruns),
>
> These are important, but we need to make sure we have a good definition
> for vendors to follow...
>
> drops I'd define as "sum of all packets which came into the device, but
> never left it, including but not limited to: packets dropped due to
> lack of buffer space, processing errors, explicitly set policies and
> packet filters."
> Call it hw-rx-drops ?
I agree.
>
> overruns is a bit harder to precisely define. I was thinking of
> something more broad, like: "packets dropped due to transient lack of
> resources, such as buffer space, host descriptors etc."
>
> For context why not just go with virtio spec definition of "no
> descriptors" - for HW devices, what exact point in the pipeline drops
> depends on how back pressure is configured/implemented, and fetching
> descriptors is high latency, so differentiating between "PCIe is slow"
> and "host didn't post descriptors" is hard in practice.
> Call it hw-rx-drop-overruns ?
OK.
>
> > +static const struct virtnet_stat_desc virtnet_stats_tx_basic_desc[] = {
> > + VIRTNET_STATS_DESC(tx, basic, packets),
> > + VIRTNET_STATS_DESC(tx, basic, bytes),
> > +
> > + VIRTNET_STATS_DESC(tx, basic, notifications),
> > + VIRTNET_STATS_DESC(tx, basic, interrupts),
> > +
> > + VIRTNET_STATS_DESC(tx, basic, drops),
>
> These 5 same as rx.
>
> > + VIRTNET_STATS_DESC(tx, basic, drop_malformed),
>
> These I'd call hw-tx-drop-errors, "packets dropped because they were
> invalid or malformed"?
OK.
>
> > +static const struct virtnet_stat_desc virtnet_stats_rx_csum_desc[] = {
> > + VIRTNET_STATS_DESC(rx, csum, csum_valid),
>
> I think in kernel parlance that would translate to CHECKSUM_UNNECESSARY?
> So let's call it rx-csum-unnecessary ?
> I'd skip the hw- prefix for this one, it doesn't matter to the user if
> the HW or SW counted it.
OK.
>
> > + VIRTNET_STATS_DESC(rx, csum, needs_csum),
>
> Hm, I think this is a bit software/virt device specific, presumably
> rx-csum-partial for the kernel, up to you whether to make it ethtool -S
> or netlink.
YES. This is specific for virt device.
I will make it ethtool -S. So somebody has other advice.
>
> > + VIRTNET_STATS_DESC(rx, csum, csum_none),
> > + VIRTNET_STATS_DESC(rx, csum, csum_bad),
>
> These two make sense as is in netlink, should be fairly commonly
> reported by devices. Maybe add a note in "bad" that packets with
> bad csum are not discarded, but still delivered to the stack.
OK.
>
> > +static const struct virtnet_stat_desc virtnet_stats_tx_csum_desc[] = {
> > + VIRTNET_STATS_DESC(tx, csum, needs_csum),
> > + VIRTNET_STATS_DESC(tx, csum, csum_none),
>
> tx- version of what names we pick for rx-, netlink seems appropriate.
>
> > +static const struct virtnet_stat_desc virtnet_stats_rx_gso_desc[] = {
> > + VIRTNET_STATS_DESC(rx, gso, gso_packets),
> > + VIRTNET_STATS_DESC(rx, gso, gso_bytes),
>
> I used the term "GSO" in conversations about Rx and it often confuses
> people. Let's use "GRO", so hw-gro-packets, and hw-gro-bytes ?
> Or maybe coalesce? "hw-rx-coalesce" ? That's quite a bit longer..
GRO may also confuse people.
I like hw-rx-coalesce-packets, hw-rx-coalesce-bytes.
>
> Ah, and please mention in the doc that these counters "do not cover LRO
> i.e. any coalescing implementation which doesn't follow GRO rules".
OK.
>
> > + VIRTNET_STATS_DESC(rx, gso, gso_packets_coalesced),
>
> hw-gro-wire-packets ?
> No strong preference on the naming, but I find that saying -wire
> makes it 100% clear to everyone what the meaning is.
ok.
>
> > + VIRTNET_STATS_DESC(rx, gso, gso_bytes_coalesced),
>
> The documentation in the virtio spec seems to be identical
> to the one for gso_packets, which gotta be unintentional?
One for num, one for bytes.
> I'm guessing this is hw-gro-wire-bytes? I.e. headers counted
> multiple times?
This is used to count the bytes of the small packets before coalescing.
> > +static const struct virtnet_stat_desc virtnet_stats_tx_gso_desc[] = {
> > + VIRTNET_STATS_DESC(tx, gso, gso_packets),
> > + VIRTNET_STATS_DESC(tx, gso, gso_bytes),
> > + VIRTNET_STATS_DESC(tx, gso, gso_segments),
> > + VIRTNET_STATS_DESC(tx, gso, gso_segments_bytes),
>
> these 4 make sense as mirror of the Rx
>
> > + VIRTNET_STATS_DESC(tx, gso, gso_packets_noseg),
> > + VIRTNET_STATS_DESC(tx, gso, gso_bytes_noseg),
>
> Not sure what these are :) unless someone knows what it is and that
> HW devices report it, let's keep them in ethtool -S ?
Just for the virtio. Let's keep them in ethtool -S.
>
> > +static const struct virtnet_stat_desc virtnet_stats_rx_speed_desc[] = {
> > + VIRTNET_STATS_DESC(rx, speed, packets_allowance_exceeded),
>
> hw-rx-drop-ratelimits ?
> "Allowance exceeded" is a bit of a mouthful to me, perhaps others
> disagree. The description from the virtio spec is quite good.
OK.
>
> > + VIRTNET_STATS_DESC(rx, speed, bytes_allowance_exceeded),
>
> No strong preference whether to expose this as a standard stat or
> ethtool -S, we don't generally keep byte counters for drops, so
> this would be special.
OK.
>
> > +static const struct virtnet_stat_desc virtnet_stats_tx_speed_desc[] = {
> > + VIRTNET_STATS_DESC(tx, speed, packets_allowance_exceeded),
> > + VIRTNET_STATS_DESC(tx, speed, packets_allowance_exceeded),
>
> same as rx
Thanks.
Powered by blists - more mailing lists