[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <Ziw8OSchaOaph1i8@LQ3V64L9R2>
Date: Fri, 26 Apr 2024 16:43:53 -0700
From: Joe Damato <jdamato@...tly.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: linux-kernel@...r.kernel.org, netdev@...r.kernel.org, tariqt@...dia.com,
saeedm@...dia.com, mkarsten@...terloo.ca, gal@...dia.com,
nalramli@...tly.com, "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
"open list:MELLANOX MLX4 core VPI driver" <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures
in a stat
On Fri, Apr 26, 2024 at 01:00:17PM -0700, Jakub Kicinski wrote:
> On Fri, 26 Apr 2024 18:33:53 +0000 Joe Damato wrote:
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
> > @@ -151,7 +151,7 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
> > {
> > struct mlx4_en_priv *priv = netdev_priv(dev);
> > struct mlx4_en_dev *mdev = priv->mdev;
> > - unsigned long packets, bytes;
> > + unsigned long packets, bytes, dropped;
> > int i;
> >
> > if (!priv->port_up || mlx4_is_master(mdev->dev))
> > @@ -159,14 +159,17 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
> >
> > packets = 0;
> > bytes = 0;
> > + dropped = 0;
> > for (i = 0; i < priv->rx_ring_num; i++) {
> > const struct mlx4_en_rx_ring *ring = priv->rx_ring[i];
> >
> > packets += READ_ONCE(ring->packets);
> > bytes += READ_ONCE(ring->bytes);
> > + dropped += READ_ONCE(ring->dropped);
> > }
> > dev->stats.rx_packets = packets;
> > dev->stats.rx_bytes = bytes;
> > + dev->stats.rx_missed_errors = dropped;
>
> I'd drop this chunk, there's a slight but meaningful difference in
> definition of rx_missed vs alloc-fail:
>
> * @rx_missed_errors: Count of packets missed by the host.
> * Folded into the "drop" counter in `/proc/net/dev`.
> *
> * Counts number of packets dropped by the device due to lack
> * of buffer space. This usually indicates that the host interface
> * is slower than the network interface, or host is not keeping up
> * with the receive packet rate.
> ---
> name: rx-alloc-fail
> doc: |
> Number of times skb or buffer allocation failed on the Rx datapath.
> Allocation failure may, or may not result in a packet drop, depending
> on driver implementation and whether system recovers quickly.
>
> tl;dr "packets dropped" vs "may, or may not result in a packet drop"
>
> In case of mlx4 looks like the buffer refill is "async", the driver
> tries to refill the buffers to max, but if it fails the next NAPI poll
> will try again. Allocation failures are not directly tied to packet
> drops. In case of bnxt if "replacement" buffer can't be allocated -
> packet is dropped and old buffer gets returned to the ring (although
> if I'm 100% honest bnxt may be off by a couple, too, as the OOM stat
> gets incremented on ifup pre-fill failures).
Yes, I see that now. I'll drop this patch entirely from v3 and just leave
the other two and remove alloc_fail from the queue stats patch.
Thanks for the careful review.
Powered by blists - more mailing lists