lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Fri, 26 Apr 2024 16:43:53 -0700
From: Joe Damato <jdamato@...tly.com>
To: Jakub Kicinski <kuba@...nel.org>
Cc: linux-kernel@...r.kernel.org, netdev@...r.kernel.org, tariqt@...dia.com,
	saeedm@...dia.com, mkarsten@...terloo.ca, gal@...dia.com,
	nalramli@...tly.com, "David S. Miller" <davem@...emloft.net>,
	Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
	"open list:MELLANOX MLX4 core VPI driver" <linux-rdma@...r.kernel.org>
Subject: Re: [PATCH net-next v2 1/3] net/mlx4: Track RX allocation failures
 in a stat

On Fri, Apr 26, 2024 at 01:00:17PM -0700, Jakub Kicinski wrote:
> On Fri, 26 Apr 2024 18:33:53 +0000 Joe Damato wrote:
> > --- a/drivers/net/ethernet/mellanox/mlx4/en_port.c
> > +++ b/drivers/net/ethernet/mellanox/mlx4/en_port.c
> > @@ -151,7 +151,7 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
> >  {
> >  	struct mlx4_en_priv *priv = netdev_priv(dev);
> >  	struct mlx4_en_dev *mdev = priv->mdev;
> > -	unsigned long packets, bytes;
> > +	unsigned long packets, bytes, dropped;
> >  	int i;
> >  
> >  	if (!priv->port_up || mlx4_is_master(mdev->dev))
> > @@ -159,14 +159,17 @@ void mlx4_en_fold_software_stats(struct net_device *dev)
> >  
> >  	packets = 0;
> >  	bytes = 0;
> > +	dropped = 0;
> >  	for (i = 0; i < priv->rx_ring_num; i++) {
> >  		const struct mlx4_en_rx_ring *ring = priv->rx_ring[i];
> >  
> >  		packets += READ_ONCE(ring->packets);
> >  		bytes   += READ_ONCE(ring->bytes);
> > +		dropped += READ_ONCE(ring->dropped);
> >  	}
> >  	dev->stats.rx_packets = packets;
> >  	dev->stats.rx_bytes = bytes;
> > +	dev->stats.rx_missed_errors = dropped;
> 
> I'd drop this chunk, there's a slight but meaningful difference in
> definition of rx_missed vs alloc-fail:
> 
>  * @rx_missed_errors: Count of packets missed by the host.
>  *   Folded into the "drop" counter in `/proc/net/dev`.
>  *
>  *   Counts number of packets dropped by the device due to lack
>  *   of buffer space. This usually indicates that the host interface
>  *   is slower than the network interface, or host is not keeping up
>  *   with the receive packet rate.
> ---
>         name: rx-alloc-fail
>         doc: |
>           Number of times skb or buffer allocation failed on the Rx datapath.
>           Allocation failure may, or may not result in a packet drop, depending
>           on driver implementation and whether system recovers quickly.
> 
> tl;dr "packets dropped" vs "may, or may not result in a packet drop"
> 
> In case of mlx4 looks like the buffer refill is "async", the driver
> tries to refill the buffers to max, but if it fails the next NAPI poll
> will try again. Allocation failures are not directly tied to packet
> drops. In case of bnxt if "replacement" buffer can't be allocated -
> packet is dropped and old buffer gets returned to the ring (although 
> if I'm 100% honest bnxt may be off by a couple, too, as the OOM stat
> gets incremented on ifup pre-fill failures).

Yes, I see that now. I'll drop this patch entirely from v3 and just leave
the other two and remove alloc_fail from the queue stats patch.

Thanks for the careful review.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ