lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <c6abaa9f-cd3e-4259-bed6-5e795ff58ecd@kernel.org>
Date: Thu, 24 Apr 2025 17:24:51 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>
Cc: netdev@...r.kernel.org, bpf@...r.kernel.org, tom@...bertland.com,
 Eric Dumazet <eric.dumazet@...il.com>, "David S. Miller"
 <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>,
 Toke Høiland-Jørgensen <toke@...e.dk>,
 dsahern@...nel.org, makita.toshiaki@....ntt.co.jp,
 kernel-team@...udflare.com, phil@....cc
Subject: Re: [PATCH net-next V6 2/2] veth: apply qdisc backpressure on full
 ptr_ring to reduce TX drops



On 24/04/2025 16.23, Jakub Kicinski wrote:
> On Thu, 24 Apr 2025 14:56:49 +0200 Jesper Dangaard Brouer wrote:
>> +	case NETDEV_TX_BUSY:
>> +		/* If a qdisc is attached to our virtual device, returning
>> +		 * NETDEV_TX_BUSY is allowed.
>> +		 */
>> +		txq = netdev_get_tx_queue(dev, rxq);
>> +
>> +		if (qdisc_txq_has_no_queue(txq)) {
>> +			dev_kfree_skb_any(skb);
>> +			goto drop;
>> +		}
>> +		netif_tx_stop_queue(txq);
>> +		/* Restore Eth hdr pulled by dev_forward_skb/eth_type_trans */
>> +		__skb_push(skb, ETH_HLEN);
>> +		/* Depend on prior success packets started NAPI consumer via
>> +		 * __veth_xdp_flush(). Cancel TXQ stop if consumer stopped,
>> +		 * paired with empty check in veth_poll().
>> +		 */
>> +		if (unlikely(__ptr_ring_empty(&rq->xdp_ring)))
>> +			netif_tx_wake_queue(txq);
> 
> Looks like I wrote a reply to v5 but didn't hit send. But I may have
> set v5 to Changes Requested because of it :S Here is my comment:
> 
>   I think this is missing a memory barrier. When drivers do this dance
>   there's usually a barrier between stop and recheck, to make sure the
>   stop is visible before we check. And vice versa veth_xdp_rcv() needs
>   to make sure other side sees the "empty" indication before it checks
>   if the queue is stopped.

The call netif_tx_stop_queue(txq); already contains a memory barrier
smp_mb__before_atomic() plus an atomic set_bit operation.  That should
be sufficient.

And the other side veth_poll(), have a smp_store_mb() before reading
ptr_ring.

--Jesper

p.s.
I actually had an alternative implementation of this, that only calls
stop when it is needed.  See below, it kind of looks prettier, but it
adds an extra memory barrier in the likely path. (And I'm not sure if 
read memory barrier is strong enough).


diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 6ef24e261574..5ab352ccee38 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -390,15 +390,16 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, 
struct net_device *dev)
                         dev_kfree_skb_any(skb);
                         goto drop;
                 }
-               netif_tx_stop_queue(txq);
                 /* Restore Eth hdr pulled by 
dev_forward_skb/eth_type_trans */
                 __skb_push(skb, ETH_HLEN);
                 /* Depend on prior success packets started NAPI 
consumer via
-                * __veth_xdp_flush(). Cancel TXQ stop if consumer stopped,
-                * paired with empty check in veth_poll().
+                * __veth_xdp_flush().  Make sure consumer is still 
running and
+                * didn't completely queue, before stopping TXQ. Paired with
+                * queue check in veth_poll().
                  */
-               if (unlikely(__ptr_ring_empty(&rq->xdp_ring)))
-                       netif_tx_wake_queue(txq);
+               smp_rmb();
+               if (likely(!__ptr_ring_empty(&rq->xdp_ring)))
+                       netif_tx_stop_queue(txq);
                 break;
         case NET_RX_DROP: /* same as NET_XMIT_DROP */


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ