Message-ID: <a673f379-9b0c-4d02-8884-23c62930513a@arista.com>
Date: Fri, 31 Oct 2025 10:43:36 -0700
From: Christoph Schwarz <cschwarz@...sta.com>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Neal Cardwell <ncardwell@...gle.com>, netdev@...r.kernel.org
Subject: Re: TCP sender stuck despite receiving ACKs from the peer
On 10/31/25 02:06, Eric Dumazet wrote:
> On Thu, Oct 23, 2025 at 10:57 PM Eric Dumazet <edumazet@...gle.com> wrote:
>>
[...]
>> Could you try the following patch ?
>>
>> Thanks again !
>>
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 378c2d010faf251ffd874ebf0cc3dd6968eee447..8efda845611129920a9ae21d5e9dd05ffab36103
>> 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -4796,6 +4796,8 @@ int __dev_queue_xmit(struct sk_buff *skb, struct
>> net_device *sb_dev)
>> * to -1 or to their cpu id, but not to our id.
>> */
>> if (READ_ONCE(txq->xmit_lock_owner) != cpu) {
>> + struct sk_buff *orig;
>> +
>> if (dev_xmit_recursion())
>> goto recursion_alert;
>>
>> @@ -4805,6 +4807,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct
>> net_device *sb_dev)
>>
>> HARD_TX_LOCK(dev, txq, cpu);
>>
>> + orig = skb;
>> if (!netif_xmit_stopped(txq)) {
>> dev_xmit_recursion_inc();
>> skb = dev_hard_start_xmit(skb, dev, txq, &rc);
>> @@ -4817,6 +4820,11 @@ int __dev_queue_xmit(struct sk_buff *skb,
>> struct net_device *sb_dev)
>> HARD_TX_UNLOCK(dev, txq);
>> net_crit_ratelimited("Virtual device %s asks
>> to queue packet!\n",
>> dev->name);
>> + if (skb != orig) {
>> + /* If at least one packet was sent, we
>> must return NETDEV_TX_OK */
>> + rc = NETDEV_TX_OK;
>> + goto unlock;
>> + }
>> } else {
>> /* Recursion is detected! It is possible,
>> * unfortunately
>> @@ -4828,6 +4836,7 @@ int __dev_queue_xmit(struct sk_buff *skb, struct
>> net_device *sb_dev)
>> }
>>
>> rc = -ENETDOWN;
>> +unlock:
>> rcu_read_unlock_bh();
>>
>> dev_core_stats_tx_dropped_inc(dev);
>
> Hi Christoph
>
> Any progress on your side ?
>
> Thanks.
Hi Eric,

Thanks for your help. This is much appreciated.

We tried your patch, but unfortunately it did not help. We have some
ideas why that is. Here is what we figured out:

It is very likely that device stacking, as described in my previous
mail, is a factor.

49: vlan0@...ent: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 02:1c:a7:00:00:01 brd ff:ff:ff:ff:ff:ff
3: parent: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 10000 qdisc prio state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether xx:xx:xx:xx:xx:xx brd ff:ff:ff:ff:ff:ff

The "parent" device is driven by a proprietary driver for a switch
ASIC. The driver implements TX flow control, and the TX queue is
stopped frequently. The device does not have TSO capabilities. We could
look into adding that, but as of now it is not an option.

The "vlan0" device stacked on top is Linux kernel code
(net/8021q/vlan_dev.c) and has the IP address to which the HTTP server
binds. However, its TX queue never stops.

So the stack can end up in a state where the TX queue on the underlying
device is stopped, but the one on the stacked vlan0 device is not. In
that state, we see return codes of NET_XMIT_DROP (1).

This means execution never reaches the code that your patch adds:
with rc=1, dev_xmit_complete(rc) is always true, so it takes the
"goto out" path. And because the TX queue on vlan0 is never stopped, it
always enters the "!netif_xmit_stopped(txq)" block rather than skipping
over it, which again prevents the new code from ever being executed.

	if (!netif_xmit_stopped(txq)) {
		dev_xmit_recursion_inc();
		skb = dev_hard_start_xmit(skb, dev, txq, &rc);
		dev_xmit_recursion_dec();
		if (dev_xmit_complete(rc)) {
			HARD_TX_UNLOCK(dev, txq);
			goto out;
		}
	}
	HARD_TX_UNLOCK(dev, txq);
	net_crit_ratelimited("Virtual device %s asks to queue packet!\n",
			     dev->name);
	if (skb != orig) {
		/* If at least one packet was sent, we must return NETDEV_TX_OK */
		rc = NETDEV_TX_OK;
		goto unlock;
	}
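
To make the return-code argument concrete, here is a minimal userspace
sketch of the check. The constants and the "rc < NET_XMIT_MASK" test
mirror include/linux/netdevice.h as I understand it; the names
xmit_complete() and reaches_patched_code() are hypothetical helpers for
illustration only, not kernel functions.

```c
#include <stdbool.h>

/* Return-code values as defined in include/linux/netdevice.h */
#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01
#define NET_XMIT_CN      0x02
#define NET_XMIT_MASK    0x0f
#define NETDEV_TX_OK     0x00
#define NETDEV_TX_BUSY   0x10

/* Userspace stand-in for dev_xmit_complete(): the skb counts as
 * consumed for any rc below NET_XMIT_MASK, including negative errnos. */
static bool xmit_complete(int rc)
{
	return rc < NET_XMIT_MASK;
}

/* Hypothetical helper: models whether control can fall through to the
 * code placed after the "!netif_xmit_stopped(txq)" block. */
static bool reaches_patched_code(bool txq_stopped, int rc)
{
	if (!txq_stopped && xmit_complete(rc))
		return false;	/* corresponds to "goto out" */
	return true;
}
```

With txq_stopped false and rc set to NET_XMIT_DROP, which is what we
observe on vlan0, reaches_patched_code() returns false. It would only
return true for a stopped queue or an rc of NETDEV_TX_BUSY.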

I think for your patch to work, dev_hard_start_xmit would need to
return NETDEV_TX_BUSY (0x10), but that does not seem to happen, maybe
due to the device stacking?

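
As a rough illustration of the stacking theory, here is a toy model. It
rests on two assumptions that I have not confirmed: that the stacked
device simply hands the packet to the lower device's queueing path and
passes the result back, and that the lower device's qdisc drops once it
is full while the driver keeps its queue stopped. struct lower_dev,
lower_queue_xmit() and upper_hard_start_xmit() are hypothetical names,
not kernel code.

```c
#include <stdbool.h>

#define NET_XMIT_SUCCESS 0x00
#define NET_XMIT_DROP    0x01

/* Hypothetical toy model of the lower ("parent") device */
struct lower_dev {
	bool txq_stopped;	/* driver flow control has stopped the queue */
	int qlen;		/* packets currently held in the qdisc */
	int qlimit;		/* assumed qdisc capacity */
};

/* Assumed behaviour: enqueue to the qdisc, drop when it is full.
 * A stopped queue never drains, so the backlog only grows. */
static int lower_queue_xmit(struct lower_dev *d)
{
	if (d->qlen >= d->qlimit)
		return NET_XMIT_DROP;
	d->qlen++;
	if (!d->txq_stopped)
		d->qlen--;	/* sent right away */
	return NET_XMIT_SUCCESS;
}

/* Assumed behaviour of the stacked device: no queue of its own, it
 * just returns whatever the lower device reported. */
static int upper_hard_start_xmit(struct lower_dev *d)
{
	return lower_queue_xmit(d);
}
```

In this model the upper device only ever sees NET_XMIT_SUCCESS or
NET_XMIT_DROP, never NETDEV_TX_BUSY, which would match what we observe.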
best regards,
Chris