Message-ID: <a862beed-3361-4f78-b412-87b78095ac84@kernel.org>
Date: Mon, 27 Oct 2025 11:33:43 +0100
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>, Chris Arges <carges@...udflare.com>
Cc: netdev@...r.kernel.org, makita.toshiaki@....ntt.co.jp,
 Eric Dumazet <eric.dumazet@...il.com>, "David S. Miller"
 <davem@...emloft.net>, Paolo Abeni <pabeni@...hat.com>,
 ihor.solodrai@...ux.dev, toshiaki.makita1@...il.com, bpf@...r.kernel.org,
 linux-kernel@...r.kernel.org, kernel-team@...udflare.com
Subject: Re: [PATCH net V1 2/3] veth: stop and start all TX queue in netdev
 down/up




On 25/10/2025 02.54, Jakub Kicinski wrote:
> On Thu, 23 Oct 2025 16:59:37 +0200 Jesper Dangaard Brouer wrote:
>> The veth driver started manipulating TXQ states in commit
>> dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring
>> to reduce TX drops").
>>
>> Other drivers manipulating TXQ states takes care of stopping
>> and starting TXQs in NDOs.  Thus, adding this to veth .ndo_open
>> and .ndo_stop.
> 
> Kinda, but taking a device up or down resets the qdisc, IIRC.
> 
> So stopping the qdisc for real drivers is mostly a way to make sure
> that there's nothing entering the xmit handler as the driver dismantles
> its state.
> 
> I'm not sure if this is an official rule, but I'm under the impression
> that stopping the queues or carrier loss (and
> netif_tx_stop_all_queues(peer) in close() is stopping peer's Tx queue
> on carrier loss) is inadvisable as it may lead to old packets getting
> transmitted when carrier comes back.
> 
> IOW based on the commit msg - I'm not sure this patch is needed..

During the incident, doing ip link set 'down' flushed all packets in
the qdisc, but the TXQs were not reset (started again) on link 'up'.
Thus, the qdisc would fill up again and block all packets on the
interface.  Chris also tried to replace the qdisc, but the TXQ was
still in the stopped QUEUE_STATE_DRV_XOFF state.
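
The stuck state described above can be sketched as a small userspace toy
model.  This is an illustration only, not veth or kernel code: every toy_*
name, the qdisc limit of 4, and the drop-on-full behavior are assumptions
made for the example.  It only shows that flushing the qdisc on 'down'
does not help if the stopped bit survives the 'up'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code: every toy_* name is hypothetical. */
#define TOY_QDISC_LIMIT 4	/* assumed qdisc capacity for the example */

struct toy_txq {
	bool drv_xoff;		/* stands in for the QUEUE_STATE_DRV_XOFF bit */
	int  qdisc_len;		/* packets waiting in the qdisc */
};

/* Driver saw a full ring and stopped the queue (backpressure). */
static void toy_stop_queue(struct toy_txq *q)
{
	assert(q != NULL);
	q->drv_xoff = true;
}

/* Link 'down': the qdisc is flushed, the stopped bit is left alone. */
static void toy_link_down(struct toy_txq *q)
{
	assert(q != NULL);
	q->qdisc_len = 0;
}

/* Link 'up': restarts the queue only if restart_txq is set. */
static void toy_link_up(struct toy_txq *q, bool restart_txq)
{
	assert(q != NULL);
	if (restart_txq)
		q->drv_xoff = false;
}

/* Returns true if the packet was sent or queued, false if it was dropped. */
static bool toy_xmit(struct toy_txq *q)
{
	assert(q != NULL);
	if (!q->drv_xoff)
		return true;		/* queue running: packet goes out */
	if (q->qdisc_len >= TOY_QDISC_LIMIT)
		return false;		/* qdisc full: packet is dropped */
	q->qdisc_len++;			/* queue stopped: packet waits in qdisc */
	return true;
}

/* Runs a down/up cycle on a stopped queue, then counts drops out of n sends. */
static int toy_drops_after_cycle(bool restart_txq, int n)
{
	struct toy_txq q = { .drv_xoff = false, .qdisc_len = 0 };
	int drops = 0;

	toy_stop_queue(&q);
	toy_link_down(&q);
	toy_link_up(&q, restart_txq);
	for (int i = 0; i < n; i++)
		if (!toy_xmit(&q))
			drops++;
	return drops;
}
```

Calling toy_drops_after_cycle(false, 10) models the incident: the qdisc
refills and later packets are dropped.  With restart_txq set to true the
queue runs again after 'up' and nothing is dropped.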

This was the origin of the patch: we could not recover the machine
from this state.  Thus, the idea was that starting all queues on link
'up' would give us a recovery mechanism.  With dev_watchdog this change
isn't really needed.
As you mention, this may lead to old packets getting transmitted when
carrier comes back, which would be a behavior change that we don't
want in a fixes patch.  So, I will drop this patch.

--Jesper


