lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <877bwkfmgr.fsf@toke.dk>
Date: Fri, 24 Oct 2025 15:39:16 +0200
From: Toke Høiland-Jørgensen <toke@...hat.com>
To: Jesper Dangaard Brouer <hawk@...nel.org>, netdev@...r.kernel.org,
 makita.toshiaki@....ntt.co.jp
Cc: Jesper Dangaard Brouer <hawk@...nel.org>, Eric Dumazet
 <eric.dumazet@...il.com>, "David S. Miller" <davem@...emloft.net>, Jakub
 Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
 ihor.solodrai@...ux.dev, toshiaki.makita1@...il.com, bpf@...r.kernel.org,
 linux-kernel@...r.kernel.org, kernel-team@...udflare.com
Subject: Re: [PATCH net V1 1/3] veth: enable dev_watchdog for detecting
 stalled TXQs

Jesper Dangaard Brouer <hawk@...nel.org> writes:

> The changes introduced in commit dc82a33297fc ("veth: apply qdisc
> backpressure on full ptr_ring to reduce TX drops") have been found to cause
> a race condition in production environments.
>
> Under specific circumstances, observed exclusively on ARM64 (aarch64)
> systems with Ampere Altra Max CPUs, a transmit queue (TXQ) can become
> permanently stalled. This happens when the race condition leads to the TXQ
> entering the QUEUE_STATE_DRV_XOFF state without a corresponding queue wake-up,
> preventing the attached qdisc from dequeueing packets and causing the
> network link to halt.
>
> As a first step towards resolving this issue, this patch introduces a
> failsafe mechanism. It enables the net device watchdog by setting a timeout
> value and implements the .ndo_tx_timeout callback.
>
> If a TXQ stalls, the watchdog will trigger the veth_tx_timeout() function,
> which logs a warning and calls netif_tx_wake_queue() to unstall the queue
> and allow traffic to resume.
>
> The log message will look like this:
>
>  veth42: NETDEV WATCHDOG: CPU: 34: transmit queue 0 timed out 5393 ms
>  veth42: veth backpressure stalled(n:1) TXQ(0) re-enable
>
> This provides a necessary recovery mechanism while the underlying race
> condition is investigated further. Subsequent patches will address the root
> cause and add more robust state handling in ndo_open/ndo_stop.
>
> Fixes: dc82a33297fc ("veth: apply qdisc backpressure on full ptr_ring to reduce TX drops")
> Signed-off-by: Jesper Dangaard Brouer <hawk@...nel.org>
> ---
>  drivers/net/veth.c |   16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index a3046142cb8e..7b1a9805b270 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -959,8 +959,10 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget,
>  	rq->stats.vs.xdp_packets += done;
>  	u64_stats_update_end(&rq->stats.syncp);
>  
> -	if (peer_txq && unlikely(netif_tx_queue_stopped(peer_txq)))
> +	if (peer_txq && unlikely(netif_tx_queue_stopped(peer_txq))) {
> +		txq_trans_cond_update(peer_txq);
>  		netif_tx_wake_queue(peer_txq);
> +	}

Hmm, seems a bit weird that this call to txq_trans_cond_update() is only
in veth_xdp_recv(). Shouldn't there (also?) be one in veth_xmit()?

-Toke


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ