lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250604201938.1409219-1-hramamurthy@google.com>
Date: Wed,  4 Jun 2025 20:19:38 +0000
From: Harshitha Ramamurthy <hramamurthy@...gle.com>
To: netdev@...r.kernel.org
Cc: davem@...emloft.net, edumazet@...gle.com, kuba@...nel.org, 
	pabeni@...hat.com, jeroendb@...gle.com, hramamurthy@...gle.com, 
	andrew+netdev@...n.ch, willemb@...gle.com, pkaligineedi@...gle.com, 
	joshwash@...gle.com, thostet@...gle.com, jfraker@...gle.com, 
	awogbemila@...gle.com, linux-kernel@...r.kernel.org, stable@...r.kernel.org
Subject: [PATCH net] gve: Fix stuck TX queue for DQ queue format

From: Praveen Kaligineedi <pkaligineedi@...gle.com>

gve_tx_timeout was calculating missed completions in a way that is only
relevant in the GQ queue format. Additionally, it was attempting to
disable device interrupts, which is not needed in either GQ or DQ queue
formats.

As a result, TX timeouts with the DQ queue format likely would have
triggered early resets without kicking the queue at all.

This patch drops the check for pending work altogether and always kicks
the queue after validating the queue has not seen a TX timeout too
recently.

Fixes: 87a7f321bb6a ("gve: Recover from queue stall due to missed IRQ")
Co-developed-by: Tim Hostetler <thostet@...gle.com>
Signed-off-by: Tim Hostetler <thostet@...gle.com>
Signed-off-by: Praveen Kaligineedi <pkaligineedi@...gle.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@...gle.com>
---
 drivers/net/ethernet/google/gve/gve_main.c | 16 ++++------------
 1 file changed, 4 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index c3791cf..0c6328b 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -1921,7 +1921,6 @@ static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
 	struct gve_notify_block *block;
 	struct gve_tx_ring *tx = NULL;
 	struct gve_priv *priv;
-	u32 last_nic_done;
 	u32 current_time;
 	u32 ntfy_idx;
 
@@ -1941,17 +1940,10 @@ static void gve_tx_timeout(struct net_device *dev, unsigned int txqueue)
 	if (tx->last_kick_msec + MIN_TX_TIMEOUT_GAP > current_time)
 		goto reset;
 
-	/* Check to see if there are missed completions, which will allow us to
-	 * kick the queue.
-	 */
-	last_nic_done = gve_tx_load_event_counter(priv, tx);
-	if (last_nic_done - tx->done) {
-		netdev_info(dev, "Kicking queue %d", txqueue);
-		iowrite32be(GVE_IRQ_MASK, gve_irq_doorbell(priv, block));
-		napi_schedule(&block->napi);
-		tx->last_kick_msec = current_time;
-		goto out;
-	} // Else reset.
+	netdev_info(dev, "Kicking queue %d", txqueue);
+	napi_schedule(&block->napi);
+	tx->last_kick_msec = current_time;
+	goto out;
 
 reset:
 	gve_schedule_reset(priv);
-- 
2.49.0.805.g082f7c87e0-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ