lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <Z4ZewoBHkHyNuXT5@home.paul.comp>
Date: Tue, 14 Jan 2025 15:55:30 +0300
From: Paul Fertser <fercerpav@...il.com>
To: Potin Lai <potin.lai.pt@...il.com>
Cc: Samuel Mendoza-Jonas <sam@...dozajonas.com>,
        "David S. Miller" <davem@...emloft.net>,
        Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
        Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>,
        Ivan Mikhaylov <fr0st61te@...il.com>,
        Patrick Williams <patrick@...cx.xyz>, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, Cosmo Chou <cosmo.chou@...ntatw.com>,
        Potin Lai <potin.lai@...ntatw.com>, Cosmo Chou <chou.cosmo@...il.com>
Subject: Re: [PATCH v3 2/2] net/ncsi: fix state race during channel probe
 completion

Hello,

On Mon, Jan 13, 2025 at 10:34:48AM +0800, Potin Lai wrote:
> During channel probing, the last NCSI_PKT_CMD_DP command can trigger
> an unnecessary schedule_work() via ncsi_free_request(). We observed
> that subsequent config states were triggered before the scheduled
> work completed, causing potential state handling issues.

Please let's not make this whole NC-SI story even less comprehensible
than it already is. From this commit message I was unable to
understand what exactly is racing with what and under which
conditions. "Can trigger" would imply that it does not always trigger
that wrong state transition but that would also mean there's a set of
conditions that is necessary for bug to happen.

> Fix this by clearing req_flags when processing the last package.

After reading the code for a few hours I can probably see how lack of
proper processing of the response to the last "Package Deselect" call
can mix up the states.

How about this diff instead (tested on Tioga Pass but there we didn't
have any issues in the first place)?

diff --git a/net/ncsi/ncsi-manage.c b/net/ncsi/ncsi-manage.c
index bf276eaf9330..7891a537bddd 100644
--- a/net/ncsi/ncsi-manage.c
+++ b/net/ncsi/ncsi-manage.c
@@ -1385,6 +1385,12 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
 		nd->state = ncsi_dev_state_probe_package;
 		break;
 	case ncsi_dev_state_probe_package:
+		if (ndp->package_probe_id >= 8) {
+			/* Last package probed, finishing */
+			ndp->flags |= NCSI_DEV_PROBED;
+			break;
+		}
+
 		ndp->pending_req_num = 1;
 
 		nca.type = NCSI_PKT_CMD_SP;
@@ -1501,13 +1507,8 @@ static void ncsi_probe_channel(struct ncsi_dev_priv *ndp)
 		if (ret)
 			goto error;
 
-		/* Probe next package */
+		/* Probe next package after receiving response */
 		ndp->package_probe_id++;
-		if (ndp->package_probe_id >= 8) {
-			/* Probe finished */
-			ndp->flags |= NCSI_DEV_PROBED;
-			break;
-		}
 		nd->state = ncsi_dev_state_probe_package;
 		ndp->active_package = NULL;
 		break;

-- 
Be free, use free (http://www.gnu.org/philosophy/free-sw.html) software!
mailto:fercerpav@...il.com

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ