Message-ID: <ZP-cGgPzIX7WkNRb@kbusch-mbp.dhcp.thefacebook.com>
Date:   Mon, 11 Sep 2023 16:00:42 -0700
From:   Keith Busch <kbusch@...nel.org>
To:     Felix Yan <felixonmars@...hlinux.org>
Cc:     highenthalpyh@...il.com, linux-nvme@...ts.infradead.org,
        linux-kernel@...r.kernel.org, xuwd1@...mail.com
Subject: Re: [PATCH] nvme-pci: ignore bogus CRTO according to NVME 2.0 spec

On Fri, Sep 08, 2023 at 06:54:42PM +0300, Felix Yan wrote:
> NVME 2.0 spec section 3.1.3 suggests that "Software should not rely on
> 0h being returned". Here we should safeguard timeout reads when CRTO is 0 and
> fallback to the old NVME 1.4 compatible field.
> 
> Fixes 4TB SSD initialization issues with MAXIO MAP1602 controller, including
> Lexar NM790, AIGO P7000Z, Fanxiang S790, Acer Predator GM7, etc.
> 
> ----------
> nvme nvme1: Device not ready; aborting initialisation, CSTS=0x0
> ----------
> 
> Signed-off-by: Felix Yan <felixonmars@...hlinux.org>
> ---
>  drivers/nvme/host/core.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index f3a01b79148c..8ec28b1016ca 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2255,11 +2255,17 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl)
>  			return ret;
>  		}
>  
> -		if (ctrl->cap & NVME_CAP_CRMS_CRIMS) {
> -			ctrl->ctrl_config |= NVME_CC_CRIME;
> -			timeout = NVME_CRTO_CRIMT(crto);
> +		if (crto == 0) {
> +			timeout = NVME_CAP_TIMEOUT(ctrl->cap);
> +			dev_warn(ctrl->device, "Ignoring bogus CRTO (0), falling back to NVME_CAP_TIMEOUT (%u)\n",
> +				timeout);
>  		} else {
> -			timeout = NVME_CRTO_CRWMT(crto);
> +			if (ctrl->cap & NVME_CAP_CRMS_CRIMS) {
> +				ctrl->ctrl_config |= NVME_CC_CRIME;
> +				timeout = NVME_CRTO_CRIMT(crto);
> +			} else {
> +				timeout = NVME_CRTO_CRWMT(crto);
> +			}
>  		}
>  	} else {
>  		timeout = NVME_CAP_TIMEOUT(ctrl->cap);

What do you think about this change instead? We don't need to print a
warning on every device reset, but we should probably add a comment
explaining why this is happening.

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 37b6fa7466620..b4577a860e677 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -2245,6 +2245,7 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl)
 	else
 		ctrl->ctrl_config = NVME_CC_CSS_NVM;
 
+	timeout = NVME_CAP_TIMEOUT(ctrl->cap);
 	if (ctrl->cap & NVME_CAP_CRMS_CRWMS) {
 		u32 crto;
 
@@ -2257,12 +2258,15 @@ int nvme_enable_ctrl(struct nvme_ctrl *ctrl)
 
 		if (ctrl->cap & NVME_CAP_CRMS_CRIMS) {
 			ctrl->ctrl_config |= NVME_CC_CRIME;
-			timeout = NVME_CRTO_CRIMT(crto);
+			/*
+			 * CRIMT should always be greater or equal to CAP.TO,
+			 * but some devices are known to get this wrong. Use
+			 * the larger of the two values.
+			 */
+			timeout = max(timeout, NVME_CRTO_CRIMT(crto));
 		} else {
 			timeout = NVME_CRTO_CRWMT(crto);
 		}
-	} else {
-		timeout = NVME_CAP_TIMEOUT(ctrl->cap);
 	}
 
 	ctrl->ctrl_config |= (NVME_CTRL_PAGE_SHIFT - 12) << NVME_CC_MPS_SHIFT;
--
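
For readers following along outside the kernel tree, the timeout selection in the diff above can be sketched as a standalone function. This is only an illustration, not the driver code: the SKETCH_* macros and sketch_ready_timeout() are hypothetical names, and the bit positions (CAP.TO in bits 31:24, CRWMS/CRIMS in CAP bits 59/60, CRWMT/CRIMT in the low/high 16 bits of CRTO) are my reading of the register layout and should be checked against the spec and include/linux/nvme.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the kernel macros used in the diff.
 * The field positions below are assumptions, not copied from the tree.
 */
#define SKETCH_CAP_TO(cap)       ((uint32_t)(((cap) >> 24) & 0xff))
#define SKETCH_CAP_CRWMS         (1ULL << 59)
#define SKETCH_CAP_CRIMS         (1ULL << 60)
#define SKETCH_CRTO_CRIMT(crto)  ((uint32_t)((crto) >> 16))
#define SKETCH_CRTO_CRWMT(crto)  ((uint32_t)((crto) & 0xffff))

/*
 * Mirror the control flow of the proposed change: start from CAP.TO,
 * and in the CRIMS branch take the larger of CAP.TO and CRTO.CRIMT.
 * The CRWMT branch is left exactly as it is in the diff.
 */
static uint32_t sketch_ready_timeout(uint64_t cap, uint32_t crto)
{
	uint32_t timeout = SKETCH_CAP_TO(cap);

	if (cap & SKETCH_CAP_CRWMS) {
		if (cap & SKETCH_CAP_CRIMS) {
			uint32_t crimt = SKETCH_CRTO_CRIMT(crto);

			if (crimt > timeout)
				timeout = crimt;
		} else {
			timeout = SKETCH_CRTO_CRWMT(crto);
		}
	}

	return timeout;
}

/* A few worked inputs, using a made-up CAP.TO of 0x3c. */
static void sketch_examples(void)
{
	uint64_t to = 0x3cULL << 24;

	/* Bogus CRTO of 0 with CRIMS set: CAP.TO wins. */
	assert(sketch_ready_timeout(SKETCH_CAP_CRWMS | SKETCH_CAP_CRIMS | to, 0) == 0x3c);
	/* Sane CRIMT larger than CAP.TO: CRIMT wins. */
	assert(sketch_ready_timeout(SKETCH_CAP_CRWMS | SKETCH_CAP_CRIMS | to, 0xf0u << 16) == 0xf0);
	/* No CRTO support advertised: plain CAP.TO. */
	assert(sketch_ready_timeout(to, 0x12345678) == 0x3c);
}
```

Calling sketch_examples() from a small test program exercises the three cases. With this structure a device that reports CRTO as 0 in the CRIMS case ends up with the CAP.TO value, which is the behaviour the comment in the diff describes.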