lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <51fc1a3f-7166-49cf-9652-a95bdc5d2119@grimberg.me>
Date: Mon, 15 Apr 2024 23:51:19 +0300
From: Sagi Grimberg <sagi@...mberg.me>
To: Daniel Wagner <dwagner@...e.de>, Christoph Hellwig <hch@....de>
Cc: Keith Busch <kbusch@...nel.org>, James Smart <james.smart@...adcom.com>,
 Hannes Reinecke <hare@...e.de>, linux-nvme@...ts.infradead.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v6 5/5] nvme-fabrics: handle transient auth failures



On 15/04/2024 15:42, Daniel Wagner wrote:
> Retry authentication a couple of times to avoid error out on transient
> errors. E.g. if a new key is deployed on the host and the target. At the
> same time the connection drops. The host might use the wrong key to
> reconnect.
>
> Suggested-by: Sagi Grimberg <sagi@...mberg.me>
> Signed-off-by: Daniel Wagner <dwagner@...e.de>
> ---
>   drivers/nvme/host/auth.c    |  4 ++--
>   drivers/nvme/host/fabrics.c | 10 +++++++++-
>   drivers/nvme/host/fc.c      |  2 ++
>   drivers/nvme/host/nvme.h    |  1 +
>   drivers/nvme/host/rdma.c    |  1 +
>   drivers/nvme/host/tcp.c     |  1 +
>   6 files changed, 16 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/nvme/host/auth.c b/drivers/nvme/host/auth.c
> index a264b3ae078b..368dcc6ee11b 100644
> --- a/drivers/nvme/host/auth.c
> +++ b/drivers/nvme/host/auth.c
> @@ -797,7 +797,7 @@ static void nvme_queue_auth_work(struct work_struct *work)
>   					 NVME_AUTH_DHCHAP_MESSAGE_SUCCESS1);
>   	if (ret) {
>   		chap->status = ret;
> -		chap->error = -ECONNREFUSED;
> +		chap->error = -EKEYREJECTED;
>   		return;
>   	}
>   
> @@ -818,7 +818,7 @@ static void nvme_queue_auth_work(struct work_struct *work)
>   	ret = nvme_auth_process_dhchap_success1(ctrl, chap);
>   	if (ret) {
>   		/* Controller authentication failed */
> -		chap->error = -ECONNREFUSED;
> +		chap->error = -EKEYREJECTED;
>   		goto fail2;
>   	}
>   
> diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
> index d9a73b1b41c4..6dfa45dce232 100644
> --- a/drivers/nvme/host/fabrics.c
> +++ b/drivers/nvme/host/fabrics.c
> @@ -573,8 +573,16 @@ EXPORT_SYMBOL_GPL(nvmf_connect_io_queue);
>    */
>   bool nvmf_should_reconnect(struct nvme_ctrl *ctrl, int status)
>   {
> -	if (status < 0)
> +	if (status < 0) {
> +		/*
> +		 * authentication errors can be transient, thus retry a couple
> +		 * of times before giving up.
> +		 */
> +		if (status == -EKEYREJECTED &&
> +		    ++ctrl->nr_auth_retries < 3)
> +			return true;

I did not suggest nr_auth_retries. Where is the 3 coming from? The 
controller already
has a number of reconnects before it gives up, no reason to add another one.

Just don't return false based on the status if it is a transient 
authentication error.

The patch just needs to be modified from:
     if (status < 0)
to
     if (status < 0 && status != -EKEYREJECTED Plus a comment that 
explains it.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ