linux-kernel - Re: [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before using them

Message-ID: <8373c07f-f5df-1ec6-9fda-d0262fc1b377@grimberg.me>
Date:   Fri, 6 Aug 2021 12:57:17 -0700
From:   Sagi Grimberg <sagi@...mberg.me>
To:     Daniel Wagner <dwagner@...e.de>, linux-nvme@...ts.infradead.org
Cc:     linux-kernel@...r.kernel.org,
        James Smart <james.smart@...adcom.com>,
        Keith Busch <kbusch@...nel.org>,
        Ming Lei <ming.lei@...hat.com>, Hannes Reinecke <hare@...e.de>,
        Wen Xiong <wenxiong@...ibm.com>
Subject: Re: [PATCH v4 2/8] nvme-tcp: Update number of hardware queues before
 using them


> From: Hannes Reinecke <hare@...e.de>
> 
> When the number of hardware queues changes during resetting we should
> update the tagset first before using it.
> 
> Signed-off-by: Hannes Reinecke <hare@...e.de>
> Signed-off-by: Daniel Wagner <dwagner@...e.de>
> ---
>   drivers/nvme/host/tcp.c | 14 ++++++--------
>   1 file changed, 6 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
> index 0a97ba02f61e..32268f24f62a 100644
> --- a/drivers/nvme/host/tcp.c
> +++ b/drivers/nvme/host/tcp.c
> @@ -1789,6 +1789,7 @@ static void nvme_tcp_destroy_io_queues(struct nvme_ctrl *ctrl, bool remove)
>   static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   {
>   	int ret;
> +	u32 prior_q_cnt = ctrl->queue_count;
>   
>   	ret = nvme_tcp_alloc_io_queues(ctrl);
>   	if (ret)
> @@ -1806,14 +1807,7 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   			ret = PTR_ERR(ctrl->connect_q);
>   			goto out_free_tag_set;
>   		}
> -	}
> -
> -	ret = nvme_tcp_start_io_queues(ctrl);
> -	if (ret)
> -		goto out_cleanup_connect_q;
> -
> -	if (!new) {
> -		nvme_start_queues(ctrl);
> +	} else if (prior_q_cnt != ctrl->queue_count) {

So if the queue count did not change, we don't wait to make sure
the queue q_usage_counter ref made it to zero? What guarantees that it
did?

>   		if (!nvme_wait_freeze_timeout(ctrl, NVME_IO_TIMEOUT)) {
>   			/*
>   			 * If we timed out waiting for freeze we are likely to
> @@ -1828,6 +1822,10 @@ static int nvme_tcp_configure_io_queues(struct nvme_ctrl *ctrl, bool new)
>   		nvme_unfreeze(ctrl);
>   	}
>   
> +	ret = nvme_tcp_start_io_queues(ctrl);
> +	if (ret)
> +		goto out_cleanup_connect_q;
> +

Did you test this with heavy I/O, a reset loop, and an ifdown/ifup loop?

If we unquiesce and unfreeze before we start the queues, the pending I/Os
may resume before the connect and prevent the connect from making forward
progress.
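
To make the ordering concern concrete, here is a minimal userspace sketch
(not kernel code) of the sequence the code follows before this patch, as far
as it can be read from the removed lines above: start the I/O queues first,
and only on reconnect unquiesce, wait for the freeze, and unfreeze. Every
function below is a hypothetical stub that only records its name.
update_nr_hw_queues() stands in for the tagset update the commit message
mentions, and the -1 error value is a placeholder; neither is taken from
the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model: each stub only records that it ran. */
#define MAX_STEPS 8
static const char *steps[MAX_STEPS];
static size_t nr_steps;

static void log_step(const char *name)
{
	assert(nr_steps < MAX_STEPS);
	steps[nr_steps++] = name;
}

/* Stand-ins for the helpers named in the patch; none touch real queues. */
static int start_io_queues(void)      { log_step("start_io_queues"); return 0; }
static void start_queues(void)        { log_step("start_queues"); } /* unquiesce */
static bool wait_freeze_timeout(void) { log_step("wait_freeze"); return true; }
static void update_nr_hw_queues(void) { log_step("update_nr_hw_queues"); } /* assumed step */
static void unfreeze(void)            { log_step("unfreeze"); }

/*
 * Model of the pre-patch ordering: the connect (start_io_queues) happens
 * before pending I/O is allowed to resume (start_queues / unfreeze).
 */
static int configure_io_queues_model(bool new)
{
	int ret;

	nr_steps = 0;

	ret = start_io_queues();
	if (ret)
		return ret;

	if (!new) {
		start_queues();
		if (!wait_freeze_timeout())
			return -1; /* placeholder error value */
		update_nr_hw_queues();
		unfreeze();
	}
	return 0;
}
```

Calling configure_io_queues_model(false) records start_io_queues as the
first step, ahead of start_queues and unfreeze. The patch moves the
start_io_queues call to after the unfreeze, which is the ordering being
questioned here.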

>   	return 0;
>   
>   out_wait_freeze_timed_out:
>