Message-ID: <7a1a6a34-019a-3ecc-3aee-1a92d29eb6e9@linux.alibaba.com>
Date:   Wed, 29 Nov 2023 17:24:07 +0800
From:   yaoma <yaoma@...ux.alibaba.com>
To:     Sagi Grimberg <sagi@...mberg.me>
Cc:     axboe@...nel.dk, hch@....de, linux-nvme@...ts.infradead.org,
        linux-kernel@...r.kernel.org, kanie@...ux.alibaba.com,
        Keith Busch <kbusch@...nel.org>, yaoma@...ux.alibaba.com
Subject: Re: [PATCH] nvme: fix deadlock between reset and scan

Hi Sagi,

I revised my code following your advice and carried out tests.

Test Scripts:
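	for ((t=1;t<=128;t++))
	do
		# create a namespace; the nsid is the last ":"-separated field
		nsid=$(nvme create-ns /dev/nvme0 -s 1453772 -c 1453772 -f 0 \
			-m 0 -d 0 | awk -F: '{print($NF);}')
		# attach the new namespace to controller 0
		nvme attach-ns /dev/nvme0 -n "$nsid" -c 0
	done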

	echo "resetting"
	nvme reset /dev/nvme0
	lsblk | grep nvme0 | wc -l
	sleep 2
	lsblk | grep nvme0 | wc -l

Results:
	...
	attach-ns: Success, nsid:128
	resetting
	23
	128

With the fix applied, the deadlock no longer occurs.

I found one minor issue. While the controller is resetting, the scan may
not recognize all namespaces (the first lsblk count above shows 23 of
128). However, since scan work is queued at the end of the reset, the
impact is not significant: once the reset completes, all namespaces are
eventually recognized.

---
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 21783aa2e..e361aba39 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3630,6 +3630,10 @@ static void nvme_alloc_ns(struct nvme_ctrl *ctrl, struct nvme_ns_info *info)
                 goto out_unlink_ns;

         down_write(&ctrl->namespaces_rwsem);
+       if (test_bit(NVME_CTRL_FROZEN, &ctrl->flags)) {
+               up_write(&ctrl->namespaces_rwsem);
+               goto out_unlink_ns;
+       }
         nvme_ns_add_to_ctrl_list(ns);
         up_write(&ctrl->namespaces_rwsem);
         nvme_get_ctrl(ctrl);
@@ -4539,6 +4543,7 @@ void nvme_unfreeze(struct nvme_ctrl *ctrl)
         list_for_each_entry(ns, &ctrl->namespaces, list)
                 blk_mq_unfreeze_queue(ns->queue);
         up_read(&ctrl->namespaces_rwsem);
+       clear_bit(NVME_CTRL_FROZEN, &ctrl->flags);
  }
  EXPORT_SYMBOL_GPL(nvme_unfreeze);

@@ -4572,6 +4577,7 @@ void nvme_start_freeze(struct nvme_ctrl *ctrl)
  {
         struct nvme_ns *ns;

+       set_bit(NVME_CTRL_FROZEN, &ctrl->flags);
         down_read(&ctrl->namespaces_rwsem);
         list_for_each_entry(ns, &ctrl->namespaces, list)
                 blk_freeze_queue_start(ns->queue);
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f35647c47..755319b0d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -251,6 +251,7 @@ enum nvme_ctrl_flags {
         NVME_CTRL_STOPPED               = 3,
         NVME_CTRL_SKIP_ID_CNS_CS        = 4,
         NVME_CTRL_DIRTY_CAPABILITY      = 5,
+       NVME_CTRL_FROZEN                = 6,
  };

  struct nvme_ctrl {
--

On 2023/11/28 18:13, Sagi Grimberg wrote:
> 
> 
> On 11/28/23 08:22, yaoma wrote:
>> Hi Keith Busch
>>
>> Thanks for your reply.
>>
>> The idea to avoid such a deadlock between nvme_reset and nvme_scan is 
>> to ensure that no namespace can be added to ctrl->namespaces after 
>> nvme_start_freeze has already been called. We can achieve this goal by 
>> assessing the ctrl->state after we have already acquired the 
>> ctrl->namespaces_rwsem lock, to decide whether to add the namespace to 
>> the list or not.
>> 1. After we determine that ctrl->state is LIVE, it may be immediately 
>> changed to another state. However, since we have already acquired the 
>> lock, other tasks cannot access ctrl->namespaces, so we can still 
>> safely add the namespace to the list. After acquiring the lock, 
>> nvme_start_freeze will freeze all ns->q in the list, including any 
>> newly added namespaces.
>> 2. Before the completion of nvme_reset, ctrl->state will not be 
>> changed to LIVE, so we will not add any more namespaces to the list. 
>> All ns->q in the list are frozen, so nvme_wait_freeze can exit normally.
> 
> I agree with the analysis, there is a hole between start_freeze and
> freeze_wait that a scan may add a ns to the ctrl ns list.
> 
> However, the fix should be to mark the ctrl with, say, a NVME_CTRL_FROZEN
> flag set in nvme_start_freeze and cleared in nvme_unfreeze (similar
> to what we did with quiesce). Then the scan can check it before adding
> the new namespace (under the namespaces_rwsem).
