Message-ID: <c47a0edd-7437-4c21-b7cf-f969ff85bf78@grimberg.me>
Date: Tue, 28 Nov 2023 12:13:59 +0200
From: Sagi Grimberg <sagi@...mberg.me>
To: yaoma <yaoma@...ux.alibaba.com>, Keith Busch <kbusch@...nel.org>
Cc: axboe@...nel.dk, hch@....de, linux-nvme@...ts.infradead.org,
linux-kernel@...r.kernel.org, kanie@...ux.alibaba.com
Subject: Re: [PATCH] nvme: fix deadlock between reset and scan
On 11/28/23 08:22, yaoma wrote:
> Hi Keith Busch
>
> Thanks for your reply.
>
> The idea to avoid such a deadlock between nvme_reset and nvme_scan is to
> ensure that no namespace can be added to ctrl->namespaces after
> nvme_start_freeze has already been called. We can achieve this goal by
> assessing the ctrl->state after we have already acquired the
> ctrl->namespaces_rwsem lock, to decide whether to add the namespace to
> the list or not.
> 1. After we determine that ctrl->state is LIVE, it may be immediately
> changed to another state. However, since we have already acquired the
> lock, other tasks cannot access ctrl->namespaces, so we can still safely
> add the namespace to the list. After acquiring the lock,
> nvme_start_freeze will freeze all ns->q in the list, including any newly
> added namespaces.
> 2. Before the completion of nvme_reset, ctrl->state will not be changed
> to LIVE, so we will not add any more namespaces to the list. All ns->q
> in the list are frozen, so nvme_wait_freeze can exit normally.
I agree with the analysis: there is a window between nvme_start_freeze
and nvme_wait_freeze in which a scan may add a ns to the ctrl ns list.

However, the fix should be to mark the ctrl with, say, an
NVME_CTRL_FROZEN flag that is set in nvme_start_freeze and cleared in
nvme_unfreeze (similar to what we did with quiesce). The scan can then
check the flag before adding the new namespace (under the
namespaces_rwsem).