Message-ID: <03cbd9c4-0f11-895b-8eb5-1b75bb74d37c@huawei.com>
Date: Fri, 10 Dec 2021 18:04:53 +0000
From: John Garry <john.garry@...wei.com>
To: Robin Murphy <robin.murphy@....com>, <joro@...tes.org>,
<will@...nel.org>
CC: <iommu@...ts.linux-foundation.org>,
<suravee.suthikulpanit@....com>, <baolu.lu@...ux.intel.com>,
<willy@...radead.org>, <linux-kernel@...r.kernel.org>,
<linux-mm@...ck.org>, Xiongfeng Wang <wangxiongfeng2@...wei.com>
Subject: Re: [PATCH v2 01/11] iommu/iova: Fix race between FQ timeout and
teardown
On 10/12/2021 17:54, Robin Murphy wrote:
> From: Xiongfeng Wang <wangxiongfeng2@...wei.com>
>
> It turns out to be possible for hotplugging out a device to reach the
> stage of tearing down the device's group and default domain before the
> domain's flush queue has drained naturally. At this point, it is then
> possible for the timeout to expire just*before* the del_timer() call
super nit: "just*before* the" - needs a space before "before" :)
> from free_iova_flush_queue(), such that we then proceed to free the FQ
> resources while fq_flush_timeout() is still accessing them on another
> CPU. Crashes due to this have been observed in the wild while removing
> NVMe devices.
>
> Close the race window by using del_timer_sync() to safely wait for any
> active timeout handler to finish before we start to free things. We
> already avoid any locking in free_iova_flush_queue() since the FQ is
> supposed to be inactive anyway, so the potential deadlock scenario does
> not apply.
>
> Fixes: 9a005a800ae8 ("iommu/iova: Add flush timer")
> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@...wei.com>
> [ rm: rewrite commit message ]
> Signed-off-by: Robin Murphy <robin.murphy@....com>
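
As an aside, for anyone following along without the patch to hand, the
crux of the change is in free_iova_flush_queue() - roughly along these
lines (a sketch from memory, not the exact hunk):

	static void free_iova_flush_queue(struct iova_domain *iovad)
	{
		if (!has_iova_flush_queue(iovad))
			return;

		/*
		 * del_timer() only deactivates a pending timer;
		 * fq_flush_timeout() may still be running on another
		 * CPU. del_timer_sync() additionally waits for any
		 * such handler to finish before we free the FQ
		 * resources below.
		 */
		del_timer_sync(&iovad->fq_timer);

		fq_destroy_all_entries(iovad);
		free_percpu(iovad->fq);
		/* ... remaining teardown ... */
	}

And since the FQ is supposed to be inactive by this point, we hold no
locks that the timeout handler could also take, so del_timer_sync()'s
usual deadlock caveat doesn't apply.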
FWIW,
Reviewed-by: John Garry <john.garry@...wei.com>