[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <d1838296-0131-47ac-8a2e-eec8935c5960@quicinc.com>
Date: Mon, 20 Jan 2025 17:45:48 +0530
From: Charan Teja Kalla <quic_charante@...cinc.com>
To: Robin Murphy <robin.murphy@....com>, Will Deacon <will@...nel.org>
CC: <joro@...tes.org>, <jgg@...pe.ca>, <iommu@...ts.linux.dev>,
<linux-kernel@...r.kernel.org>
Subject: Re: [PATCH RFC] iommu: fix wrong DMA ops for iommu device
Hi Robin,
On 1/13/2025 10:55 PM, Robin Murphy wrote:
>> @Robin
>>
>> Agree that we don't have iommu_dma_ops but I can say that same race
>> still exists. Although dma_ops is not used, but the decision to call
>> into the iommu api's is determined by the 'dev->dma_iommu' flag, which
>> again, is set after domain is allocated for a device.
>>
>> In the same race mentioned above,
>> 1) S: Domain is not allocated but the dev->iommu_group.
>> 2) C: Just returns as dev->iommu_group is filled.
>> 3) C: Continues probing and succeeds.
>> 4) C: Calls dma_alloc/map/.... But, it won't enter into iommu_ api's
>> because the 'dev->dma_iommu' is still 'false'.
>> 5) S: Domain is allocated and sets the 'dev->dma_iommu' to 'true'.
>>
>> 4) above is the problematic step. Although issue exists but seems to me
>> that very narrow to get triggered. Please CMIW.
>
> Hmm, yes, I guess there is a fundamental race where async client driver
> probe can observe a partially initialised group while the IOMMU driver
> itself is still running iommu_device_register()->bus_iommu_probe()...
> And in that case, this patch is wrong for two reasons: firstly, bodging
> the iommu_probe_device() call (which really should not exist at all)
> does not cover all cases; and secondly, forcibly creating the default
> domain in that path before we know bus_iommu_probe() has seen all other
> existing devices means iommu_get_default_domain_type() may miss their
> requirements and thus defeat the whole point of deferred allocation in
> the first place.
>
Got it. Thanks.
> Having looked a bit closer, I think a more robust solution for now is
> probably as below.
>
Just want to hear of other solutions you have in mind, when you say this
solution is for now....
> Thanks,
> Robin.
>
> ----->8-----
> diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
> index d1af0547f553..8d90d196e38d 100644
> --- a/drivers/iommu/iommu.c
> +++ b/drivers/iommu/iommu.c
> @@ -3120,6 +3120,11 @@ int iommu_device_use_default_domain(struct device
> *dev)
> return 0;
>
> mutex_lock(&group->mutex);
> + /* We may race against bus_iommu_probe() finalising groups here */
> + if (!group->default_domain) {
> + ret = -EPROBE_DEFER;
> + goto unlock_out;
> + }
Please lmk If I can submit this as V2 patch, with your name as Suggested-by?
> if (group->owner_cnt) {
> if (group->domain != group->default_domain || group->owner ||
> !xa_empty(&group->pasid_array)) {
Powered by blists - more mailing lists