lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <cb2db54e-37e8-495d-916a-4e63c67019bb@arm.com>
Date: Wed, 8 Jan 2025 12:37:30 +0000
From: Robin Murphy <robin.murphy@....com>
To: Will Deacon <will@...nel.org>,
 Charan Teja Kalla <quic_charante@...cinc.com>
Cc: joro@...tes.org, jgg@...pe.ca, iommu@...ts.linux.dev,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH RFC] iommu: fix wrong DMA ops for iommu device

On 07/01/2025 11:31 am, Will Deacon wrote:
> On Mon, Jan 06, 2025 at 06:46:08PM +0530, Charan Teja Kalla wrote:
>> On 1/3/2025 9:04 PM, Will Deacon wrote:
>>> On Thu, Jan 02, 2025 at 04:09:46PM +0530, Charan Teja Kalla wrote:
>>>> On 12/13/2024 8:34 PM, Charan Teja Kalla wrote:
>>>>> Race between smmu device probe and iommu device probe is resulting into
>>>>> wrong DMA ops used for the device.
>>>>>
>>>>> bus_iommu_probe(P1)                  of_iommu_configure(P2)
>>>>> (from iommu_device_register)        (from insmod of a driver)
>>>>> --------------------------         --------------------------
>>>>> 1) probe_iommu_group()
>>>>> (Fills just dev->iommu_group
>>>>> under iommu_probe_device_lock)
>>>>> 				2) iommu_probe_device():
>>>>> 				  (acquires the iommu_probe_device_lock,
>>>>> 				   but since dev->iommu_group is already
>>>>> 				   set, it just returns without
>>>>> 				   allocating the domain.)
>>>>>
>>>>> 				3) of_dma_configure_id()->
>>>>> 				   arch_setup_dma_ops() gets called for
>>>>> 				   this device, but returns with the
>>>>> 				   error message:
>>>>> 				   "Failed to set up IOMMU for device;
>>>>> 				    retaining platform DMA ops"
>>>>>
>>>>> 				4) device probe is succeeded where it
>>>>> 				   can now setup iommu mappings using
>>>>> 				   wrong ->dma_ops
>>>>>
>>>>> 5) domain will be allocated later
>>>>> and fills iommu_group->domain.
>>>>>
>>>>> Step 3) and 4) denies iommu device from using 'iommu_dma_ops'.
>>>>>
>>>>> This issue does exists on 6.6 because of commit f188056352bc ("iommu:
>>>>> Avoid locking/unlocking for iommu_probe_device()") without which 2)
>>>>> returns only after filling the domain under group->mutex, thus no issue.
>>>>>
>>>>> With commit b5c58b2fdc42("dma-mapping: direct calls for dma-iommu"),
>>>>> ->iommu_dma_ops is removed altogether and iommu api are directly being
>>>>> called under "use_dma_iommu(): return dev->dma_iommu". Again,
>>>>> ->dma_iommu flag is set only after domain allocation. So, If
>>>>> there is a sufficient delay between steps 4) and 5), issue is still
>>>>> exists but that seems very narrow to me.
>>>>>
>>>>> I am thinking, setup of the domain before returning from step2) is the
>>>>> way to fix this issue. Can you please help if you have any suggestions?
>>>>>
>>>>> Signed-off-by: Charan Teja Kalla <quic_charante@...cinc.com>
>>>>> ---
>>>>>   drivers/iommu/iommu.c | 26 +++++++++++++++++++++++++-
>>>>>   1 file changed, 25 insertions(+), 1 deletion(-)
>>>>
>>>> Just a quick ask if you have any comments for the below. BTW, in
>>>> internal tests, when we revert the commit f188056352bc ("iommu: Avoid
>>>> locking/unlocking for iommu_probe_device()", it is helping.
>>>>
>>>> I can share more details If the below is not sufficient.
>>>>
>>>> Thanks in advance...
>>>
>>> Can you reproduce the issue on 6.13-rc5? We bodged some of the probe
>>> code recently:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=229e6ee43d2a160a1592b83aad620d6027084aad
>>>
>>> and a better set of fixes are queued in -next.
>>
>> Yes, we did try a better patch from Robin[1] and the issue is still
>> reproducible.
> 
> Did you try the -EPROBE_DEFER hack that got merged?
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=229e6ee43d2a160a1592b83aad620d6027084aad
> 
> Otherwise, you can try the stack queued for 6.14 in iommu/linux.git:
> 
> 46b3df8eb9bd iommu: Manage driver probe deferral better
> fcbd62156742 iommu/arm-smmu-v3: Clean up more on probe failure
> 97cb1fa02726 iommu/arm-smmu: Retire probe deferral workaround
> 7d835134d4e1 iommu/arm-smmu: Make instance lookup robust
> 
>> IIUC, these patches are trying to get valid smmu device, during the
>> probe of an smmu client device, by traversing proper 'klist_devices'
>> happening from iommu_init_device(). Please CMIW.
> 
> Well, they're fixing a race between the IOMMU probing and its client
> devices attaching.
> 
>> But this bug makes iommu_init_device() to completely gets skipped.
>>
>> static int __iommu_probe_device(struct device *dev, ...)
>> {
>> 	/* Device is probed already if in a group */
>>          if (dev->iommu_group)
>>                  return 0;  --> driver call returns from here.
>>
>>          ret = iommu_init_device(dev, ops);
>>          if (ret)
>>                  return ret;
>>
>>
>> }
>>
>> Regarding the testing, we work on Android that is on LTS kernels, so, it
>> may not be possible for us to try the thing on 6.13 and this seems a bug
>> on LTS 6.6 kernel.
> 
> We need to fix 6.13 before we fix 6.6 unless you can show that 6.13 is
> unaffected (in which case, we can backport the fix(es)).

Certainly the reasoning can't apply to mainline as given, since 
arch_setup_dma_ops() stopped touching IOMMU stuff at all back in 6.10 
(and indeed iommu_dma_ops itself no longer exists since 6.12).

Robin.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ