Date:   Tue, 31 May 2022 17:01:46 +0100
From:   Robin Murphy <robin.murphy@....com>
To:     Jason Gunthorpe <jgg@...dia.com>
Cc:     Baolu Lu <baolu.lu@...ux.intel.com>,
        Joerg Roedel <joro@...tes.org>,
        Kevin Tian <kevin.tian@...el.com>,
        Ashok Raj <ashok.raj@...el.com>,
        Christoph Hellwig <hch@...radead.org>,
        Will Deacon <will@...nel.org>, Liu Yi L <yi.l.liu@...el.com>,
        Jacob jun Pan <jacob.jun.pan@...el.com>,
        iommu@...ts.linux-foundation.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 01/12] iommu/vt-d: Use iommu_get_domain_for_dev() in
 debugfs

On 2022-05-31 16:13, Jason Gunthorpe wrote:
> On Tue, May 31, 2022 at 04:01:04PM +0100, Robin Murphy wrote:
>> On 2022-05-31 15:53, Jason Gunthorpe wrote:
>>> On Tue, May 31, 2022 at 10:11:18PM +0800, Baolu Lu wrote:
>>>> On 2022/5/31 21:10, Jason Gunthorpe wrote:
>>>>> On Tue, May 31, 2022 at 11:02:06AM +0800, Baolu Lu wrote:
>>>>>
>>>>>> For case 2, it is a bit weird. I tried to add a rwsem lock to make the
>>>>>> iommu_unmap() and dumping tables in debugfs exclusive. This does not
>>>>>> work because debugfs may depend on the DMA of the devices to work. It
>>>>>> seems that what we can do is to allow this race, but when we traverse
>>>>>> the page table in debugfs, we will check the validity of the physical
>>>>>> address retrieved from the page table entry. Then, the worst case is to
>>>>>> print some useless information.
>>>>>
>>>>> Sounds horrible, don't you have locking around the IOPTEs of some
>>>>> kind? How does updating them work reliably?
>>>>
>>>> There's no locking around updating the IOPTEs. The basic assumption is
>>>> that at any time, there's only a single thread manipulating the mappings
>>>> of the range specified in iommu_map/unmap() APIs. Therefore, the race
>>>> only exists when multiple ranges share some high-level IOPTEs. The IOMMU
>>>> driver updates those IOPTEs using the compare-and-exchange atomic
>>>> operation.
>>>
>>> Oh? Did I miss where that was documented as part of the iommu API?
>>>
>>> Daniel posted patches for VFIO to multi-thread iommu_domain mapping.
>>>
>>> iommufd goes out of its way to avoid this kind of serialization so
>>> that userspace can parallel map IOVA.
>>>
>>> I think if this is the requirement then the iommu API needs to
>>> provide a lock around the domain for the driver..
>>
>> Eww, no, we can't kill performance by forcing serialisation on the entire
>> API just for one silly driver-internal debugfs corner :(
> 
> I'm not worried about debugfs, I'm worried about these efforts to
> speed up VFIO VM booting by parallel domain loading:
> 
> https://lore.kernel.org/kvm/20220106004656.126790-1-daniel.m.jordan@oracle.com/
> 
> The DMA API should maintain its own external lock, but the general
> domain interface to the rest of the kernel should be safe, IMHO.

The DMA API doesn't need locking, partly since it can trust itself not 
to do stupid things, and mostly because it's DMA API performance that's 
fundamentally incompatible with serialisation anyway. Why do you think 
we have a complicated per-CPU IOVA caching mechanism, if not to support 
big multi-queue devices with multiple CPU threads mapping/unmapping in 
different parts of the same DMA domain concurrently?

> Or at least it should be documented..

As far as I'm aware there's never been any restriction at the IOMMU API 
level. It should be self-evident that racing concurrent 
iommu_{map,unmap} requests against iommu_domain_free(), or against each 
other for overlapping IOVAs, is a bad idea, and that you shouldn't do 
that if you want a well-defined outcome. The simpler drivers already 
serialise on a 
per-domain lock internally, while the more performance-focused ones 
implement lock-free atomic pagetable management in a similar style to 
CPU arch code; either way it should work fine as-is. The difference with 
debugfs is that it's a completely orthogonal side-channel - an 
iommu_domain user like VFIO or iommu-dma can make sure its *own* API 
usage is sane, but can't be aware of the user triggering some 
driver-internal introspection of that domain in a manner that could race 
more harmfully. It has to be down to individual drivers to make that 
safe if they choose to expose such an interface.

Robin.
