Message-ID: <90ea3460-a715-47b6-a151-181e542512e9@huaweicloud.com>
Date: Mon, 6 Nov 2023 13:46:35 +0100
From: Petr Tesarik <petrtesarik@...weicloud.com>
To: Christoph Hellwig <hch@....de>,
Petr Tesařík <petr@...arici.cz>
Cc: Niklas Schnelle <schnelle@...ux.ibm.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Robin Murphy <robin.murphy@....com>,
Ross Lagerwall <ross.lagerwall@...rix.com>,
linux-pci <linux-pci@...r.kernel.org>,
linux-kernel@...r.kernel.org, iommu@...ts.linux.dev,
Matthew Rosato <mjrosato@...ux.ibm.com>,
Halil Pasic <pasic@...ux.ibm.com>
Subject: Re: Memory corruption with CONFIG_SWIOTLB_DYNAMIC=y

Hi Christoph,

On 11/6/2023 8:44 AM, Christoph Hellwig wrote:
> On Fri, Nov 03, 2023 at 07:59:49PM +0100, Petr Tesařík wrote:
>> I don't think it's possible to improve the allocation logic without
>> modifying the page allocator and/or the DMA atomic pool allocator to
>> take additional constraints into account.
>>
>> I had a wild idea back in March, but it would require some intrusive
>> changes in the mm subsystem. Among other things, it would make memory
>> zones obsolete. I mean, people may actually like to get rid of DMA,
>> DMA32 and NORMAL, but you see how many nasty bugs were introduced even
>> by a relatively small change in SWIOTLB. Replacing memory zones with a
>> system based on generic physical allocation constraints would probably
>> blow up the universe. ;-)
>
> It would be very nice, at least for DMA32 or the 30/31-bit DMA pools
> used on some architectures. For the x86-style 16MB zone DMA I suspect
> just having a small pool on the side that's not even exposed to the
> memory allocator would probably work better.
>
> I think a lot of the MM folks would love to be able to kill off the
> extra zones.
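
Just to make it concrete for anyone following along: the zone constraint
is wired into the allocation API itself, so today's callers look roughly
like this (a rough sketch, not taken from any particular driver):

#include <linux/gfp.h>
#include <linux/slab.h>

static void zone_constrained_alloc_sketch(void)
{
	/* ZONE_DMA32: a page guaranteed to sit below 4 GiB */
	struct page *pg = alloc_pages(GFP_KERNEL | GFP_DMA32, 0);

	/* ZONE_DMA: on x86, memory below 16 MiB */
	void *low = kmalloc(512, GFP_KERNEL | GFP_DMA);

	kfree(low);
	if (pg)
		__free_pages(pg, 0);
}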

There's more to it. If you look at DMA buffer allocations, they need
memory that is contiguous in the DMA address space of the requesting
device, but we allocate buffers that are contiguous in the CPU's
physical address space. This difference is in fact responsible for some
of the odd DMA address limits.

All hell breaks loose when you try to fix this properly. Instead, we get
away with the observation that physically contiguous memory regions
coincide with DMA-contiguous regions on real-world systems. But if
anyone feels like starting from scratch, they could also take the extra
time to look at this part. ;-)
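
To spell the assumption out, the shortcut amounts to something like this
(a deliberately simplified sketch, not the real dma-direct code):

#include <linux/types.h>

static inline dma_addr_t naive_phys_to_dma(phys_addr_t paddr, u64 dma_offset)
{
	/*
	 * A physically contiguous buffer stays contiguous in DMA address
	 * space only because the translation is a constant offset.
	 */
	return (dma_addr_t)paddr + dma_offset;
}
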
FWIW I'm not volunteering, or at least not this year.
Petr T