[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230328040724.GB25506@lst.de>
Date: Tue, 28 Mar 2023 06:07:24 +0200
From: Christoph Hellwig <hch@....de>
To: Petr Tesarik <petrtesarik@...weicloud.com>
Cc: Jonathan Corbet <corbet@....net>, Christoph Hellwig <hch@....de>,
Marek Szyprowski <m.szyprowski@...sung.com>,
Robin Murphy <robin.murphy@....com>,
Borislav Petkov <bp@...e.de>,
"Paul E. McKenney" <paulmck@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Randy Dunlap <rdunlap@...radead.org>,
Damien Le Moal <damien.lemoal@...nsource.wdc.com>,
Kim Phillips <kim.phillips@....com>,
"Steven Rostedt (Google)" <rostedt@...dmis.org>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
open list <linux-kernel@...r.kernel.org>,
"open list:DMA MAPPING HELPERS" <iommu@...ts.linux.dev>,
Roberto Sassu <roberto.sassu@...wei.com>, petr@...arici.cz,
Alexander Graf <graf@...zon.com>
Subject: Re: [RFC v1 3/4] swiotlb: Allow dynamic allocation of bounce
buffers
[adding Alex as he has been interested in this in the past]
On Mon, Mar 20, 2023 at 01:28:15PM +0100, Petr Tesarik wrote:
> Second, on the Raspberry Pi 4, swiotlb is used by dma-buf for pages
> moved from the rendering GPU (v3d driver), which can access all
> memory, to the display output (vc4 driver), which is connected to a
> bus with an address limit of 1 GiB and no IOMMU. These buffers can
> be large (several megabytes) and cannot be handled by SWIOTLB,
> because they exceed maximum segment size of 256 KiB. Such mapping
> failures can be easily reproduced on a Raspberry Pi4: Starting
> GNOME remote desktop results in a flood of kernel messages like
> these:
Shouldn't we make sure dma-buf allocates the buffers for the most
restricted devices, and more importantly does something like a dma
coherent allocation instead of a dynamic mapping of random memory?
While a larger swiotlb works around this I don't think this fixes the root
cause.
> 1. The value is limited to ULONG_MAX, which is too little both for
> physical addresses (e.g. x86 PAE or 32-bit ARM LPAE) and DMA
> addresses (e.g. Xen guests on 32-bit ARM).
>
> 2. Since buffers are currently allocated with page granularity, a
> PFN can be used instead. However, some values are reserved by
> the maple tree implementation. Liam suggests to use
> xa_mk_value() in that case, but that reduces the usable range by
> half. Luckily, 31 bits are still enough to hold a PFN on all
> 32-bit platforms.
>
> 3. Software IO TLB is used from interrupt context. The maple tree
> implementation is not IRQ-safe (MT_FLAGS_LOCK_IRQ does nothing
> AFAICS). Instead, I use an external lock, spin_lock_irqsave() and
> spin_unlock_irqrestore().
>
> Note that bounce buffers are never allocated dynamically if the
> software IO TLB is in fact a DMA restricted pool, which is intended
> to be stay in its designated location in physical memory.
I'm a little worried about all that because it causes quite a bit
of overhead even for callers that don't end up going into the
dynamic range or do not use swiotlb at all. I don't really have a
good answer here except for the usual avoid bounce buffering whenever
you can that might not always be easy to do.
> + gfp = (attrs & DMA_ATTR_MAY_SLEEP) ? GFP_KERNEL : GFP_NOWAIT;
> + slot = kmalloc(sizeof(*slot), gfp | __GFP_NOWARN);
> + if (!slot)
> + goto err;
> +
> + slot->orig_addr = orig_addr;
> + slot->alloc_size = alloc_size;
> + slot->page = dma_direct_alloc_pages(dev, PAGE_ALIGN(alloc_size),
> + &slot->dma_addr, dir,
> + gfp | __GFP_NOWARN);
> + if (!slot->page)
> + goto err_free_slot;
Without GFP_NOIO allocations this will deadlock eventually.
Powered by blists - more mailing lists