[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <ecf3d8db-8533-4b0d-ab49-3543a766fd99@broadcom.com>
Date: Wed, 15 Jan 2025 14:16:11 -0800
From: Florian Fainelli <florian.fainelli@...adcom.com>
To: Robin Murphy <robin.murphy@....com>, linux-kernel@...r.kernel.org
Cc: bcm-kernel-feedback-list@...adcom.com,
Justin Chen <justin.chen@...adcom.com>, Jonathan Corbet <corbet@....net>,
Christoph Hellwig <hch@....de>, Marek Szyprowski <m.szyprowski@...sung.com>,
Steven Rostedt <rostedt@...dmis.org>, Masami Hiramatsu
<mhiramat@...nel.org>, Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Petr Tesarik <petr@...arici.cz>, Michael Kelley <mhklinux@...look.com>,
Lukas Bulwahn <lukas.bulwahn@...hat.com>,
Bagas Sanjaya <bagasdotme@...il.com>, Eder Zulian <ezulian@...hat.com>,
Sean Anderson <sean.anderson@...ux.dev>,
"open list:DOCUMENTATION" <linux-doc@...r.kernel.org>,
"open list:DMA MAPPING HELPERS" <iommu@...ts.linux.dev>,
"open list:TRACING" <linux-trace-kernel@...r.kernel.org>
Subject: Re: [PATCH] swiotlb: Introduce DMA_ATTR_SKIP_DEVICE_SYNC
On 1/15/25 12:12, Robin Murphy wrote:
> On 2025-01-15 7:46 pm, Florian Fainelli wrote:
>> From: Justin Chen <justin.chen@...adcom.com>
>>
>> Network device driver's receive path typically do the following:
>>
>> - dma_map_single(.., DMA_FROM_DEVICE)
>> - dma_sync_single_for_cpu() to allow the CPU to inspect packet
>> descriptors
>> - dma_unmap_single(.., DMA_FROM_DEVICE) when releasing the buffer
>>
>> Each of those operations incurs a copy from the original buffer to the
>> TLB buffer, even if the device is known to be writing full buffers.
>>
>> Add a DMA_ATTR_SKIP_DEVICE_SYNC flag which can be set by device drivers
>> to skip the copy at dma_map_single() to speed up the RX path when the
>> device is known to be doing full buffer writes.
>>
>> This has been seen to provide a 20% speedup for Wi-Fi RX throughput
>> testing.
>>
>> Signed-off-by: Justin Chen <justin.chen@...adcom.com>
>> [florian: commit message, add DMA-API attribute flag]
>> Signed-off-by: Florian Fainelli <florian.fainelli@...adcom.com>
>> ---
>> Documentation/core-api/dma-attributes.rst | 9 +++++++++
>> Documentation/core-api/swiotlb.rst | 4 +++-
>> include/linux/dma-mapping.h | 6 ++++++
>> include/trace/events/dma.h | 3 ++-
>> kernel/dma/swiotlb.c | 8 ++++++++
>> 5 files changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/Documentation/core-api/dma-attributes.rst b/
>> Documentation/core-api/dma-attributes.rst
>> index 1887d92e8e92..ccd9c1891200 100644
>> --- a/Documentation/core-api/dma-attributes.rst
>> +++ b/Documentation/core-api/dma-attributes.rst
>> @@ -130,3 +130,12 @@ accesses to DMA buffers in both privileged
>> "supervisor" and unprivileged
>> subsystem that the buffer is fully accessible at the elevated privilege
>> level (and ideally inaccessible or at least read-only at the
>> lesser-privileged levels).
>> +
>> +DMA_ATTR_SKIP_DEVICE_SYNC
>> +-------------------------
>> +
>> +Device drivers can set DMA_ATTR_SKIP_DEVICE_SYNC in order to avoid
>> doing a copy
>> +from the original buffer to the TLB buffer for dma_map_single() with a
>> +DMA_FROM_DEVICE direction. This can be used to save an extra copy in
>> a device
>> +driver's data path when using swiotlb bounce buffering.
>> +
>> diff --git a/Documentation/core-api/swiotlb.rst b/Documentation/core-
>> api/swiotlb.rst
>> index 9e0fe027dd3b..3bc1f9ba67b2 100644
>> --- a/Documentation/core-api/swiotlb.rst
>> +++ b/Documentation/core-api/swiotlb.rst
>> @@ -67,7 +67,9 @@ to the driver for programming into the device. If a
>> DMA operation specifies
>> multiple memory buffer segments, a separate bounce buffer must be
>> allocated for
>> each segment. swiotlb_tbl_map_single() always does a "sync"
>> operation (i.e., a
>> CPU copy) to initialize the bounce buffer to match the contents of
>> the original
>> -buffer.
>> +buffer, except if DMA_ATTR_SKIP_DEVICE_SYNC is specified and the
>> direction is
>> +DMA_FROM_DEVICE. This is a performance optimization that may not be
>> suitable for
>> +all platforms.
>> swiotlb_tbl_unmap_single() does the reverse. If the DMA operation
>> might have
>> updated the bounce buffer memory and DMA_ATTR_SKIP_CPU_SYNC is not
>> set, the
>> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
>> index b79925b1c433..bfdaa65f8e9d 100644
>> --- a/include/linux/dma-mapping.h
>> +++ b/include/linux/dma-mapping.h
>> @@ -58,6 +58,12 @@
>> */
>> #define DMA_ATTR_PRIVILEGED (1UL << 9)
>> +/*
>> + * DMA_ATTR_SKIP_DEVICE_SYNC: used to indicate that the buffer does
>> not need to
>> + * be synchronized to the device.
>> + */
>> +#define DMA_ATTR_SKIP_DEVICE_SYNC (1UL << 10)
>> +
>> /*
>> * A dma_addr_t can hold any valid DMA or bus address for the
>> platform. It can
>> * be given to a device to use as a DMA source or target. It is
>> specific to a
>> diff --git a/include/trace/events/dma.h b/include/trace/events/dma.h
>> index d8ddc27b6a7c..6eb8fd7e3515 100644
>> --- a/include/trace/events/dma.h
>> +++ b/include/trace/events/dma.h
>> @@ -31,7 +31,8 @@ TRACE_DEFINE_ENUM(DMA_NONE);
>> { DMA_ATTR_FORCE_CONTIGUOUS, "FORCE_CONTIGUOUS" }, \
>> { DMA_ATTR_ALLOC_SINGLE_PAGES, "ALLOC_SINGLE_PAGES" }, \
>> { DMA_ATTR_NO_WARN, "NO_WARN" }, \
>> - { DMA_ATTR_PRIVILEGED, "PRIVILEGED" })
>> + { DMA_ATTR_PRIVILEGED, "PRIVILEGED" }, \
>> + { DMA_ATTR_SKIP_DEVICE_SYNC, "SKIP_DEVICE_SYNC" })
>> DECLARE_EVENT_CLASS(dma_map,
>> TP_PROTO(struct device *dev, phys_addr_t phys_addr, dma_addr_t
>> dma_addr,
>> diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
>> index abcf3fa63a56..8dab89bf5e33 100644
>> --- a/kernel/dma/swiotlb.c
>> +++ b/kernel/dma/swiotlb.c
>> @@ -1435,8 +1435,16 @@ phys_addr_t swiotlb_tbl_map_single(struct
>> device *dev, phys_addr_t orig_addr,
>> * the original data, even if it's garbage, is necessary to match
>> * hardware behavior. Use of swiotlb is supposed to be
>> transparent,
>> * i.e. swiotlb must not corrupt memory by clobbering unwritten
>> bytes.
>> + *
>> + * Setting DMA_ATTR_SKIP_DEVICE_SYNC will negate the behavior
>> described
>> + * before and avoid the copy from the original buffer to the TLB
>> + * buffer.
>> */
>> + if (dir == DMA_FROM_DEVICE && (attrs & DMA_ATTR_SKIP_DEVICE_SYNC))
>> + goto out;
>
> Nope, we still need to initialise the SWIOTLB slot with *something*, or
> we're just reintroducing the same data leakage issue again. The whole
> deal with that was that the caller *did* expect the entire buffer to be
> written, but the device had an error, and thus the subsequent unmap
> bounced out whatever data was in SWIOTLB before.
Do you know how common that assumption is within drivers? Could that
behavior been hidden behind a flag, sort of like what is being done here?
>
> A memset is hopefully at least a bit faster than a copy, so maybe there
> is still some life in this idea, but the semantic is not "skip sync",
> it's "I'm OK with this entire buffer getting scribbled even my device
> ends up never touching it".
Sure, let me test with a memset(), any suggestion on a name that carries
better semantics, I do agree that "skip sync" is not quite descriptive
enough.
--
Florian
Powered by blists - more mailing lists