Message-ID: <78d77dd0-2ddb-4074-8f2a-debc7bc41fe1@linaro.org>
Date: Thu, 12 Jun 2025 16:47:26 +0100
From: James Clark <james.clark@...aro.org>
To: Mark Brown <broonie@...nel.org>
Cc: Vladimir Oltean <vladimir.oltean@....com>, Arnd Bergmann <arnd@...db.de>,
 Frank Li <Frank.li@....com>, Vladimir Oltean <olteanv@...il.com>,
 linux-spi@...r.kernel.org, imx@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] spi: spi-fsl-dspi: Use non-coherent memory for DMA



On 12/06/2025 3:43 pm, Mark Brown wrote:
> On Thu, Jun 12, 2025 at 03:14:32PM +0100, James Clark wrote:
>> On 12/06/2025 12:15 pm, Vladimir Oltean wrote:
> 
>>> FWIW, the XSPI FIFO performance should be higher.
> 
>> This leads me to realise a mistake in my original figures. My head was
>> stuck in target mode, where we use DMA, so I forgot to force DMA in host
>> mode when running the performance tests. The previous figures were all
>> XSPI mode, and the small difference in performance may just have been
>> down to the changed code layout.
> 
>> Changing it to DMA mode gives figures that make much more sense:
> 
> If not having DMA mode is making this much of a difference, shouldn't
> the driver just do that?  I'm not seeing runtime configuration in the
> driver, so I guess this is local hacking...
> 

Yes, just changed locally.
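
(Concretely, "changed locally" just means a hack along these lines,
forcing the transfer mode for the compatible under test in the driver's
devtype_data table; illustrative only, which entry gets edited depends
on the board:)

--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ ... @@
-	.trans_mode		= DSPI_XSPI_MODE,
+	.trans_mode		= DSPI_DMA_MODE,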

>> So for small transfers XSPI is slightly better, but for large ones DMA
>> is much better, with non-coherent memory giving another 800 kbps gain.
>> Perhaps we could find the midpoint and then auto-select the mode
>> depending on the size, but there may be latency to consider too, which
>> could be important.
> 
> This is a fairly normal pattern; it's a big part of why the can_dma()
> callback is per transfer: so you can do a copybreak and use PIO for
> smaller transfers, where the overhead of setting up DMA is often more
> than the overhead of just doing PIO.

Makes sense. Although two devices already use DMA in host mode for all
transfer sizes, and it's not clear to me why.
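
Something like the rough sketch below is presumably what that copybreak
would look like here (dspi_can_dma() and the 512-byte threshold are
placeholders I made up, not measured values):

#include <linux/spi/spi.h>

/* Hypothetical copybreak: only use DMA when the transfer is large
 * enough that the DMA setup cost pays for itself. 512 is a made-up
 * placeholder; the real crossover point would need to be measured.
 */
static bool dspi_can_dma(struct spi_controller *ctlr,
			 struct spi_device *spi,
			 struct spi_transfer *xfer)
{
	return xfer->len >= 512;
}

	/* ...and in probe: */
	ctlr->can_dma = dspi_can_dma;

The driver's transfer path would also need to check the same condition
when picking XSPI vs DMA, since at the moment the mode is fixed per
compatible in devtype_data.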

