linux-kernel - Re: [PATCH 2/4] spi: spi-fsl-dspi: Use non-coherent memory for DMA

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <56b158fa-5fd6-4582-8ca1-296863d6d2a8@sirena.org.uk>
Date: Thu, 12 Jun 2025 15:43:40 +0100
From: Mark Brown <broonie@...nel.org>
To: James Clark <james.clark@...aro.org>
Cc: Vladimir Oltean <vladimir.oltean@....com>,
	Arnd Bergmann <arnd@...db.de>, Frank Li <Frank.li@....com>,
	Vladimir Oltean <olteanv@...il.com>, linux-spi@...r.kernel.org,
	imx@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] spi: spi-fsl-dspi: Use non-coherent memory for DMA

On Thu, Jun 12, 2025 at 03:14:32PM +0100, James Clark wrote:
> On 12/06/2025 12:15 pm, Vladimir Oltean wrote:

> > FWIW, the XSPI FIFO performance should be higher.

> This leads me to realise a mistake in my original figures. My head was stuck
> in target mode where we use DMA so I forgot to force DMA in host mode to run
> the performance tests. The previous figures were all XSPI mode and the small
> difference in performance could have been just down to the layout of the
> code changing?

> Changing it to DMA mode gives figures that make much more sense:

If not having DMA mode is making this much of a difference shouldn't the
driver just do that?  I'm not seeing runtime configuration in the driver
so I guess this is local hacking...

> So for small transfers XSPI is slightly better but for large ones DMA is
> much better, with non-coherent memory giving another 800kbps gain. Perhaps
> we could find the midpoint and then auto select the mode depending on the
> size, but maybe there is latency to consider too which could be important.

This is a fairly normal pattern, it's a big part of why the can_dma()
callback is per transfer - so you can do a copybreak and use PIO for
smaller transfers where the overhead of setting up DMA is often more
than the overhead of just doing PIO.

Download attachment "signature.asc" of type "application/pgp-signature" (489 bytes)