Message-ID: <20250630152612.npdobwbcezl5nlym@skbuf>
Date: Mon, 30 Jun 2025 18:26:12 +0300
From: Vladimir Oltean <vladimir.oltean@....com>
To: James Clark <james.clark@...aro.org>
Cc: Vladimir Oltean <olteanv@...il.com>, Mark Brown <broonie@...nel.org>,
	Arnd Bergmann <arnd@...db.de>,
	Larisa Grigore <larisa.grigore@....com>,
	Frank Li <Frank.li@....com>, Christoph Hellwig <hch@....de>,
	linux-spi@...r.kernel.org, imx@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 0/6] spi: spi-fsl-dspi: Target mode improvements

On Fri, Jun 27, 2025 at 11:21:36AM +0100, James Clark wrote:
> Improve usability of target mode by reporting FIFO errors and increasing
> the buffer size when DMA is used. While we're touching DMA stuff also
> switch to non-coherent memory, although this is unrelated to target
> mode.
> 
> The first commit is marked as a fix because it can fix intermittent
> issues with existing transfers, rather than the later fixes which
> improve larger than FIFO target mode transfers which would have never
> worked.
> 
> With the combination of the commit to increase the DMA buffer size and
> the commit to use non-coherent memory, the host mode performance figures
> are as follows on S32G3:
> 
>   # spidev_test --device /dev/spidev1.0 --bpw 8 --size <test_size> --cpha --iter 10000000 --speed 10000000
> 
>   Coherent (4096 byte transfers): 6534 kbps
>   Non-coherent:                   7347 kbps
> 
>   Coherent (16 byte transfers):    447 kbps
>   Non-coherent:                    448 kbps
> 
> Just for comparison running the same test in XSPI mode:
> 
>   4096 byte transfers:            2143 kbps
>   16 byte transfers:               637 kbps
> 
> These tests required hacking S32G3 to use DMA in host mode, although
> the figures should be representative of target mode too where DMA is
> used. And the other devices that use DMA in host mode should see similar
> improvements.
> 
> Signed-off-by: James Clark <james.clark@...aro.org>
> ---

My test numbers on LS1028A:

Baseline XSPI (unmodified driver):
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
rate: tx 2710.6kbps, rx 2710.6kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
rate: tx 3217.5kbps, rx 3217.5kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
rate: tx 5118.4kbps, rx 5118.4kbps

Baseline DMA (modified just DSPI_XSPI_MODE -> DSPI_DMA_MODE):
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
rate: tx 1359.5kbps, rx 1359.5kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
rate: tx 1461.1kbps, rx 1461.1kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
rate: tx 1664.6kbps, rx 1664.6kbps

Intermediary LS1028A DMA mode (using non-coherent buffers but still
small DMA buffers, i.e. holding just 1 FIFO size worth of data):
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
rate: tx 1345.1kbps, rx 1345.1kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
rate: tx 1522.5kbps, rx 1522.5kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
rate: tx 1690.8kbps, rx 1690.8kbps

Final LS1028A DMA mode (with the patch to send large messages as a
single DMA buffer applied):
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 8 --cpha --iter 10000000 --speed 10000000
rate: tx 2247.0kbps, rx 2247.0kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 16 --cpha --iter 10000000 --speed 10000000
rate: tx 3477.4kbps, rx 3477.4kbps
$ ./spidev_test --device /dev/spidev2.1 --bpw 8 --size 4096 --cpha --iter 10000000 --speed 10000000
rate: tx 8978.4kbps, rx 8978.4kbps

So after your patch set, DMA mode on LS1028A outperforms XSPI for the
16 byte and 4096 byte transfers (it is still slower for the 8 byte ones)
and should replace XSPI. This is an outstanding result. The switch to DMA
mode can be done as follow-up work.
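
The comparison behind this conclusion can be checked with a few lines of
Python. This is only a sketch: the table is transcribed from the
spidev_test runs above, and RATES, ratio() and summary() are hypothetical
names chosen for illustration, not part of spidev_test or the driver.

```python
# Throughput in kbps, transcribed from the LS1028A spidev_test runs above.
# Keys of the inner dicts are the --size values used in each run.
RATES = {
    "baseline XSPI": {8: 2710.6, 16: 3217.5, 4096: 5118.4},
    "baseline DMA": {8: 1359.5, 16: 1461.1, 4096: 1664.6},
    "non-coherent DMA": {8: 1345.1, 16: 1522.5, 4096: 1690.8},
    "final DMA": {8: 2247.0, 16: 3477.4, 4096: 8978.4},
}


def ratio(mode, reference, size):
    """Throughput of `mode` relative to `reference` for one transfer size."""
    return RATES[mode][size] / RATES[reference][size]


def summary(mode="final DMA", reference="baseline XSPI"):
    """Map each transfer size to the mode/reference ratio, rounded to 2 places."""
    return {size: round(ratio(mode, reference, size), 2)
            for size in sorted(RATES[mode])}


if __name__ == "__main__":
    for size, r in summary().items():
        print(f"{size:>5} bytes: final DMA / baseline XSPI = {r:.2f}")
```

With these figures, final DMA comes out at roughly 0.83x, 1.08x and 1.75x
of baseline XSPI for 8, 16 and 4096 byte transfers respectively.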
