Message-ID: <c65c752a-5b60-4f30-8d51-9a903ddd55a6@linaro.org>
Date: Thu, 12 Jun 2025 12:05:26 +0100
From: James Clark <james.clark@...aro.org>
To: Vladimir Oltean <vladimir.oltean@....com>, Arnd Bergmann <arnd@...db.de>,
 Frank Li <Frank.li@....com>
Cc: Vladimir Oltean <olteanv@...il.com>, Mark Brown <broonie@...nel.org>,
 linux-spi@...r.kernel.org, imx@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/4] spi: spi-fsl-dspi: Use non-coherent memory for DMA



On 11/06/2025 10:01 am, Vladimir Oltean wrote:
> On Tue, Jun 10, 2025 at 11:56:34AM -0400, Frank Li wrote:
>> Can you add performance benefit information after using non-coherent memory
>> to the commit message, so reviewers can easily see your intention?
> 
> To expand on that, you can post the output of something like this
> (before and after):
> $ spidev_test --device /dev/spidev1.0 --bpw 8 --size 256 --cpha --iter 10000000 --speed 10000000
> where /dev/spidev1.0 is an unconnected chip select with a dummy entry in
> the device tree.

Coherent (before):

rate: tx 385.8kbps, rx 385.8kbps
rate: tx 1215.7kbps, rx 1215.7kbps
rate: tx 1845.2kbps, rx 1845.2kbps
rate: tx 1844.0kbps, rx 1844.0kbps
rate: tx 1846.1kbps, rx 1846.1kbps
rate: tx 1844.8kbps, rx 1844.8kbps
rate: tx 1844.4kbps, rx 1844.4kbps
rate: tx 1846.9kbps, rx 1846.9kbps
rate: tx 1846.5kbps, rx 1846.5kbps
rate: tx 1843.2kbps, rx 1843.2kbps
rate: tx 1844.8kbps, rx 1844.8kbps
rate: tx 1845.2kbps, rx 1845.2kbps
rate: tx 1846.5kbps, rx 1846.5kbps

Non-coherent (after):

rate: tx 314.6kbps, rx 314.6kbps
rate: tx 748.3kbps, rx 748.3kbps
rate: tx 1845.2kbps, rx 1845.2kbps
rate: tx 1849.3kbps, rx 1849.3kbps
rate: tx 1846.1kbps, rx 1846.1kbps
rate: tx 1847.3kbps, rx 1847.3kbps
rate: tx 1845.7kbps, rx 1845.7kbps
rate: tx 1846.5kbps, rx 1846.5kbps
rate: tx 1844.4kbps, rx 1844.4kbps
rate: tx 1847.3kbps, rx 1847.3kbps
rate: tx 1847.3kbps, rx 1847.3kbps
rate: tx 1845.7kbps, rx 1845.7kbps
rate: tx 1846.5kbps, rx 1846.5kbps

Ignoring anything below 1800kbps as start-up, coherent averages
1845.2kbps and non-coherent 1846.5kbps. Not sure if that's just noise
or an actual effect.
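
For anyone who wants to reproduce those averages, here is a minimal sketch of the calculation in Python. It assumes the "rate: tx ...kbps" lines above were pasted into strings; the helper name steady_state_mean and the use of the tx column only are my own choices for illustration, not part of spidev_test.

```python
import re

def steady_state_mean(log: str, threshold: float = 1800.0) -> float:
    """Average the tx rates in a spidev_test log, skipping start-up samples.

    Hypothetical helper: samples below `threshold` (kbps) are treated as
    start-up and dropped before averaging.
    """
    rates = [float(r) for r in re.findall(r"rate: tx ([\d.]+)kbps", log)]
    steady = [r for r in rates if r >= threshold]
    return sum(steady) / len(steady)

# Output copied from the two runs in this mail.
coherent_log = """
rate: tx 385.8kbps, rx 385.8kbps
rate: tx 1215.7kbps, rx 1215.7kbps
rate: tx 1845.2kbps, rx 1845.2kbps
rate: tx 1844.0kbps, rx 1844.0kbps
rate: tx 1846.1kbps, rx 1846.1kbps
rate: tx 1844.8kbps, rx 1844.8kbps
rate: tx 1844.4kbps, rx 1844.4kbps
rate: tx 1846.9kbps, rx 1846.9kbps
rate: tx 1846.5kbps, rx 1846.5kbps
rate: tx 1843.2kbps, rx 1843.2kbps
rate: tx 1844.8kbps, rx 1844.8kbps
rate: tx 1845.2kbps, rx 1845.2kbps
rate: tx 1846.5kbps, rx 1846.5kbps
"""

noncoherent_log = """
rate: tx 314.6kbps, rx 314.6kbps
rate: tx 748.3kbps, rx 748.3kbps
rate: tx 1845.2kbps, rx 1845.2kbps
rate: tx 1849.3kbps, rx 1849.3kbps
rate: tx 1846.1kbps, rx 1846.1kbps
rate: tx 1847.3kbps, rx 1847.3kbps
rate: tx 1845.7kbps, rx 1845.7kbps
rate: tx 1846.5kbps, rx 1846.5kbps
rate: tx 1844.4kbps, rx 1844.4kbps
rate: tx 1847.3kbps, rx 1847.3kbps
rate: tx 1847.3kbps, rx 1847.3kbps
rate: tx 1845.7kbps, rx 1845.7kbps
rate: tx 1846.5kbps, rx 1846.5kbps
"""

coherent_avg = steady_state_mean(coherent_log)
noncoherent_avg = steady_state_mean(noncoherent_log)
print(f"coherent: {coherent_avg:.1f}kbps, non-coherent: {noncoherent_avg:.1f}kbps")
```

Each run leaves 11 samples after the 1800kbps cut-off, so the two averages are taken over the same number of points.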

With stress running in the background the difference in average over 17 
runs is slightly more significant:

   stress -m 8 --vm-stride 1 --vm-bytes 64MB

Coherent: 2105.5kbps
Non-coherent: 2125.6kbps

There's not much variance in the runs either: they're pretty much always
2105 and 2125 +-1, so I don't think this result is noise.
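
To put the two comparisons on the same scale, the differences can be expressed as a percentage of the coherent figure. This is only a sketch using the averages quoted in this mail; relative_gain is a name I made up for the illustration.

```python
def relative_gain(before_kbps: float, after_kbps: float) -> float:
    """Percentage change going from the coherent to the non-coherent result."""
    return (after_kbps - before_kbps) / before_kbps * 100.0

# Averages quoted above (kbps): idle system, then with stress running.
idle_gain = relative_gain(1845.2, 1846.5)
stress_gain = relative_gain(2105.5, 2125.6)

print(f"idle: {idle_gain:.2f}%  under stress: {stress_gain:.2f}%")
```

That works out to roughly 0.07% when idle and 0.95% under stress, so the effect is small in both cases but more than ten times larger with memory pressure.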

(No idea why it goes faster when it's under load, but I hope that can be 
ignored for this test)

