Message-ID: <aTfNoU6fKBOcjL5j@ryzen>
Date: Tue, 9 Dec 2025 08:20:01 +0100
From: Niklas Cassel <cassel@...nel.org>
To: Frank Li <Frank.Li@....com>
Cc: Vinod Koul <vkoul@...nel.org>, Manivannan Sadhasivam <mani@...nel.org>,
	Krzysztof Wilczyński <kwilczynski@...nel.org>,
	Kishon Vijay Abraham I <kishon@...nel.org>,
	Bjorn Helgaas <bhelgaas@...gle.com>, Christoph Hellwig <hch@....de>,
	Sagi Grimberg <sagi@...mberg.me>,
	Chaitanya Kulkarni <kch@...dia.com>,
	Herbert Xu <herbert@...dor.apana.org.au>,
	"David S. Miller" <davem@...emloft.net>,
	Nicolas Ferre <nicolas.ferre@...rochip.com>,
	Alexandre Belloni <alexandre.belloni@...tlin.com>,
	Claudiu Beznea <claudiu.beznea@...on.dev>,
	Koichiro Den <den@...inux.co.jp>, dmaengine@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
	linux-nvme@...ts.infradead.org, mhi@...ts.linux.dev,
	linux-arm-msm@...r.kernel.org, linux-crypto@...r.kernel.org,
	linux-arm-kernel@...ts.infradead.org, imx@...ts.linux.dev
Subject: Re: [PATCH 0/8] dmaengine: Add new API to combine configuration and
 descriptor preparation

On Mon, Dec 08, 2025 at 12:09:39PM -0500, Frank Li wrote:
> Frank Li (8):
>       dmaengine: Add API to combine configuration and preparation (sg and single)
>       PCI: endpoint: pci-epf-test: use new DMA API to simple code
>       dmaengine: dw-edma: Use new .device_prep_slave_sg_config() callback
>       dmaengine: dw-edma: Pass dma_slave_config to dw_edma_device_transfer()
>       nvmet: pci-epf: Remove unnecessary dmaengine_terminate_sync() on each DMA transfer
>       nvmet: pci-epf: Use dmaengine_prep_slave_single_config() API
>       PCI: epf-mhi:Using new API dmaengine_prep_slave_single_config() to simple code.
>       crypto: atmel: Use dmaengine_prep_slave_single_config() API
> 
>  drivers/crypto/atmel-aes.c                    | 10 ++---
>  drivers/dma/dw-edma/dw-edma-core.c            | 38 +++++++++++-----
>  drivers/nvme/target/pci-epf.c                 | 21 +++------
>  drivers/pci/endpoint/functions/pci-epf-mhi.c  | 52 +++++++---------------
>  drivers/pci/endpoint/functions/pci-epf-test.c |  8 +---
>  include/linux/dmaengine.h                     | 64 ++++++++++++++++++++++++---
>  6 files changed, 111 insertions(+), 82 deletions(-)
> ---
> base-commit: bc04acf4aeca588496124a6cf54bfce3db327039
> change-id: 20251204-dma_prep_config-654170d245a2

For the series (tested using drivers/nvme/target/pci-epf.c):
Tested-by: Niklas Cassel <cassel@...nel.org>

Mainline:
  Rnd read,    4KB,  QD=1, 1 job :  IOPS=5721, BW=22.3MiB/s (23.4MB/s)
  Rnd read,    4KB, QD=32, 1 job :  IOPS=51.8k, BW=202MiB/s (212MB/s)
  Rnd read,    4KB, QD=32, 4 jobs:  IOPS=109k, BW=426MiB/s (447MB/s)
  Rnd read,  128KB,  QD=1, 1 job :  IOPS=2678, BW=335MiB/s (351MB/s)
  Rnd read,  128KB, QD=32, 1 job :  IOPS=19.1k, BW=2388MiB/s (2504MB/s)
  Rnd read,  128KB, QD=32, 4 jobs:  IOPS=18.1k, BW=2258MiB/s (2368MB/s)
  Rnd read,  512KB,  QD=1, 1 job :  IOPS=1388, BW=694MiB/s (728MB/s)
  Rnd read,  512KB, QD=32, 1 job :  IOPS=4554, BW=2277MiB/s (2388MB/s)
  Rnd read,  512KB, QD=32, 4 jobs:  IOPS=4516, BW=2258MiB/s (2368MB/s)
  Rnd write,   4KB,  QD=1, 1 job :  IOPS=4679, BW=18.3MiB/s (19.2MB/s)
  Rnd write,   4KB, QD=32, 1 job :  IOPS=35.1k, BW=137MiB/s (144MB/s)
  Rnd write,   4KB, QD=32, 4 jobs:  IOPS=33.7k, BW=132MiB/s (138MB/s)
  Rnd write, 128KB,  QD=1, 1 job :  IOPS=2490, BW=311MiB/s (326MB/s)
  Rnd write, 128KB, QD=32, 1 job :  IOPS=4964, BW=621MiB/s (651MB/s)
  Rnd write, 128KB, QD=32, 4 jobs:  IOPS=4966, BW=621MiB/s (651MB/s)
  Seq read,  128KB,  QD=1, 1 job :  IOPS=2586, BW=323MiB/s (339MB/s)
  Seq read,  128KB, QD=32, 1 job :  IOPS=17.5k, BW=2190MiB/s (2296MB/s)
  Seq read,  512KB,  QD=1, 1 job :  IOPS=1614, BW=807MiB/s (847MB/s)
  Seq read,  512KB, QD=32, 1 job :  IOPS=4540, BW=2270MiB/s (2381MB/s)
  Seq read,    1MB, QD=32, 1 job :  IOPS=2283, BW=2284MiB/s (2395MB/s)
  Seq write, 128KB,  QD=1, 1 job :  IOPS=2313, BW=289MiB/s (303MB/s)
  Seq write, 128KB, QD=32, 1 job :  IOPS=4948, BW=619MiB/s (649MB/s)
  Seq write, 512KB,  QD=1, 1 job :  IOPS=901, BW=451MiB/s (473MB/s)
  Seq write, 512KB, QD=32, 1 job :  IOPS=1289, BW=645MiB/s (676MB/s)
  Seq write,   1MB, QD=32, 1 job :  IOPS=632, BW=633MiB/s (663MB/s)
  Rnd rdwr, 4K..1MB, QD=8, 4 jobs:  IOPS=1756, BW=880MiB/s (923MB/s)
                                    IOPS=1767, BW=886MiB/s (929MB/s)


Mainline + this series applied:
  Rnd read,    4KB,  QD=1, 1 job :  IOPS=3681, BW=14.4MiB/s (15.1MB/s)
  Rnd read,    4KB, QD=32, 1 job :  IOPS=54.8k, BW=214MiB/s (224MB/s)
  Rnd read,    4KB, QD=32, 4 jobs:  IOPS=123k, BW=479MiB/s (502MB/s)
  Rnd read,  128KB,  QD=1, 1 job :  IOPS=2132, BW=267MiB/s (280MB/s)
  Rnd read,  128KB, QD=32, 1 job :  IOPS=19.0k, BW=2369MiB/s (2485MB/s)
  Rnd read,  128KB, QD=32, 4 jobs:  IOPS=18.7k, BW=2341MiB/s (2454MB/s)
  Rnd read,  512KB,  QD=1, 1 job :  IOPS=1135, BW=568MiB/s (595MB/s)
  Rnd read,  512KB, QD=32, 1 job :  IOPS=4546, BW=2273MiB/s (2384MB/s)
  Rnd read,  512KB, QD=32, 4 jobs:  IOPS=4708, BW=2354MiB/s (2469MB/s)
  Rnd write,   4KB,  QD=1, 1 job :  IOPS=3369, BW=13.2MiB/s (13.8MB/s)
  Rnd write,   4KB, QD=32, 1 job :  IOPS=31.7k, BW=124MiB/s (130MB/s)
  Rnd write,   4KB, QD=32, 4 jobs:  IOPS=31.1k, BW=122MiB/s (127MB/s)
  Rnd write, 128KB,  QD=1, 1 job :  IOPS=1820, BW=228MiB/s (239MB/s)
  Rnd write, 128KB, QD=32, 1 job :  IOPS=5703, BW=713MiB/s (748MB/s)
  Rnd write, 128KB, QD=32, 4 jobs:  IOPS=5813, BW=727MiB/s (762MB/s)
  Seq read,  128KB,  QD=1, 1 job :  IOPS=1958, BW=245MiB/s (257MB/s)
  Seq read,  128KB, QD=32, 1 job :  IOPS=18.8k, BW=2345MiB/s (2459MB/s)
  Seq read,  512KB,  QD=1, 1 job :  IOPS=1319, BW=660MiB/s (692MB/s)
  Seq read,  512KB, QD=32, 1 job :  IOPS=4542, BW=2271MiB/s (2382MB/s)
  Seq read,    1MB, QD=32, 1 job :  IOPS=2325, BW=2325MiB/s (2438MB/s)
  Seq write, 128KB,  QD=1, 1 job :  IOPS=2174, BW=272MiB/s (285MB/s)
  Seq write, 128KB, QD=32, 1 job :  IOPS=5697, BW=712MiB/s (747MB/s)
  Seq write, 512KB,  QD=1, 1 job :  IOPS=1035, BW=518MiB/s (543MB/s)
  Seq write, 512KB, QD=32, 1 job :  IOPS=1462, BW=731MiB/s (767MB/s)
  Seq write,   1MB, QD=32, 1 job :  IOPS=720, BW=721MiB/s (756MB/s)
  Rnd rdwr, 4K..1MB, QD=8, 4 jobs:  IOPS=2029, BW=1018MiB/s (1067MB/s)
                                    IOPS=2037, BW=1023MiB/s (1072MB/s)


Small performance boost, but I think the nicest thing about this series is
being able to remove the ugly mutex in pci-epf.c.
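
To put numbers on the difference, the per-test change in IOPS can be
computed from the two tables above. The following is only a minimal sketch:
the rows are a hand-picked subset copied from the tables, the helper name
pct_change() is illustrative, and it ignores run-to-run variance.

```python
# IOPS pairs copied from the two tables in this mail: (mainline, with series).
# Only a subset of rows is listed; add the remaining rows the same way.
RESULTS = {
    "Rnd read   4KB QD=1  1 job": (5721, 3681),
    "Rnd read 128KB QD=1  1 job": (2678, 2132),
    "Rnd write  4KB QD=1  1 job": (4679, 3369),
    "Rnd write 128KB QD=32 1 job": (4964, 5703),
    "Seq write 128KB QD=32 1 job": (4948, 5697),
    "Seq write 512KB QD=32 1 job": (1289, 1462),
    "Seq write  1MB QD=32 1 job": (632, 720),
}


def pct_change(before, after):
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


for name, (before, after) in RESULTS.items():
    delta = pct_change(before, after)
    print(f"{name:<28} {before:>6} -> {after:>6}  {delta:+6.1f}%")
```

A positive percentage means the series is faster for that test, a negative
one means it is slower.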


Kind regards,
Niklas
