Message-ID: <20251027004331.562345-1-den@valinux.co.jp>
Date: Mon, 27 Oct 2025 09:43:27 +0900
From: Koichiro Den <den@...inux.co.jp>
To: ntb@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Cc: jdmason@...zu.us,
	dave.jiang@...el.com,
	allenbh@...il.com
Subject: [PATCH 0/4] NTB: ntb_transport: DMA fixes and scalability improvements

This series contains two DMA-related fixes (Patch #1-2) and two scalability
improvements (Patch #3-4) for ntb_transport. Behavior remains unchanged
unless new module parameters are explicitly set.

New module parameters
=====================

  - use_tx_dma : Enable TX DMA independently (default: 0)
  - use_rx_dma : Enable RX DMA independently (default: 0)
  - num_tx_dma_chan : # of TX DMA channels per queue (default: 1)
  - num_rx_dma_chan : # of RX DMA channels per queue (default: 1)

  Note: the legacy 'use_dma' switch is kept and takes precedence.
        Enabling it always implies use_tx_dma=1 and use_rx_dma=1,
        even if use_tx_dma=0 or use_rx_dma=0 is also specified.
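
The precedence rule in the note above can be modelled in a few lines. This
is only an illustrative Python sketch of the described behavior, not the
driver's code; the function name effective_dma() is hypothetical.

```python
def effective_dma(use_dma=False, use_tx_dma=False, use_rx_dma=False):
    """Return (tx_enabled, rx_enabled) following the rule in the note.

    Hypothetical model only: the legacy 'use_dma' switch wins, so setting
    it forces both directions on no matter what the per-direction
    switches say.
    """
    if use_dma:
        return (True, True)
    return (bool(use_tx_dma), bool(use_rx_dma))


if __name__ == "__main__":
    # use_dma=1 overrides an explicit use_rx_dma=0
    print(effective_dma(use_dma=True, use_rx_dma=False))
    # RX-only DMA, as in the measurements in this cover letter
    print(effective_dma(use_rx_dma=True))
    # defaults: DMA stays off in both directions
    print(effective_dma())
```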

Performance measurement
=======================

Tested on R-Car S4. With the following patchsets applied [1]:

  - [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
    (https://lore.kernel.org/all/20251023071916.901355-1-den@valinux.co.jp/)
  - [PATCH 0/2] Add 'tx_memcpy_offload' option to ntb_transport
    (https://lore.kernel.org/all/20251023072105.901707-1-den@valinux.co.jp/)

throughput became bound by the RX DMA service rate. Raising the number of
RX DMA channels from 1 to 2 improved throughput by about 65%
(538.630 -> 890.835 MBps):

  - use_rx_dma=1 num_rx_dma_chan=1
                 ^^^^^^^^^^^^^^^^^
    (full command: $ sudo modprobe ntb_transport tx_memcpy_offload=1 use_rx_dma=1 num_rx_dma_chan=1 use_intr=1)

    $ sudo sockperf tp -i $SERVER_IP -m 65400 -t 10 # RX DMA n_chan=1
    sockperf: == version #3.10-no.git == 
    [...]
    sockperf: Summary: Message Rate is 8636 [msg/sec], Packet Rate is about 388620 [pkt/sec] (45 ip frags / msg)
    sockperf: Summary: BandWidth is 538.630 MBps (4309.039 Mbps)
                                                  ^^^^^^^^^^^^^

  - use_rx_dma=1 num_rx_dma_chan=2
                 ^^^^^^^^^^^^^^^^^
    (full command: $ sudo modprobe ntb_transport tx_memcpy_offload=1 use_rx_dma=1 num_rx_dma_chan=2 use_intr=1)

    $ sudo sockperf tp -i $SERVER_IP -m 65400 -t 10 # RX DMA n_chan=2
    sockperf: == version #3.10-no.git == 
    [...]
    sockperf: Summary: Message Rate is 14283 [msg/sec], Packet Rate is about 642735 [pkt/sec] (45 ip frags / msg)
    sockperf: Summary: BandWidth is 890.835 MBps (7126.680 Mbps)
                                                  ^^^^^^^^^^^^^
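
As a sanity check, the two summaries can be recomputed from the message
rates and the 65400-byte message size. The Python sketch below assumes
that sockperf's "MBps" means 2^20 bytes per second; that is an inference
from the numbers lining up, not something stated in the output. The
helper name bandwidth_mibps() is hypothetical.

```python
MSG_SIZE = 65400   # bytes per message, from "-m 65400"
MIB = 1 << 20      # assumption: sockperf "MBps" counts 2^20 bytes

def bandwidth_mibps(msg_rate, msg_size=MSG_SIZE):
    """Payload bandwidth in MiB/s for a given message rate in msg/sec."""
    return msg_rate * msg_size / MIB


if __name__ == "__main__":
    one_chan = bandwidth_mibps(8636)    # num_rx_dma_chan=1 run
    two_chan = bandwidth_mibps(14283)   # num_rx_dma_chan=2 run
    print(f"1 channel : {one_chan:.3f} MBps ({one_chan * 8:.3f} Mbps)")
    print(f"2 channels: {two_chan:.3f} MBps ({two_chan * 8:.3f} Mbps)")
    print(f"speedup   : {two_chan / one_chan:.2f}x")
```

Under that assumption the recomputed values agree with the reported
538.630 and 890.835 MBps, and the ratio between the two runs is about
1.65.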

[1] Additional changes are required to use DMA on R-Car S4. Those will be
    posted separately.


Koichiro Den (4):
  NTB: ntb_transport: Handle remapped contiguous region in vmalloc space
  NTB: ntb_transport: Ack DMA memcpy descriptors to avoid wait-list
    growth
  NTB: ntb_transport: Add module parameters use_tx_dma/use_rx_dma
  NTB: ntb_transport: Support multi-channel DMA via module parameters

 drivers/ntb/ntb_transport.c | 386 +++++++++++++++++++++++++-----------
 1 file changed, 270 insertions(+), 116 deletions(-)

-- 
2.48.1

