Message-ID: <20251023071916.901355-1-den@valinux.co.jp>
Date: Thu, 23 Oct 2025 16:18:51 +0900
From: Koichiro Den <den@...inux.co.jp>
To: ntb@...ts.linux.dev,
	linux-pci@...r.kernel.org,
	dmaengine@...r.kernel.org,
	linux-kernel@...r.kernel.org
Cc: mani@...nel.org,
	kwilczynski@...nel.org,
	kishon@...nel.org,
	bhelgaas@...gle.com,
	corbet@....net,
	vkoul@...nel.org,
	jdmason@...zu.us,
	dave.jiang@...el.com,
	allenbh@...il.com,
	Basavaraj.Natikar@....com,
	Shyam-sundar.S-k@....com,
	kurt.schwemmer@...rosemi.com,
	logang@...tatee.com,
	jingoohan1@...il.com,
	lpieralisi@...nel.org,
	robh@...nel.org,
	jbrunet@...libre.com,
	Frank.Li@....com,
	fancer.lancer@...il.com,
	arnd@...db.de,
	pstanner@...hat.com,
	elfring@...rs.sourceforge.net
Subject: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets

Hi all,

Motivation
==========

On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
result, MSI writes cannot be forwarded from the Root Complex (RC) to the
Endpoint (EP), even if we added an MSI domain for the vNTB device so that
the existing drivers/ntb/msi.c could be used, and NTB traffic must fall
back to polling doorbells. In addition, BAR resources are scarce, which
makes it difficult to dedicate a BAR solely to an NTB/MSI window.

This RFC introduces a generic interrupt backend for NTB. The existing MSI
path is converted to a backend, and a new DW eDMA test-interrupt backend
provides an RC-to-EP interrupt fallback when MSI cannot be used. In
parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
The vNTB EPF and ntb_transport are taught about offsets.

Backend selection is automatic: if MSI is available we use the MSI backend.
Otherwise, if enabled, the DW eDMA backend is used. If neither is
available, we continue to use doorbells. Existing systems remain unaffected
unless use_intr=1 is set.
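
The selection order described above can be sketched in plain C. This is a
user-space illustration only: the enum, pick_backend() and its boolean
inputs are hypothetical names chosen for this example and are not the
identifiers used by the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; not taken from the patches. */
enum intr_backend {
	INTR_BACKEND_NONE,	/* keep polling doorbells */
	INTR_BACKEND_MSI,
	INTR_BACKEND_DW_EDMA,
};

/* A zero-initialised state should mean "no interrupt backend". */
static_assert(INTR_BACKEND_NONE == 0, "default must be doorbell polling");

/*
 * use_intr:      value of the module parameter (off by default)
 * msi_available: the MSI backend can be set up on this platform
 * edma_enabled:  the DW eDMA backend is built in and usable
 */
static enum intr_backend pick_backend(bool use_intr, bool msi_available,
				      bool edma_enabled)
{
	if (!use_intr)
		return INTR_BACKEND_NONE;	/* existing systems: unchanged */
	if (msi_available)
		return INTR_BACKEND_MSI;	/* preferred when it works */
	if (edma_enabled)
		return INTR_BACKEND_DW_EDMA;	/* MSI-less fallback */
	return INTR_BACKEND_NONE;		/* nothing usable: doorbells */
}
```

On R-Car S4, where MSI cannot be used, a call such as
pick_backend(true, false, true) would select the DW eDMA backend in this
sketch.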

Example layout (R-Car S4):

  BAR0: Config/Spad
  BAR2 [0x00000-0xF0000]: MW1 (data)
  BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
  BAR4: Doorbell

  # The corresponding configfs settings (see Patch #25):
  echo 0xF0000 > ./mw1
  echo 0x8000  > ./mw2
  echo 0xF0000 > ./mw2_offset
  echo 2       > ./mw1_bar
  echo 2       > ./mw2_bar
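
To make the constraints behind such a layout explicit, here is a minimal
user-space sketch that checks whether a set of memory windows fits into
their BARs without overlapping. The struct and function names are
hypothetical, an unset mwN_offset is assumed to mean 0, and the BAR size
used in the usage note is an assumption because the layout above does not
state it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BARS 6

/* One memory window as described by the configfs attributes. */
struct mw_cfg {
	int bar;		/* mwN_bar */
	uint64_t offset;	/* mwN_offset (assumed 0 when not set) */
	uint64_t size;		/* mwN */
};

/* True when both windows live in the same BAR and their ranges intersect. */
static bool mw_overlap(const struct mw_cfg *a, const struct mw_cfg *b)
{
	return a->bar == b->bar &&
	       a->offset < b->offset + b->size &&
	       b->offset < a->offset + a->size;
}

/*
 * Check that every window fits in its BAR and that windows sharing a BAR
 * do not overlap. bar_size[] is indexed by BAR number.
 */
static bool mw_layout_valid(const struct mw_cfg *mw, size_t n,
			    const uint64_t bar_size[NUM_BARS])
{
	/* Pass 1: each window must lie inside its BAR (overflow-safe). */
	for (size_t i = 0; i < n; i++) {
		assert(mw[i].bar >= 0 && mw[i].bar < NUM_BARS);
		uint64_t limit = bar_size[mw[i].bar];

		if (mw[i].size == 0 || mw[i].size > limit ||
		    mw[i].offset > limit - mw[i].size)
			return false;
	}
	/* Pass 2: no two windows may share bytes of the same BAR. */
	for (size_t i = 0; i < n; i++)
		for (size_t j = i + 1; j < n; j++)
			if (mw_overlap(&mw[i], &mw[j]))
				return false;
	return true;
}
```

With MW1 = {bar 2, offset 0x0, size 0xF0000} and MW2 = {bar 2, offset
0xF0000, size 0x8000}, the check passes for an assumed BAR2 size of
0x100000 and fails if MW2 is moved to an offset below 0xF0000.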

Summary of changes
==================

* NTB core/transport
  - Introduce struct ntb_intr_backend and convert MSI to the new backend.
  - Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
  - Rename module parameter to use_intr (keep use_msi as deprecated alias).
  - Support offsetted partial MWs in ntb_transport.
  - Hardening for peer-reported interrupt values and minor cleanups.

* PCI Endpoint core and DWC EP controller
  - Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
  - Implement inbound mapping for DesignWare EP (Address Match mode), with
    tracking of multiple inbound iATU entries per BAR and proper teardown.

* EPF vNTB
  - Add mwN_offset configfs attributes and propagate offsets to inbound maps.
  - Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
    set_bar().
  - Provide .get_pci_epc() so backends can locate the common eDMA instance.

* DW eDMA
  - Add self-interrupt registration and expose test-IRQ register offsets.
  - Provide dw_edma_find_by_child().

* Renesas R-Car
  - Place MW2 in BAR2 to host the interrupt window alongside the data MW.

* Documentation
  - Update pci-vntb-howto.rst and document mwN_offset usage.
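
As a rough illustration of what tracking multiple inbound entries per BAR
with proper teardown involves, the following user-space sketch keeps a
small table of BAR subranges. It is not the DWC implementation: the table
size, all names and the target addresses are assumptions made for this
example, and real iATU constraints such as alignment are not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_IB_ENTRIES 8	/* assumed number of inbound iATU regions */

struct ib_entry {
	bool used;
	int bar;
	uint64_t offset;	/* start of the subrange inside the BAR */
	uint64_t size;
	uint64_t target;	/* local address the subrange translates to */
};

struct ib_table {
	struct ib_entry e[MAX_IB_ENTRIES];
};

/* Claim a free entry for a BAR subrange; returns its index or -1. */
static int ib_map(struct ib_table *t, int bar, uint64_t offset,
		  uint64_t size, uint64_t target)
{
	int free_idx = -1;

	assert(size > 0);
	for (int i = 0; i < MAX_IB_ENTRIES; i++) {
		const struct ib_entry *e = &t->e[i];

		if (!e->used) {
			if (free_idx < 0)
				free_idx = i;
			continue;
		}
		/* Refuse a subrange that intersects one already mapped. */
		if (e->bar == bar && offset < e->offset + e->size &&
		    e->offset < offset + size)
			return -1;
	}
	if (free_idx >= 0)
		t->e[free_idx] = (struct ib_entry){
			.used = true, .bar = bar, .offset = offset,
			.size = size, .target = target,
		};
	return free_idx;
}

/* Release one subrange; returns false when it was never mapped. */
static bool ib_unmap(struct ib_table *t, int bar, uint64_t offset)
{
	for (int i = 0; i < MAX_IB_ENTRIES; i++) {
		struct ib_entry *e = &t->e[i];

		if (e->used && e->bar == bar && e->offset == offset) {
			e->used = false;
			return true;
		}
	}
	return false;
}

/* Teardown: drop every entry that belongs to a BAR; returns the count. */
static int ib_clear_bar(struct ib_table *t, int bar)
{
	int n = 0;

	for (int i = 0; i < MAX_IB_ENTRIES; i++) {
		if (t->e[i].used && t->e[i].bar == bar) {
			t->e[i].used = false;
			n++;
		}
	}
	return n;
}
```

Keeping the bookkeeping per subrange rather than per BAR is what lets two
memory windows share BAR2 while still being unmapped independently.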

Patch layout
============

* Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
* Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
* Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
* Patches 18-19 : NTB/EPF glue (.get_pci_epc())
* Patch 20      : Module param name change (use_msi->use_intr, alias preserved)
* Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
* Patch 24      : R-Car: add MW2 in BAR2 for interrupts
* Patch 25      : Documentation updates

Tested on
=========

* Renesas R-Car S4 Spider
* Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)

Performance measurement
=======================

Even without the DMA acceleration patches for R-Car S4 (which I keep
separate from this RFC patch series), enabling RC-to-EP interrupts
reduces the average sockperf ping-pong latency on R-Car S4 from about
6.0 ms to about 0.13 ms:

* Before this patch series (NB. use_msi doesn't work on R-Car S4)

  # Server: sockperf server -i 0.0.0.0
  # Client: sockperf ping-pong -i $SERVER_IP
  ========= Printing statistics for Server No: 0
  [Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
  ====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
        siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
  # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
  Summary: Latency is 5995.680 usec
  Total 45 observations; each percentile contains 0.45 observations
  ---> <MAX> observation = 6121.137
  ---> percentile 99.999 = 6121.137
  ---> percentile 99.990 = 6121.137
  ---> percentile 99.900 = 6121.137
  ---> percentile 99.000 = 6121.137
  ---> percentile 90.000 = 6099.178
  ---> percentile 75.000 = 6054.418
  ---> percentile 50.000 = 5993.040
  ---> percentile 25.000 = 5935.021
  ---> <MIN> observation = 5883.362

* With this series (use_intr=1)

  # Server: sockperf server -i 0.0.0.0
  # Client: sockperf ping-pong -i $SERVER_IP
  ========= Printing statistics for Server No: 0
  [Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
  ====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
        siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
  # dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
  Summary: Latency is 127.677 usec
  Total 2145 observations; each percentile contains 21.45 observations
  ---> <MAX> observation =  446.691
  ---> percentile 99.999 =  446.691
  ---> percentile 99.990 =  446.691
  ---> percentile 99.900 =  291.234
  ---> percentile 99.000 =  221.515
  ---> percentile 90.000 =  149.277
  ---> percentile 75.000 =  124.497
  ---> percentile 50.000 =  121.137
  ---> percentile 25.000 =  119.037
  ---> <MIN> observation =  113.637
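
For readers who want to relate the two runs, the small C sketch below
derives the latency ratio and the message rate from the sockperf figures
quoted above. The struct and function names are made up for this example;
only the numbers come from the output.

```c
#include <assert.h>

struct run {
	double avg_latency_usec;	/* "avg-latency" line */
	unsigned int messages;		/* SentMessages */
	double runtime_sec;		/* RunTime */
};

/* Figures copied from the two sockperf runs above. */
static const struct run run_before = { 5995.680, 45, 0.540 };
static const struct run run_after = { 127.677, 2145, 0.550 };

/* How many times lower the average latency is after the change. */
static double latency_ratio(const struct run *before, const struct run *after)
{
	assert(after->avg_latency_usec > 0.0);
	return before->avg_latency_usec / after->avg_latency_usec;
}

/* Ping-pong messages completed per second of valid run time. */
static double msg_rate(const struct run *r)
{
	assert(r->runtime_sec > 0.0);
	return r->messages / r->runtime_sec;
}
```

With these inputs the average latency drops by a factor of roughly 47
(5995.680 / 127.677), and the message rate rises from about 83 to 3900
messages per second.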

Feedback is welcome on both the approach and on how the series should be
split and routed.

(The series spans NTB, PCI EP/DWC and dmaengine/dw-edma. I'm happy to split
it later if preferred.)

Thanks for reviewing.


Koichiro Den (25):
  PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
    access
  PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
  NTB: epf: Handle mwN_offset for inbound MW regions
  PCI: endpoint: Add inbound mapping ops to EPC core
  PCI: dwc: ep: Implement EPC inbound mapping support
  PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
  NTB: Add offset parameter to MW translation APIs
  PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
    present
  NTB: ntb_transport: Support offsetted partial memory windows
  NTB/msi: Support offsetted partial memory window for MSI
  NTB/msi: Do not force MW to its maximum possible size
  NTB: ntb_transport: Stricter checks for peer-reported interrupt values
  NTB/msi: Skip mw_set_trans() if already configured
  NTB/msi: Add a inner loop for PCI-MSI cases
  dmaengine: dw-edma: Add self-interrupt registration API
  dmaengine: dw-edma: Expose self-IRQ register offsets
  dmaengine: dw-edma: Add dw_edma_find_by_child() helper
  NTB: core: Add .get_pci_epc() to ntb_dev_ops
  NTB: epf: vntb: Implement .get_pci_epc() callback
  NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
  NTB: Introduce generic interrupt backend abstraction and convert MSI
  NTB: ntb_transport: Rename MSI symbols to generic interrupt form
  NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
  NTB: epf: Add MW2 for interrupt use on Renesas R-Car
  Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
    usage

 Documentation/PCI/endpoint/pci-vntb-howto.rst |  16 +-
 drivers/dma/dw-edma/dw-edma-core.c            | 109 ++++++++
 drivers/dma/dw-edma/dw-edma-core.h            |  18 ++
 drivers/dma/dw-edma/dw-edma-v0-core.c         |  15 ++
 drivers/ntb/Kconfig                           |  15 ++
 drivers/ntb/Makefile                          |   6 +-
 drivers/ntb/hw/amd/ntb_hw_amd.c               |   6 +-
 drivers/ntb/hw/epf/ntb_hw_epf.c               |  46 ++--
 drivers/ntb/hw/idt/ntb_hw_idt.c               |   3 +-
 drivers/ntb/hw/intel/ntb_hw_gen1.c            |   6 +-
 drivers/ntb/hw/intel/ntb_hw_gen1.h            |   2 +-
 drivers/ntb/hw/intel/ntb_hw_gen3.c            |   3 +-
 drivers/ntb/hw/intel/ntb_hw_gen4.c            |   6 +-
 drivers/ntb/hw/mscc/ntb_hw_switchtec.c        |   6 +-
 drivers/ntb/intr_common.c                     |  61 +++++
 drivers/ntb/intr_dw_edma.c                    | 253 ++++++++++++++++++
 drivers/ntb/msi.c                             | 186 +++++++------
 drivers/ntb/ntb_transport.c                   | 155 ++++++-----
 drivers/ntb/test/ntb_msi_test.c               |  26 +-
 drivers/ntb/test/ntb_perf.c                   |   4 +-
 drivers/ntb/test/ntb_tool.c                   |   6 +-
 .../pci/controller/dwc/pcie-designware-ep.c   | 242 +++++++++++++++--
 drivers/pci/controller/dwc/pcie-designware.c  |   1 +
 drivers/pci/controller/dwc/pcie-designware.h  |   2 +
 drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
 drivers/pci/endpoint/pci-epc-core.c           |  44 +++
 include/linux/dma/edma.h                      |  31 +++
 include/linux/ntb.h                           | 134 +++++++---
 include/linux/pci-epc.h                       |  11 +
 29 files changed, 1310 insertions(+), 300 deletions(-)
 create mode 100644 drivers/ntb/intr_common.c
 create mode 100644 drivers/ntb/intr_dw_edma.c

-- 
2.48.1

