Message-ID: <20251023071916.901355-1-den@valinux.co.jp>
Date: Thu, 23 Oct 2025 16:18:51 +0900
From: Koichiro Den <den@...inux.co.jp>
To: ntb@...ts.linux.dev,
linux-pci@...r.kernel.org,
dmaengine@...r.kernel.org,
linux-kernel@...r.kernel.org
Cc: mani@...nel.org,
kwilczynski@...nel.org,
kishon@...nel.org,
bhelgaas@...gle.com,
corbet@....net,
vkoul@...nel.org,
jdmason@...zu.us,
dave.jiang@...el.com,
allenbh@...il.com,
Basavaraj.Natikar@....com,
Shyam-sundar.S-k@....com,
kurt.schwemmer@...rosemi.com,
logang@...tatee.com,
jingoohan1@...il.com,
lpieralisi@...nel.org,
robh@...nel.org,
jbrunet@...libre.com,
Frank.Li@....com,
fancer.lancer@...il.com,
arnd@...db.de,
pstanner@...hat.com,
elfring@...rs.sourceforge.net
Subject: [RFC PATCH 00/25] NTB/PCI: Add DW eDMA intr fallback and BAR MW offsets
Hi all,
Motivation
==========
On Renesas R-Car S4 the PCIe Endpoint is DesignWare-based and the platform
does not allow mapping GITS_TRANSLATER as an inbound iATU target. As a
result, MSI writes cannot be forwarded from the Root Complex (RC) to the
Endpoint (EP), even if an MSI domain were implemented for the vNTB device
so that the existing drivers/ntb/msi.c could be used, and NTB traffic must
fall back to doorbells (polling). In addition, BAR resources are scarce,
which makes it difficult to dedicate a BAR solely to an NTB/MSI window.
This RFC introduces a generic interrupt backend for NTB. The existing MSI
path is converted to a backend, and a new DW eDMA test-interrupt backend
provides an RC-to-EP interrupt fallback when MSI cannot be used. In
parallel, EPC/DWC gains inbound subrange mapping so multiple NTB memory
windows (MWs) can share a single BAR at arbitrary offsets (via mwN_offset).
The vNTB EPF and ntb_transport are taught about offsets.
Backend selection is automatic: if MSI is available we use the MSI backend.
Otherwise, if enabled, the DW eDMA backend is used. If neither is
available, we continue to use doorbells. Existing systems remain unaffected
unless use_intr=1 is set.
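The selection order described above can be modelled roughly as follows.
This is only an illustrative sketch, not code from the series: the names
pick_backend, msi_available and edma_enabled are hypothetical, and it
assumes that leaving use_intr unset keeps the existing doorbell path.

```python
from enum import Enum


class Backend(Enum):
    MSI = "msi"
    DW_EDMA = "dw-edma"
    DOORBELL = "doorbell"


def pick_backend(use_intr: bool, msi_available: bool, edma_enabled: bool) -> Backend:
    """Model the fallback order: MSI first, then DW eDMA, then doorbells.

    All parameter names are hypothetical and only mirror the prose above.
    """
    if not use_intr:
        # Assumption: without use_intr=1 nothing changes for existing systems.
        return Backend.DOORBELL
    if msi_available:
        return Backend.MSI
    if edma_enabled:
        return Backend.DW_EDMA
    # Neither interrupt backend is usable, so keep polling doorbells.
    return Backend.DOORBELL


if __name__ == "__main__":
    # R-Car S4 case from the motivation: MSI unusable, DW eDMA backend enabled.
    print(pick_backend(use_intr=True, msi_available=False, edma_enabled=True))
```

On a platform where MSI forwarding works, the same call with
msi_available=True would select the MSI backend regardless of the eDMA
setting.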
Example layout (R-Car S4):
  BAR0: Config/Spad
  BAR2 [0x00000-0xF0000]: MW1 (data)
  BAR2 [0xF0000-0xF8000]: MW2 (interrupts)
  BAR4: Doorbell
  # The corresponding configfs settings (see Patch #25):
  echo 0xF0000 > ./mw1
  echo 0x8000 > ./mw2
  echo 0xF0000 > ./mw2_offset
  echo 2 > ./mw1_bar
  echo 2 > ./mw2_bar
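As a sanity check of the layout above, the following sketch verifies that
memory windows sharing a BAR do not overlap. It is a standalone
illustration, not part of the series: MemWindow and find_overlaps are
hypothetical names, and MW1 is assumed to start at offset 0 because no
mw1_offset is written in the example.

```python
from itertools import combinations
from typing import List, NamedTuple, Tuple


class MemWindow(NamedTuple):
    name: str
    bar: int
    offset: int  # byte offset of the window inside its BAR
    size: int    # window size in bytes

    @property
    def end(self) -> int:
        # Exclusive end offset of the window inside the BAR.
        return self.offset + self.size


def find_overlaps(windows: List[MemWindow]) -> List[Tuple[str, str]]:
    """Return the name pairs of windows that overlap within the same BAR."""
    clashes = []
    for a, b in combinations(windows, 2):
        if a.bar == b.bar and a.offset < b.end and b.offset < a.end:
            clashes.append((a.name, b.name))
    return clashes


# Values taken from the configfs example; the MW1 offset of 0 is an assumption.
R_CAR_S4 = [
    MemWindow("mw1", bar=2, offset=0x00000, size=0xF0000),
    MemWindow("mw2", bar=2, offset=0xF0000, size=0x8000),
]

if __name__ == "__main__":
    for w in R_CAR_S4:
        print(f"{w.name}: BAR{w.bar} [{w.offset:#07x}-{w.end:#07x}]")
    print("overlaps:", find_overlaps(R_CAR_S4))
```

With the values above, MW2 begins exactly where MW1 ends, so the two
windows are adjacent within BAR2 and no overlap is reported.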
Summary of changes
==================
* NTB core/transport
- Introduce struct ntb_intr_backend and convert MSI to the new backend.
- Add DW eDMA interrupt backend (CONFIG_NTB_DW_EDMA) as MSI-less fallback.
- Rename module parameter to use_intr (keep use_msi as deprecated alias).
- Support offsetted partial MWs in ntb_transport.
- Hardening for peer-reported interrupt values and minor cleanups.
* PCI Endpoint core and DWC EP controller
- Add EPC ops map_inbound()/unmap_inbound() for BAR subrange mapping.
- Implement inbound mapping for DesignWare EP (Address Match mode), with
tracking of multiple inbound iATU entries per BAR and proper teardown.
* EPF vNTB
- Add mwN_offset configfs attributes and propagate offsets to inbound maps.
- Prefer pci_epc_map_inbound() when supported. Otherwise fall back to
set_bar().
- Provide .get_pci_epc() so backends can locate the common eDMA instance.
* DW eDMA
- Add self-interrupt registration and expose test-IRQ register offsets.
- Provide dw_edma_find_by_child().
* Renesas R-Car
- Place MW2 in BAR2 to host the interrupt window alongside the data MW.
* Documentation
Patch layout
============
* Patches 01-11 : BAR subrange and MW offsets (EPC/DWC EP, vNTB, core helpers)
* Patches 12-14 : Interrupt handling hardening in ntb_transport/MSI
* Patches 15-17 : DW eDMA: self-IRQ API, offsets, lookup helper
* Patches 18-19 : NTB/EPF glue (.get_pci_epc())
* Patch 20 : Module param name change (use_msi->use_intr, alias preserved)
* Patches 21-23 : Generic interrupt backend + MSI conversion + DW eDMA backend
* Patch 24 : R-Car: add MW2 in BAR2 for interrupts
* Patch 25 : Documentation updates
Tested on
=========
* Renesas R-Car S4 Spider
* Kernel base: commit 68113d260674 ("NTB/msi: Remove unused functions") (ntb-driver-core/ntb-next)
Performance measurement
=======================
Even without the DMA acceleration patches for R-Car S4 (which I keep
separate from this RFC patch series), enabling RC-to-EP interrupts
reduces the average sockperf ping-pong latency on R-Car S4 from about
6.0 ms to about 0.13 ms (roughly 47x):
* Before this patch series (N.B. use_msi does not work on R-Car S4)
# Server: sockperf server -i 0.0.0.0
# Client: sockperf ping-pong -i $SERVER_IP
========= Printing statistics for Server No: 0
[Valid Duration] RunTime=0.540 sec; SentMessages=45; ReceivedMessages=45
====> avg-latency=5995.680 (std-dev=70.258, mean-ad=57.478, median-ad=85.978,\
siqr=59.698, cv=0.012, std-error=10.473, 99.0% ci=[5968.702, 6022.658])
# dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
Summary: Latency is 5995.680 usec
Total 45 observations; each percentile contains 0.45 observations
---> <MAX> observation = 6121.137
---> percentile 99.999 = 6121.137
---> percentile 99.990 = 6121.137
---> percentile 99.900 = 6121.137
---> percentile 99.000 = 6121.137
---> percentile 90.000 = 6099.178
---> percentile 75.000 = 6054.418
---> percentile 50.000 = 5993.040
---> percentile 25.000 = 5935.021
---> <MIN> observation = 5883.362
* With this series (use_intr=1)
# Server: sockperf server -i 0.0.0.0
# Client: sockperf ping-pong -i $SERVER_IP
========= Printing statistics for Server No: 0
[Valid Duration] RunTime=0.550 sec; SentMessages=2145; ReceivedMessages=2145
====> avg-latency=127.677 (std-dev=21.719, mean-ad=11.759, median-ad=3.779,\
siqr=2.699, cv=0.170, std-error=0.469, 99.0% ci=[126.469, 128.885])
# dropped messages = 0; # duplicated messages = 0; # out-of-order messages = 0
Summary: Latency is 127.677 usec
Total 2145 observations; each percentile contains 21.45 observations
---> <MAX> observation = 446.691
---> percentile 99.999 = 446.691
---> percentile 99.990 = 446.691
---> percentile 99.900 = 291.234
---> percentile 99.000 = 221.515
---> percentile 90.000 = 149.277
---> percentile 75.000 = 124.497
---> percentile 50.000 = 121.137
---> percentile 25.000 = 119.037
---> <MIN> observation = 113.637
Feedback is welcome on both the approach and on how the series should be
split and routed. (The series spans NTB, PCI EP/DWC and dmaengine/dw-edma;
I'm happy to split it later if preferred.)
Thanks for reviewing.
Koichiro Den (25):
  PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
    access
  PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
  NTB: epf: Handle mwN_offset for inbound MW regions
  PCI: endpoint: Add inbound mapping ops to EPC core
  PCI: dwc: ep: Implement EPC inbound mapping support
  PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
  NTB: Add offset parameter to MW translation APIs
  PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
    present
  NTB: ntb_transport: Support offsetted partial memory windows
  NTB/msi: Support offsetted partial memory window for MSI
  NTB/msi: Do not force MW to its maximum possible size
  NTB: ntb_transport: Stricter checks for peer-reported interrupt values
  NTB/msi: Skip mw_set_trans() if already configured
  NTB/msi: Add a inner loop for PCI-MSI cases
  dmaengine: dw-edma: Add self-interrupt registration API
  dmaengine: dw-edma: Expose self-IRQ register offsets
  dmaengine: dw-edma: Add dw_edma_find_by_child() helper
  NTB: core: Add .get_pci_epc() to ntb_dev_ops
  NTB: epf: vntb: Implement .get_pci_epc() callback
  NTB: ntb_transport: Rename use_msi to use_intr (keep alias)
  NTB: Introduce generic interrupt backend abstraction and convert MSI
  NTB: ntb_transport: Rename MSI symbols to generic interrupt form
  NTB: intr_dw_edma: Add DW eDMA emulated interrupt backend
  NTB: epf: Add MW2 for interrupt use on Renesas R-Car
  Documentation: PCI: endpoint: pci-epf-vntb: Update and add mwN_offset
    usage
Documentation/PCI/endpoint/pci-vntb-howto.rst | 16 +-
drivers/dma/dw-edma/dw-edma-core.c | 109 ++++++++
drivers/dma/dw-edma/dw-edma-core.h | 18 ++
drivers/dma/dw-edma/dw-edma-v0-core.c | 15 ++
drivers/ntb/Kconfig | 15 ++
drivers/ntb/Makefile | 6 +-
drivers/ntb/hw/amd/ntb_hw_amd.c | 6 +-
drivers/ntb/hw/epf/ntb_hw_epf.c | 46 ++--
drivers/ntb/hw/idt/ntb_hw_idt.c | 3 +-
drivers/ntb/hw/intel/ntb_hw_gen1.c | 6 +-
drivers/ntb/hw/intel/ntb_hw_gen1.h | 2 +-
drivers/ntb/hw/intel/ntb_hw_gen3.c | 3 +-
drivers/ntb/hw/intel/ntb_hw_gen4.c | 6 +-
drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 6 +-
drivers/ntb/intr_common.c | 61 +++++
drivers/ntb/intr_dw_edma.c | 253 ++++++++++++++++++
drivers/ntb/msi.c | 186 +++++++------
drivers/ntb/ntb_transport.c | 155 ++++++-----
drivers/ntb/test/ntb_msi_test.c | 26 +-
drivers/ntb/test/ntb_perf.c | 4 +-
drivers/ntb/test/ntb_tool.c | 6 +-
.../pci/controller/dwc/pcie-designware-ep.c | 242 +++++++++++++++--
drivers/pci/controller/dwc/pcie-designware.c | 1 +
drivers/pci/controller/dwc/pcie-designware.h | 2 +
drivers/pci/endpoint/functions/pci-epf-vntb.c | 197 ++++++++++++--
drivers/pci/endpoint/pci-epc-core.c | 44 +++
include/linux/dma/edma.h | 31 +++
include/linux/ntb.h | 134 +++++++---
include/linux/pci-epc.h | 11 +
29 files changed, 1310 insertions(+), 300 deletions(-)
create mode 100644 drivers/ntb/intr_common.c
create mode 100644 drivers/ntb/intr_dw_edma.c
--
2.48.1