Message-ID: <aS4QkYn+aKphlRFm@lizhi-Precision-Tower-5810>
Date: Mon, 1 Dec 2025 17:02:57 -0500
From: Frank Li <Frank.li@....com>
To: Koichiro Den <den@...inux.co.jp>
Cc: ntb@...ts.linux.dev, linux-pci@...r.kernel.org,
dmaengine@...r.kernel.org, linux-kernel@...r.kernel.org,
mani@...nel.org, kwilczynski@...nel.org, kishon@...nel.org,
bhelgaas@...gle.com, corbet@....net, vkoul@...nel.org,
jdmason@...zu.us, dave.jiang@...el.com, allenbh@...il.com,
Basavaraj.Natikar@....com, Shyam-sundar.S-k@....com,
kurt.schwemmer@...rosemi.com, logang@...tatee.com,
jingoohan1@...il.com, lpieralisi@...nel.org, robh@...nel.org,
jbrunet@...libre.com, fancer.lancer@...il.com, arnd@...db.de,
pstanner@...hat.com, elfring@...rs.sourceforge.net
Subject: Re: [RFC PATCH v2 00/27] NTB transport backed by remote DW eDMA

On Sun, Nov 30, 2025 at 01:03:38AM +0900, Koichiro Den wrote:
> Hi,
>
> This is RFC v2 of the NTB/PCI series for Renesas R-Car S4. The ultimate
> goal is unchanged, i.e. to improve performance between RC and EP
> (with vNTB) over ntb_transport, but the approach has changed drastically.
> Based on the feedback from Frank Li in the v1 thread, in particular:
> https://lore.kernel.org/all/aQEsip3TsPn4LJY9@lizhi-Precision-Tower-5810/
> this RFC v2 instead builds an NTB transport backed by remote eDMA
> architecture and reshapes the series around it. The RC->EP interruption
> is now achieved using a dedicated eDMA read channel, so the somewhat
> "hack"-ish approach in RFC v1 is no longer needed.
>
> Compared to RFC v1, this v2 series enables NTB transport backed by
> remote DW eDMA, so the current ntb_transport handling of Memory Window
> is no longer needed, and direct DMA transfers between EP and RC are
> used.
>
> I realize this is quite a large series. Sorry for the volume, but for
> the RFC stage I believe presenting the full picture in a single set
> helps with reviewing the overall architecture. Once the direction is
> agreed, I will respin it split by subsystem and topic.
>
>
...
>
> - Before this change:
>
> * ping
> 64 bytes from 10.0.0.11: icmp_seq=1 ttl=64 time=12.3 ms
> 64 bytes from 10.0.0.11: icmp_seq=2 ttl=64 time=6.58 ms
> 64 bytes from 10.0.0.11: icmp_seq=3 ttl=64 time=1.26 ms
> 64 bytes from 10.0.0.11: icmp_seq=4 ttl=64 time=7.43 ms
> 64 bytes from 10.0.0.11: icmp_seq=5 ttl=64 time=1.39 ms
> 64 bytes from 10.0.0.11: icmp_seq=6 ttl=64 time=7.38 ms
> 64 bytes from 10.0.0.11: icmp_seq=7 ttl=64 time=1.42 ms
> 64 bytes from 10.0.0.11: icmp_seq=8 ttl=64 time=7.41 ms
>
> * RC->EP (`sudo iperf3 -ub0 -l 65480 -P 2`)
> [ ID] Interval Transfer Bitrate Jitter Lost/Total Datagrams
> [ 5] 0.00-10.01 sec 344 MBytes 288 Mbits/sec 3.483 ms 51/5555 (0.92%) receiver
> [ 6] 0.00-10.01 sec 342 MBytes 287 Mbits/sec 3.814 ms 38/5517 (0.69%) receiver
> [SUM] 0.00-10.01 sec 686 MBytes 575 Mbits/sec 3.648 ms 89/11072 (0.8%) receiver
>
> * EP->RC (`sudo iperf3 -ub0 -l 65480 -P 2`)
> [ 5] 0.00-10.03 sec 334 MBytes 279 Mbits/sec 3.164 ms 390/5731 (6.8%) receiver
> [ 6] 0.00-10.03 sec 334 MBytes 279 Mbits/sec 2.416 ms 396/5741 (6.9%) receiver
> [SUM] 0.00-10.03 sec 667 MBytes 558 Mbits/sec 2.790 ms 786/11472 (6.9%) receiver
>
> Note: with `-P 2`, the best total bitrate (receiver side) was achieved.
>
> - After this change (use_remote_edma=1) [1]:
>
> * ping
> 64 bytes from 10.0.0.11: icmp_seq=1 ttl=64 time=1.48 ms
> 64 bytes from 10.0.0.11: icmp_seq=2 ttl=64 time=1.03 ms
> 64 bytes from 10.0.0.11: icmp_seq=3 ttl=64 time=0.931 ms
> 64 bytes from 10.0.0.11: icmp_seq=4 ttl=64 time=0.910 ms
> 64 bytes from 10.0.0.11: icmp_seq=5 ttl=64 time=1.07 ms
> 64 bytes from 10.0.0.11: icmp_seq=6 ttl=64 time=0.986 ms
> 64 bytes from 10.0.0.11: icmp_seq=7 ttl=64 time=0.910 ms
> 64 bytes from 10.0.0.11: icmp_seq=8 ttl=64 time=0.883 ms
>
> * RC->EP (`sudo iperf3 -ub0 -l 65480 -P 4`)
> [ 5] 0.00-10.01 sec 3.54 GBytes 3.04 Gbits/sec 0.030 ms 0/58007 (0%) receiver
> [ 6] 0.00-10.01 sec 3.71 GBytes 3.19 Gbits/sec 0.453 ms 0/60909 (0%) receiver
> [ 9] 0.00-10.01 sec 3.85 GBytes 3.30 Gbits/sec 0.027 ms 0/63072 (0%) receiver
> [ 11] 0.00-10.01 sec 3.26 GBytes 2.80 Gbits/sec 0.070 ms 1/53512 (0.0019%) receiver
> [SUM] 0.00-10.01 sec 14.4 GBytes 12.3 Gbits/sec 0.145 ms 1/235500 (0.00042%) receiver
>
> * EP->RC (`sudo iperf3 -ub0 -l 65480 -P 4`)
> [ 5] 0.00-10.03 sec 3.40 GBytes 2.91 Gbits/sec 0.104 ms 15467/71208 (22%) receiver
> [ 6] 0.00-10.03 sec 3.08 GBytes 2.64 Gbits/sec 0.176 ms 12097/62609 (19%) receiver
> [ 9] 0.00-10.03 sec 3.38 GBytes 2.90 Gbits/sec 0.270 ms 17212/72710 (24%) receiver
> [ 11] 0.00-10.03 sec 2.56 GBytes 2.19 Gbits/sec 0.200 ms 11193/53090 (21%) receiver
Almost 10x faster, 2.9 Gbits/sec vs 279 Mbits/sec? Highlighting this will get
more people interested in this topic.
> [SUM] 0.00-10.03 sec 12.4 GBytes 10.6 Gbits/sec 0.188 ms 55969/259617 (22%) receiver
>
> [1] configfs settings:
> # modprobe pci_epf_vntb dyndbg=+pmf
> # cd /sys/kernel/config/pci_ep/
> # mkdir functions/pci_epf_vntb/func1
> # echo 0x1912 > functions/pci_epf_vntb/func1/vendorid
> # echo 0x0030 > functions/pci_epf_vntb/func1/deviceid
> # echo 32 > functions/pci_epf_vntb/func1/msi_interrupts
> # echo 16 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/db_count
> # echo 128 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/spad_count
> # echo 2 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/num_mws
> # echo 0xe0000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1
> # echo 0x20000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2
> # echo 0xe0000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_offset
It looks like you are trying to create smaller MW sub-windows.
Would this be cleaner?
echo 0xe0000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1.0
echo 0x20000 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1.1
so mw1.1 naturally continues from the previous one.
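
To illustrate the idea: if the sub-windows are listed in order, each offset
can be derived as the running sum of the sizes before it, instead of being
written explicitly. A minimal shell sketch, assuming the hypothetical
mw1.0/mw1.1 naming (these attributes do not exist in the current configfs
layout, and the helper name mw_layout is made up for illustration only):

```shell
#!/bin/sh
# Print the implied offset of each sub-window, given their sizes in order.
# "mw1.N" is a hypothetical naming scheme, not an existing configfs attribute.
mw_layout() {
    offset=0
    idx=0
    for size in "$@"; do
        # $(($size)) converts the hex string to a number for printf.
        printf 'mw1.%d size=0x%x offset=0x%x\n' "$idx" "$(($size))" "$offset"
        offset=$(($offset + $size))
        idx=$(($idx + 1))
    done
}

# Sizes taken from the configfs settings quoted above.
mw_layout 0xe0000 0x20000
```

With the sizes above, mw1.1 would start at offset 0xe0000, which matches the
mw2_offset value in the quoted settings.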
Frank
> # echo 0x1912 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_vid
> # echo 0x0030 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vntb_pid
> # echo 0x10 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/vbus_number
> # echo 0 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/ctrl_bar
> # echo 4 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/db_bar
> # echo 2 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw1_bar
> # echo 2 > functions/pci_epf_vntb/func1/pci_epf_vntb.0/mw2_bar
> # ln -s controllers/e65d0000.pcie-ep functions/pci_epf_vntb/func1/primary/
> # echo 1 > controllers/e65d0000.pcie-ep/start
>
>
> Thanks for taking a look.
>
>
> Koichiro Den (27):
> PCI: endpoint: pci-epf-vntb: Use array_index_nospec() on mws_size[]
> access
> PCI: endpoint: pci-epf-vntb: Add mwN_offset configfs attributes
> NTB: epf: Handle mwN_offset for inbound MW regions
> PCI: endpoint: Add inbound mapping ops to EPC core
> PCI: dwc: ep: Implement EPC inbound mapping support
> PCI: endpoint: pci-epf-vntb: Use pci_epc_map_inbound() for MW mapping
> NTB: Add offset parameter to MW translation APIs
> PCI: endpoint: pci-epf-vntb: Propagate MW offset from configfs when
> present
> NTB: ntb_transport: Support offsetted partial memory windows
> NTB: core: Add .get_pci_epc() to ntb_dev_ops
> NTB: epf: vntb: Implement .get_pci_epc() callback
> dmaengine: dw-edma: Fix MSI data values for multi-vector IMWr
> interrupts
> NTB: ntb_transport: Use seq_file for QP stats debugfs
> NTB: ntb_transport: Move TX memory window setup into setup_qp_mw()
> NTB: ntb_transport: Dynamically determine qp count
> NTB: ntb_transport: Introduce get_dma_dev() helper
> NTB: epf: Reserve a subset of MSI vectors for non-NTB users
> NTB: ntb_transport: Introduce ntb_transport_backend_ops
> PCI: dwc: ep: Cache MSI outbound iATU mapping
> NTB: ntb_transport: Introduce remote eDMA backed transport mode
> NTB: epf: Provide db_vector_count/db_vector_mask callbacks
> ntb_netdev: Multi-queue support
> NTB: epf: Add per-SoC quirk to cap MRRS for DWC eDMA (128B for R-Car)
> iommu: ipmmu-vmsa: Add PCIe ch0 to devices_allowlist
> iommu: ipmmu-vmsa: Add support for reserved regions
> arm64: dts: renesas: Add Spider RC/EP DTs for NTB with remote DW PCIe
> eDMA
> NTB: epf: Add an additional memory window (MW2) barno mapping on
> Renesas R-Car
>
> arch/arm64/boot/dts/renesas/Makefile | 2 +
> .../boot/dts/renesas/r8a779f0-spider-ep.dts | 46 +
> .../boot/dts/renesas/r8a779f0-spider-rc.dts | 52 +
> drivers/dma/dw-edma/dw-edma-core.c | 28 +-
> drivers/iommu/ipmmu-vmsa.c | 7 +-
> drivers/net/ntb_netdev.c | 341 ++-
> drivers/ntb/Kconfig | 11 +
> drivers/ntb/Makefile | 3 +
> drivers/ntb/hw/amd/ntb_hw_amd.c | 6 +-
> drivers/ntb/hw/epf/ntb_hw_epf.c | 177 +-
> drivers/ntb/hw/idt/ntb_hw_idt.c | 3 +-
> drivers/ntb/hw/intel/ntb_hw_gen1.c | 6 +-
> drivers/ntb/hw/intel/ntb_hw_gen1.h | 2 +-
> drivers/ntb/hw/intel/ntb_hw_gen3.c | 3 +-
> drivers/ntb/hw/intel/ntb_hw_gen4.c | 6 +-
> drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 6 +-
> drivers/ntb/msi.c | 6 +-
> drivers/ntb/ntb_edma.c | 628 ++++++
> drivers/ntb/ntb_edma.h | 128 ++
> .../{ntb_transport.c => ntb_transport_core.c} | 1829 ++++++++++++++---
> drivers/ntb/test/ntb_perf.c | 4 +-
> drivers/ntb/test/ntb_tool.c | 6 +-
> .../pci/controller/dwc/pcie-designware-ep.c | 287 ++-
> drivers/pci/controller/dwc/pcie-designware.h | 7 +
> drivers/pci/endpoint/functions/pci-epf-vntb.c | 229 ++-
> drivers/pci/endpoint/pci-epc-core.c | 44 +
> include/linux/ntb.h | 39 +-
> include/linux/ntb_transport.h | 21 +
> include/linux/pci-epc.h | 11 +
> 29 files changed, 3415 insertions(+), 523 deletions(-)
> create mode 100644 arch/arm64/boot/dts/renesas/r8a779f0-spider-ep.dts
> create mode 100644 arch/arm64/boot/dts/renesas/r8a779f0-spider-rc.dts
> create mode 100644 drivers/ntb/ntb_edma.c
> create mode 100644 drivers/ntb/ntb_edma.h
> rename drivers/ntb/{ntb_transport.c => ntb_transport_core.c} (59%)
>
> --
> 2.48.1
>