Message-ID: <20230215082343.GA6224@thinkpad>
Date: Wed, 15 Feb 2023 13:53:43 +0530
From: Manivannan Sadhasivam <mani@...nel.org>
To: Frank Li <Frank.Li@....com>
Cc: mie@...l.co.jp, imx@...ts.linux.dev, bhelgaas@...gle.com,
jasowang@...hat.com, jdmason@...zu.us, kishon@...nel.org,
kw@...ux.com, linux-kernel@...r.kernel.org,
linux-pci@...r.kernel.org, lpieralisi@...nel.org, mani@...nel.org,
mst@...hat.com, renzhijie2@...wei.com, taki@...l.co.jp,
virtualization@...ts.linux-foundation.org
Subject: Re: PCIe RC/EP virtio rdma solution discussion.
On Tue, Feb 07, 2023 at 02:45:27PM -0500, Frank Li wrote:
> From: Frank Li <Frank.li@....com>
>
> Recently more and more people are interested in PCIe RC and EP connection,
> especially for network use cases. I upstreamed a vntb solution last year,
> but the transfer speed is not good enough. I started a discussion at
> https://lore.kernel.org/imx/d098a631-9930-26d3-48f3-8f95386c8e50@ti.com/T/#t
>
> ┌─────────────────────────────────┐ ┌──────────────┐
> │ │ │ │
> │ │ │ │
> │ VirtQueue RX │ │ VirtQueue │
> │ TX ┌──┐ │ │ TX │
> │ ┌─────────┐ │ │ │ │ ┌─────────┐ │
> │ │ SRC LEN ├─────┐ ┌──┤ │◄────┼───┼─┤ SRC LEN │ │
> │ ├─────────┤ │ │ │ │ │ │ ├─────────┤ │
> │ │ │ │ │ │ │ │ │ │ │ │
> │ ├─────────┤ │ │ │ │ │ │ ├─────────┤ │
> │ │ │ │ │ │ │ │ │ │ │ │
> │ └─────────┘ │ │ └──┘ │ │ └─────────┘ │
> │ │ │ │ │ │
> │ RX ┌───┼──┘ TX │ │ RX │
> │ ┌─────────┐ │ │ ┌──┐ │ │ ┌─────────┐ │
> │ │ │◄┘ └────►│ ├─────┼───┼─┤ │ │
> │ ├─────────┤ │ │ │ │ ├─────────┤ │
> │ │ │ │ │ │ │ │ │ │
> │ ├─────────┤ │ │ │ │ ├─────────┤ │
> │ │ │ │ │ │ │ │ │ │
> │ └─────────┘ │ │ │ │ └─────────┘ │
> │ virtio_net └──┘ │ │ virtio_net │
> │ Virtual PCI BUS EDMA Queue │ │ │
> ├─────────────────────────────────┤ │ │
> │ PCI EP Controller with eDMA │ │ PCI Host │
> └─────────────────────────────────┘ └──────────────┘
>
> The basic idea is:
> 1. Both the EP and the host probe the virtio_net driver.
> 2. There are two queues: one on the EP side (EQ), the other on the Host side.
> 3. The EP-side epf driver maps the Host side's queue into the EP's address
>    space; call it HQ.
> 4. One worker thread does the following:
> 5. Pick one TX from EQ and one RX from HQ, combine them into an eDMA request,
>    and put it into the DMA TX queue.
> 6. Pick one RX from EQ and one TX from HQ, combine them into an eDMA request,
>    and put it into the DMA RX queue.
> 7. The eDMA done irq marks the related items in EQ and HQ as finished.
>
> The whole transfer is zero-copy and uses a DMA queue.
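>
> Roughly, the per-pair step of the worker thread could look like the
> sketch below. This assumes the eDMA channel is exposed through the
> kernel dmaengine API; the helper name queue_one_xfer() and its
> arguments are made up for illustration only.
>
>   #include <linux/dmaengine.h>
>
>   /* Pair one descriptor from EQ (src) with one from HQ (dst), hand the
>    * copy to the eDMA channel, and let its completion irq mark both
>    * queue items as finished (steps 5-7 above).
>    */
>   static int queue_one_xfer(struct dma_chan *chan, dma_addr_t src,
>                             dma_addr_t dst, size_t len,
>                             void (*done)(void *param), void *param)
>   {
>           struct dma_async_tx_descriptor *tx;
>           dma_cookie_t cookie;
>
>           tx = dmaengine_prep_dma_memcpy(chan, dst, src, len,
>                                          DMA_PREP_INTERRUPT);
>           if (!tx)
>                   return -ENOMEM;
>
>           /* Step 7: run from the eDMA done irq. */
>           tx->callback = done;
>           tx->callback_param = param;
>
>           cookie = dmaengine_submit(tx);
>           if (dma_submit_error(cookie))
>                   return -EIO;
>
>           dma_async_issue_pending(chan);
>           return 0;
>   }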
>
> Shunsuke Mie implemented the above idea:
> https://lore.kernel.org/linux-pci/CANXvt5q_qgLuAfF7dxxrqUirT_Ld4B=POCq8JcB9uPRvCGDiKg@mail.gmail.com/T/#t
>
>
> A similar solution was posted in 2019, except it used memcpy from/to the
> PCI EP mapped windows. Using DMA should be simpler because the eDMA can
> access the whole Host/EP side memory space.
> https://lore.kernel.org/linux-pci/9f8e596f-b601-7f97-a98a-111763f966d1@ti.com/T/
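>
> Mapping the Host side's queue into the EP's space (step 3 above, and
> what the 2019 series did memcpy through) goes through the EP
> controller's outbound window. A rough sketch with the generic
> pci_epc_* API follows; the map_host_queue() wrapper itself is made up
> for illustration.
>
>   #include <linux/pci-epc.h>
>   #include <linux/pci-epf.h>
>
>   /* Map the host-side queue (PCI address host_addr, size len) into the
>    * EP's outbound window so the epf driver can reach it.
>    */
>   static void __iomem *map_host_queue(struct pci_epf *epf, u64 host_addr,
>                                       size_t len, phys_addr_t *phys)
>   {
>           void __iomem *va;
>           int ret;
>
>           va = pci_epc_mem_alloc_addr(epf->epc, phys, len);
>           if (!va)
>                   return NULL;
>
>           ret = pci_epc_map_addr(epf->epc, epf->func_no, epf->vfunc_no,
>                                  *phys, host_addr, len);
>           if (ret) {
>                   pci_epc_mem_free_addr(epf->epc, *phys, va, len);
>                   return NULL;
>           }
>
>           return va;
>   }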
>
> Solution 1 (based on Shunsuke's series):
>
> Both the EP and Host sides use virtio.
> eDMA is used to simplify data transfer and improve transfer speed.
> RDMA is implemented based on RoCE:
> - proposal: https://lore.kernel.org/all/20220511095900.343-1-xieyongji@bytedance.com/T/
> - presentation at KVM Forum: https://youtu.be/Qrhv6hC_YK4
>
> Solution 2 (2020, Kishon):
>
> Previous series: https://lore.kernel.org/linux-pci/20200702082143.25259-1-kishon@ti.com/
> The EP side uses vhost, the RC side uses virtio.
> I don't think anyone is working on this thread now.
> If using eDMA, both sides need to have a transfer queue.
> I don't know how to easily implement that on the vhost side.
>
> Solution 3 (the one I am working on):
>
> Implement an InfiniBand RDMA driver on both the EP and RC sides.
> The EP side builds the eDMA hardware queue from the EP/RC sides' send and
> receive queues, and when the eDMA transfer finishes, writes the status to
> the completion queue on both the EP and RC sides. Use IPoIB to implement
> network transfer.
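>
> The completion step could look roughly like the sketch below: one CQE
> written to the EP's local completion queue and one to the RC's
> completion queue through the mapped window, then an MSI to the host.
> The rdma_cqe and xfer_ctx structures are purely illustrative.
>
>   #include <linux/io.h>
>   #include <linux/pci-epc.h>
>   #include <linux/pci-epf.h>
>
>   struct rdma_cqe {                       /* illustrative layout */
>           __le32 wr_id;
>           __le32 status;
>   };
>
>   struct xfer_ctx {                       /* illustrative */
>           u32 wr_id;
>           void *ep_cq_slot;               /* EP-local CQ entry   */
>           void __iomem *host_cq_slot;     /* RC CQ entry, mapped */
>           struct pci_epf *epf;
>   };
>
>   /* Called from the eDMA completion path. */
>   static void edma_done(void *param)
>   {
>           struct xfer_ctx *ctx = param;
>           struct rdma_cqe cqe = {
>                   .wr_id  = cpu_to_le32(ctx->wr_id),
>                   .status = 0,
>           };
>
>           memcpy(ctx->ep_cq_slot, &cqe, sizeof(cqe));
>           memcpy_toio(ctx->host_cq_slot, &cqe, sizeof(cqe));
>
>           /* Tell the RC side a completion is available. */
>           pci_epc_raise_irq(ctx->epf->epc, ctx->epf->func_no,
>                             ctx->epf->vfunc_no, PCI_EPC_IRQ_MSI, 1);
>   }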
>
>
> The whole upstream effort for any of these is quite large. I don't want to
> waste time and effort if the direction is wrong.
>
> I think Solution 1 is an easy path.
>
I didn't have time to look into Shunsuke's series, but from an initial look
at the proposed solutions, option 1 seems to be the best to me.
Thanks,
Mani
>
>
--
மணிவண்ணன் சதாசிவம்