lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 8 Jul 2020 19:22:05 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     Kishon Vijay Abraham I <kishon@...com>
Cc:     "Michael S. Tsirkin" <mst@...hat.com>,
        Ohad Ben-Cohen <ohad@...ery.com>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Jon Mason <jdmason@...zu.us>,
        Dave Jiang <dave.jiang@...el.com>,
        Allen Hubbe <allenbh@...il.com>,
        Lorenzo Pieralisi <lorenzo.pieralisi@....com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Stefan Hajnoczi <stefanha@...hat.com>,
        Stefano Garzarella <sgarzare@...hat.com>,
        linux-doc@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-remoteproc@...r.kernel.org, linux-ntb@...glegroups.com,
        linux-pci@...r.kernel.org, kvm@...r.kernel.org,
        virtualization@...ts.linux-foundation.org, netdev@...r.kernel.org
Subject: Re: [RFC PATCH 00/22] Enhance VHOST to enable SoC-to-SoC
 communication


On 2020/7/7 下午10:45, Kishon Vijay Abraham I wrote:
> Hi Jason,
>
> On 7/7/2020 3:17 PM, Jason Wang wrote:
>> On 2020/7/6 下午5:32, Kishon Vijay Abraham I wrote:
>>> Hi Jason,
>>>
>>> On 7/3/2020 12:46 PM, Jason Wang wrote:
>>>> On 2020/7/2 下午9:35, Kishon Vijay Abraham I wrote:
>>>>> Hi Jason,
>>>>>
>>>>> On 7/2/2020 3:40 PM, Jason Wang wrote:
>>>>>> On 2020/7/2 下午5:51, Michael S. Tsirkin wrote:
>>>>>>> On Thu, Jul 02, 2020 at 01:51:21PM +0530, Kishon Vijay Abraham I wrote:
>>>>>>>> This series enhances Linux Vhost support to enable SoC-to-SoC
>>>>>>>> communication over MMIO. This series enables rpmsg communication between
>>>>>>>> two SoCs using both PCIe RC<->EP and HOST1-NTB-HOST2
>>>>>>>>
>>>>>>>> 1) Modify vhost to use standard Linux driver model
>>>>>>>> 2) Add support in vring to access virtqueue over MMIO
>>>>>>>> 3) Add vhost client driver for rpmsg
>>>>>>>> 4) Add PCIe RC driver (uses virtio) and PCIe EP driver (uses vhost) for
>>>>>>>>        rpmsg communication between two SoCs connected to each other
>>>>>>>> 5) Add NTB Virtio driver and NTB Vhost driver for rpmsg communication
>>>>>>>>        between two SoCs connected via NTB
>>>>>>>> 6) Add configfs to configure the components
>>>>>>>>
>>>>>>>> UseCase1 :
>>>>>>>>
>>>>>>>>      VHOST RPMSG                     VIRTIO RPMSG
>>>>>>>>           +                               +
>>>>>>>>           |                               |
>>>>>>>>           |                               |
>>>>>>>>           |                               |
>>>>>>>>           |                               |
>>>>>>>> +-----v------+                 +------v-------+
>>>>>>>> |   Linux    |                 |     Linux    |
>>>>>>>> |  Endpoint  |                 | Root Complex |
>>>>>>>> |            <----------------->              |
>>>>>>>> |            |                 |              |
>>>>>>>> |    SOC1    |                 |     SOC2     |
>>>>>>>> +------------+                 +--------------+
>>>>>>>>
>>>>>>>> UseCase 2:
>>>>>>>>
>>>>>>>>          VHOST RPMSG                                      VIRTIO RPMSG
>>>>>>>>               +                                                 +
>>>>>>>>               |                                                 |
>>>>>>>>               |                                                 |
>>>>>>>>               |                                                 |
>>>>>>>>               |                                                 |
>>>>>>>>        +------v------+                                   +------v------+
>>>>>>>>        |             |                                   |             |
>>>>>>>>        |    HOST1    |                                   |    HOST2    |
>>>>>>>>        |             |                                   |             |
>>>>>>>>        +------^------+                                   +------^------+
>>>>>>>>               |                                                 |
>>>>>>>>               |                                                 |
>>>>>>>> +---------------------------------------------------------------------+
>>>>>>>> |  +------v------+                                   +------v------+  |
>>>>>>>> |  |             |                                   |             |  |
>>>>>>>> |  |     EP      |                                   |     EP      |  |
>>>>>>>> |  | CONTROLLER1 |                                   | CONTROLLER2 |  |
>>>>>>>> |  |             <----------------------------------->             |  |
>>>>>>>> |  |             |                                   |             |  |
>>>>>>>> |  |             |                                   |             |  |
>>>>>>>> |  |             |  SoC With Multiple EP Instances   |             |  |
>>>>>>>> |  |             |  (Configured using NTB Function)  |             |  |
>>>>>>>> |  +-------------+                                   +-------------+  |
>>>>>>>> +---------------------------------------------------------------------+
>>>>>>>>
>>>>>>>> Software Layering:
>>>>>>>>
>>>>>>>> The high-level SW layering should look something like below. This series
>>>>>>>> adds support only for RPMSG VHOST, however something similar should be
>>>>>>>> done for net and scsi. With that any vhost device (PCI, NTB, Platform
>>>>>>>> device, user) can use any of the vhost client driver.
>>>>>>>>
>>>>>>>>
>>>>>>>>         +----------------+  +-----------+  +------------+  +----------+
>>>>>>>>         |  RPMSG VHOST   |  | NET VHOST |  | SCSI VHOST |  |    X     |
>>>>>>>>         +-------^--------+  +-----^-----+  +-----^------+  +----^-----+
>>>>>>>>                 |                 |              |              |
>>>>>>>>                 |                 |              |              |
>>>>>>>>                 |                 |              |              |
>>>>>>>> +-----------v-----------------v--------------v--------------v----------+
>>>>>>>> |                            VHOST CORE                                |
>>>>>>>> +--------^---------------^--------------------^------------------^-----+
>>>>>>>>              |               |                    |                  |
>>>>>>>>              |               |                    |                  |
>>>>>>>>              |               |                    |                  |
>>>>>>>> +--------v-------+  +----v------+  +----------v----------+  +----v-----+
>>>>>>>> |  PCI EPF VHOST |  | NTB VHOST |  |PLATFORM DEVICE VHOST|  |    X     |
>>>>>>>> +----------------+  +-----------+  +---------------------+  +----------+
>>>>>>>>
>>>>>>>> This was initially proposed here [1]
>>>>>>>>
>>>>>>>> [1] ->
>>>>>>>> https://lore.kernel.org/r/2cf00ec4-1ed6-f66e-6897-006d1a5b6390@ti.com
>>>>>>> I find this very interesting. A huge patchset so will take a bit
>>>>>>> to review, but I certainly plan to do that. Thanks!
>>>>>> Yes, it would be better if there's a git branch for us to have a look.
>>>>> I've pushed the branch
>>>>> https://github.com/kishon/linux-wip.git vhost_rpmsg_pci_ntb_rfc
>>>> Thanks
>>>>
>>>>
>>>>>> Btw, I'm not sure I get the big picture, but I vaguely feel some of the
>>>>>> work is
>>>>>> duplicated with vDPA (e.g the epf transport or vhost bus).
>>>>> This is about connecting two different HW systems both running Linux and
>>>>> doesn't necessarily involve virtualization.
>>>> Right, this is something similar to VOP
>>>> (Documentation/misc-devices/mic/mic_overview.rst). The different is the
>>>> hardware I guess and VOP use userspace application to implement the device.
>>> I'd also like to point out, this series tries to have communication between two
>>> SoCs in vendor agnostic way. Since this series solves for 2 usecases (PCIe
>>> RC<->EP and NTB), for the NTB case it directly plugs into NTB framework and any
>>> of the HW in NTB below should be able to use a virtio-vhost communication
>>>
>>> #ls drivers/ntb/hw/
>>> amd  epf  idt  intel  mscc
>>>
>>> And similarly for the PCIe RC<->EP communication, this adds a generic endpoint
>>> function driver and hence any SoC that supports configurable PCIe endpoint can
>>> use virtio-vhost communication
>>>
>>> # ls drivers/pci/controller/dwc/*ep*
>>> drivers/pci/controller/dwc/pcie-designware-ep.c
>>> drivers/pci/controller/dwc/pcie-uniphier-ep.c
>>> drivers/pci/controller/dwc/pci-layerscape-ep.c
>>
>> Thanks for those backgrounds.
>>
>>
>>>>>     So there is no guest or host as in
>>>>> virtualization but two entirely different systems connected via PCIe cable,
>>>>> one
>>>>> acting as guest and one as host. So one system will provide virtio
>>>>> functionality reserving memory for virtqueues and the other provides vhost
>>>>> functionality providing a way to access the virtqueues in virtio memory.
>>>>> One is
>>>>> source and the other is sink and there is no intermediate entity. (vhost was
>>>>> probably intermediate entity in virtualization?)
>>>> (Not a native English speaker) but "vhost" could introduce some confusion for
>>>> me since it was use for implementing virtio backend for userspace drivers. I
>>>> guess "vringh" could be better.
>>> Initially I had named this vringh but later decided to choose vhost instead of
>>> vringh. vhost is still a virtio backend (not necessarily userspace) though it
>>> now resides in an entirely different system. Whatever virtio is for a frontend
>>> system, vhost can be that for a backend system. vring can be for accessing
>>> virtqueue and can be used either in frontend or backend.
>>
>> Ok.
>>
>>
>>>>>> Have you considered to implement these through vDPA?
>>>>> IIUC vDPA only provides an interface to userspace and an in-kernel rpmsg
>>>>> driver
>>>>> or vhost net driver is not provided.
>>>>>
>>>>> The HW connection looks something like https://pasteboard.co/JfMVVHC.jpg
>>>>> (usecase2 above),
>>>> I see.
>>>>
>>>>
>>>>>     all the boards run Linux. The middle board provides NTB
>>>>> functionality and board on either side provides virtio/vhost functionality and
>>>>> transfer data using rpmsg.
>>>> So I wonder whether it's worthwhile for a new bus. Can we use the existed
>>>> virtio-bus/drivers? It might work as, except for the epf transport, we can
>>>> introduce a epf "vhost" transport driver.
>>> IMHO we'll need two buses one for frontend and other for backend because the
>>> two components can then co-operate/interact with each other to provide a
>>> functionality. Though both will seemingly provide similar callbacks, they are
>>> both provide symmetrical or complimentary funcitonality and need not be same or
>>> identical.
>>>
>>> Having the same bus can also create sequencing issues.
>>>
>>> If you look at virtio_dev_probe() of virtio_bus
>>>
>>> device_features = dev->config->get_features(dev);
>>>
>>> Now if we use same bus for both front-end and back-end, both will try to
>>> get_features when there has been no set_features. Ideally vhost device should
>>> be initialized first with the set of features it supports. Vhost and virtio
>>> should use "status" and "features" complimentarily and not identically.
>>
>> Yes, but there's no need for doing status/features passthrough in epf vhost
>> drivers.b
>>
>>
>>> virtio device (or frontend) cannot be initialized before vhost device (or
>>> backend) gets initialized with data such as features. Similarly vhost (backend)
>>> cannot access virqueues or buffers before virtio (frontend) sets
>>> VIRTIO_CONFIG_S_DRIVER_OK whereas that requirement is not there for virtio as
>>> the physical memory for virtqueues are created by virtio (frontend).
>>
>> epf vhost drivers need to implement two devices: vhost(vringh) device and
>> virtio device (which is a mediated device). The vhost(vringh) device is doing
>> feature negotiation with the virtio device via RC/EP or NTB. The virtio device
>> is doing feature negotiation with local virtio drivers. If there're feature
>> mismatch, epf vhost drivers and do mediation between them.
> Here epf vhost should be initialized with a set of features for it to negotiate
> either as vhost device or virtio device no? Where should the initial feature
> set for epf vhost come from?


I think it can work as:

1) Having an initial features (hard coded in the code) set X in epf vhost
2) Using this X for both virtio device and vhost(vringh) device
3) local virtio driver will negotiate with virtio device with feature set Y
4) remote virtio driver will negotiate with vringh device with feature set Z
5) mediate between feature Y and feature Z since both Y and Z are a 
subset of X


>>
>>>> It will have virtqueues but only used for the communication between itself and
>>>> uppter virtio driver. And it will have vringh queues which will be probe by
>>>> virtio epf transport drivers. And it needs to do datacopy between virtqueue and
>>>> vringh queues.
>>>>
>>>> It works like:
>>>>
>>>> virtio drivers <- virtqueue/virtio-bus -> epf vhost drivers <- vringh
>>>> queue/epf>
>>>>
>>>> The advantages is that there's no need for writing new buses and drivers.
>>> I think this will work however there is an addtional copy between vringh queue
>>> and virtqueue,
>>
>> I think not? E.g in use case 1), if we stick to virtio bus, we will have:
>>
>> virtio-rpmsg (EP) <- virtio ring(1) -> epf vhost driver (EP) <- virtio ring(2)
>> -> virtio pci (RC) <-> virtio rpmsg (RC)
> IIUC epf vhost driver (EP) will access virtio ring(2) using vringh?


Yes.


> And virtio
> ring(2) is created by virtio pci (RC).


Yes.


>> What epf vhost driver did is to read from virtio ring(1) about the buffer len
>> and addr and them DMA to Linux(RC)?
> okay, I made some optimization here where vhost-rpmsg using a helper writes a
> buffer from rpmsg's upper layer directly to remote Linux (RC) as against here
> were it has to be first written to virtio ring (1).
>
> Thinking how this would look for NTB
> virtio-rpmsg (HOST1) <- virtio ring(1) -> NTB(HOST1) <-> NTB(HOST2)  <- virtio
> ring(2) -> virtio-rpmsg (HOST2)
>
> Here the NTB(HOST1) will access the virtio ring(2) using vringh?


Yes, I think so it needs to use vring to access virtio ring (1) as well.


>
> Do you also think this will work seamlessly with virtio_net.c, virtio_blk.c?


Yes.


>
> I'd like to get clarity on two things in the approach you suggested, one is
> features (since epf vhost should ideally be transparent to any virtio driver)


We can have have an array of pre-defined features indexed by virtio 
device id in the code.


> and the other is how certain inputs to virtio device such as number of buffers
> be determined.


We can start from hard coded the value like 256, or introduce some API 
for user to change the value.


>
> Thanks again for your suggestions!


You're welcome.

Note that I just want to check whether or not we can reuse the virtio 
bus/driver. It's something similar to what you proposed in Software 
Layering but we just replace "vhost core" with "virtio bus" and move the 
vhost core below epf/ntb/platform transport.

Thanks


>
> Regards
> Kishon
>
>>
>>> in some cases adds latency because of forwarding interrupts
>>> between vhost and virtio driver, vhost drivers providing features (which means
>>> it has to be aware of which virtio driver will be connected).
>>> virtio drivers (front end) generally access the buffers from it's local memory
>>> but when in backend it can access over MMIO (like PCI EPF or NTB) or userspace.
>>>> Does this make sense?
>>> Two copies in my opinion is an issue but lets get others opinions as well.
>>
>> Sure.
>>
>>
>>> Thanks for your suggestions!
>>
>> You're welcome.
>>
>> Thanks
>>
>>
>>> Regards
>>> Kishon
>>>
>>>> Thanks
>>>>
>>>>
>>>>> Thanks
>>>>> Kishon
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>>
>>>>>>>> Kishon Vijay Abraham I (22):
>>>>>>>>       vhost: Make _feature_ bits a property of vhost device
>>>>>>>>       vhost: Introduce standard Linux driver model in VHOST
>>>>>>>>       vhost: Add ops for the VHOST driver to configure VHOST device
>>>>>>>>       vringh: Add helpers to access vring in MMIO
>>>>>>>>       vhost: Add MMIO helpers for operations on vhost virtqueue
>>>>>>>>       vhost: Introduce configfs entry for configuring VHOST
>>>>>>>>       virtio_pci: Use request_threaded_irq() instead of request_irq()
>>>>>>>>       rpmsg: virtio_rpmsg_bus: Disable receive virtqueue callback when
>>>>>>>>         reading messages
>>>>>>>>       rpmsg: Introduce configfs entry for configuring rpmsg
>>>>>>>>       rpmsg: virtio_rpmsg_bus: Add Address Service Notification support
>>>>>>>>       rpmsg: virtio_rpmsg_bus: Move generic rpmsg structure to
>>>>>>>>         rpmsg_internal.h
>>>>>>>>       virtio: Add ops to allocate and free buffer
>>>>>>>>       rpmsg: virtio_rpmsg_bus: Use virtio_alloc_buffer() and
>>>>>>>>         virtio_free_buffer()
>>>>>>>>       rpmsg: Add VHOST based remote processor messaging bus
>>>>>>>>       samples/rpmsg: Setup delayed work to send message
>>>>>>>>       samples/rpmsg: Wait for address to be bound to rpdev for sending
>>>>>>>>         message
>>>>>>>>       rpmsg.txt: Add Documentation to configure rpmsg using configfs
>>>>>>>>       virtio_pci: Add VIRTIO driver for VHOST on Configurable PCIe Endpoint
>>>>>>>>         device
>>>>>>>>       PCI: endpoint: Add EP function driver to provide VHOST interface
>>>>>>>>       NTB: Add a new NTB client driver to implement VIRTIO functionality
>>>>>>>>       NTB: Add a new NTB client driver to implement VHOST functionality
>>>>>>>>       NTB: Describe the ntb_virtio and ntb_vhost client in the documentation
>>>>>>>>
>>>>>>>>      Documentation/driver-api/ntb.rst              |   11 +
>>>>>>>>      Documentation/rpmsg.txt                       |   56 +
>>>>>>>>      drivers/ntb/Kconfig                           |   18 +
>>>>>>>>      drivers/ntb/Makefile                          |    2 +
>>>>>>>>      drivers/ntb/ntb_vhost.c                       |  776 +++++++++++
>>>>>>>>      drivers/ntb/ntb_virtio.c                      |  853 ++++++++++++
>>>>>>>>      drivers/ntb/ntb_virtio.h                      |   56 +
>>>>>>>>      drivers/pci/endpoint/functions/Kconfig        |   11 +
>>>>>>>>      drivers/pci/endpoint/functions/Makefile       |    1 +
>>>>>>>>      .../pci/endpoint/functions/pci-epf-vhost.c    | 1144 ++++++++++++++++
>>>>>>>>      drivers/rpmsg/Kconfig                         |   10 +
>>>>>>>>      drivers/rpmsg/Makefile                        |    3 +-
>>>>>>>>      drivers/rpmsg/rpmsg_cfs.c                     |  394 ++++++
>>>>>>>>      drivers/rpmsg/rpmsg_core.c                    |    7 +
>>>>>>>>      drivers/rpmsg/rpmsg_internal.h                |  136 ++
>>>>>>>>      drivers/rpmsg/vhost_rpmsg_bus.c               | 1151 +++++++++++++++++
>>>>>>>>      drivers/rpmsg/virtio_rpmsg_bus.c              |  184 ++-
>>>>>>>>      drivers/vhost/Kconfig                         |    1 +
>>>>>>>>      drivers/vhost/Makefile                        |    2 +-
>>>>>>>>      drivers/vhost/net.c                           |   10 +-
>>>>>>>>      drivers/vhost/scsi.c                          |   24 +-
>>>>>>>>      drivers/vhost/test.c                          |   17 +-
>>>>>>>>      drivers/vhost/vdpa.c                          |    2 +-
>>>>>>>>      drivers/vhost/vhost.c                         |  730 ++++++++++-
>>>>>>>>      drivers/vhost/vhost_cfs.c                     |  341 +++++
>>>>>>>>      drivers/vhost/vringh.c                        |  332 +++++
>>>>>>>>      drivers/vhost/vsock.c                         |   20 +-
>>>>>>>>      drivers/virtio/Kconfig                        |    9 +
>>>>>>>>      drivers/virtio/Makefile                       |    1 +
>>>>>>>>      drivers/virtio/virtio_pci_common.c            |   25 +-
>>>>>>>>      drivers/virtio/virtio_pci_epf.c               |  670 ++++++++++
>>>>>>>>      include/linux/mod_devicetable.h               |    6 +
>>>>>>>>      include/linux/rpmsg.h                         |    6 +
>>>>>>>>      {drivers/vhost => include/linux}/vhost.h      |  132 +-
>>>>>>>>      include/linux/virtio.h                        |    3 +
>>>>>>>>      include/linux/virtio_config.h                 |   42 +
>>>>>>>>      include/linux/vringh.h                        |   46 +
>>>>>>>>      samples/rpmsg/rpmsg_client_sample.c           |   32 +-
>>>>>>>>      tools/virtio/virtio_test.c                    |    2 +-
>>>>>>>>      39 files changed, 7083 insertions(+), 183 deletions(-)
>>>>>>>>      create mode 100644 drivers/ntb/ntb_vhost.c
>>>>>>>>      create mode 100644 drivers/ntb/ntb_virtio.c
>>>>>>>>      create mode 100644 drivers/ntb/ntb_virtio.h
>>>>>>>>      create mode 100644 drivers/pci/endpoint/functions/pci-epf-vhost.c
>>>>>>>>      create mode 100644 drivers/rpmsg/rpmsg_cfs.c
>>>>>>>>      create mode 100644 drivers/rpmsg/vhost_rpmsg_bus.c
>>>>>>>>      create mode 100644 drivers/vhost/vhost_cfs.c
>>>>>>>>      create mode 100644 drivers/virtio/virtio_pci_epf.c
>>>>>>>>      rename {drivers/vhost => include/linux}/vhost.h (66%)
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> 2.17.1
>>>>>>>>

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ