Date:   Thu, 2 Jun 2022 11:53:39 +0800
From:   Jason Wang <jasowang@...hat.com>
To:     Parav Pandit <parav@...dia.com>
Cc:     "Michael S. Tsirkin" <mst@...hat.com>,
        Eugenio Pérez <eperezma@...hat.com>,
        "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
        "virtualization@...ts.linux-foundation.org" 
        <virtualization@...ts.linux-foundation.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "martinh@...inx.com" <martinh@...inx.com>,
        Stefano Garzarella <sgarzare@...hat.com>,
        "martinpo@...inx.com" <martinpo@...inx.com>,
        "lvivier@...hat.com" <lvivier@...hat.com>,
        "pabloc@...inx.com" <pabloc@...inx.com>,
        Eli Cohen <elic@...dia.com>,
        Dan Carpenter <dan.carpenter@...cle.com>,
        Xie Yongji <xieyongji@...edance.com>,
        Christophe JAILLET <christophe.jaillet@...adoo.fr>,
        Zhang Min <zhang.min9@....com.cn>,
        Wu Zongyong <wuzongyong@...ux.alibaba.com>,
        "lulu@...hat.com" <lulu@...hat.com>,
        Zhu Lingshan <lingshan.zhu@...el.com>,
        "Piotr.Uminski@...el.com" <Piotr.Uminski@...el.com>,
        Si-Wei Liu <si-wei.liu@...cle.com>,
        "ecree.xilinx@...il.com" <ecree.xilinx@...il.com>,
        "gautam.dawar@....com" <gautam.dawar@....com>,
        "habetsm.xilinx@...il.com" <habetsm.xilinx@...il.com>,
        "tanuj.kamde@....com" <tanuj.kamde@....com>,
        "hanand@...inx.com" <hanand@...inx.com>,
        "dinang@...inx.com" <dinang@...inx.com>,
        Longpeng <longpeng2@...wei.com>
Subject: Re: [PATCH v4 0/4] Implement vdpasim stop operation

On Thu, Jun 2, 2022 at 10:59 AM Parav Pandit <parav@...dia.com> wrote:
>
>
> > From: Jason Wang <jasowang@...hat.com>
> > Sent: Wednesday, June 1, 2022 10:00 PM
> >
> > On Thu, Jun 2, 2022 at 2:58 AM Parav Pandit <parav@...dia.com> wrote:
> > >
> > >
> > > > From: Jason Wang <jasowang@...hat.com>
> > > > Sent: Tuesday, May 31, 2022 10:42 PM
> > > >
> > > > Well, the ability to query the virtqueue state was proposed as
> > > > another feature (Eugenio, please correct me). This should be
> > > > sufficient to make virtio-net live-migratable.
> > > >
> > > The device is stopped; it won't answer the special vq config done here.
> >
> > This depends on the definition of the stop. Any query to the device state
> > should be allowed otherwise it's meaningless for us.
> >
> > > Programming all of these using cfg registers doesn't scale for on-chip
> > > memory and for the speed.
> >
> > Well, they are orthogonal and what I want to say is, we should first define
> > the semantics of stop and state of the virtqueue.
> >
> > Such a facility could be accessed by either transport specific method or admin
> > virtqueue, it totally depends on the hardware architecture of the vendor.
> >
> I find it hard to believe that a vendor can implement a CVQ but not an AQ, and would choose to expose tens or hundreds of registers.
> But maybe it fits some specific hw.

You can have a look at the ifcvf dpdk driver as an example.

But another thing that is unrelated to hardware architecture is the
nesting support. Having an admin virtqueue in a nested environment looks
like overkill. Presenting a register in L1 and mapping it to L0's admin
virtqueue should be good enough.
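
To make the nesting idea concrete, here is a toy model in Python, not
kernel code. All names, register offsets and opcodes below are
hypothetical and are not taken from the virtio spec or any driver. It
only sketches the shape of the mapping: L1 touches plain registers, and
L0 traps each access and turns it into a command on its own admin
virtqueue. It also assumes that a state query is still answered after
the device has been stopped.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class AdminOp(IntEnum):
    # Hypothetical opcodes, for illustration only.
    STOP = 1
    GET_VQ_STATE = 2


@dataclass
class L0AdminQueue:
    """Stands in for the admin virtqueue that only L0 talks to."""
    stopped: set = field(default_factory=set)
    avail_idx: dict = field(default_factory=dict)  # (dev_id, vq) -> index
    log: list = field(default_factory=list)        # commands seen, in order

    def submit(self, op, dev_id, vq=0):
        self.log.append((op, dev_id, vq))
        if op is AdminOp.STOP:
            self.stopped.add(dev_id)
            return 0
        if op is AdminOp.GET_VQ_STATE:
            # Assumption: state queries are answered even when stopped.
            return self.avail_idx.get((dev_id, vq), 0)
        raise ValueError(f"unknown admin op {op}")


# Hypothetical register offsets exposed to the L1 guest.
REG_STOP = 0x00
REG_VQ_SELECT = 0x04
REG_VQ_STATE = 0x08


@dataclass
class L1RegisterWindow:
    """What L1 sees: plain registers, trapped and forwarded by L0."""
    aq: L0AdminQueue
    dev_id: int
    vq_select: int = 0

    def write(self, offset, value):
        if offset == REG_STOP and value == 1:
            self.aq.submit(AdminOp.STOP, self.dev_id)
        elif offset == REG_VQ_SELECT:
            # Purely local: selects which vq the next state read refers to.
            self.vq_select = value
        else:
            # This toy model supports no other writes (including resume).
            raise ValueError(f"unhandled write to offset {offset:#x}")

    def read(self, offset):
        if offset == REG_STOP:
            return int(self.dev_id in self.aq.stopped)
        if offset == REG_VQ_STATE:
            return self.aq.submit(AdminOp.GET_VQ_STATE, self.dev_id,
                                  self.vq_select)
        raise ValueError(f"unhandled read from offset {offset:#x}")
```

In this sketch L1 never sees a virtqueue for admin commands; it writes
REG_STOP, selects a vq, and reads REG_VQ_STATE, while the admin queue
traffic stays entirely in L0.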

>
> I would like to learn the advantages of such a method other than simplicity.
>
> We can clearly see that we are shifting away from such PCI registers with SIOV, IMS and other scalable solutions.
> virtio is drifting in the reverse direction by introducing more registers as a transport.
> I expect it to be an optional transport, like AQ.

Actually, I had a proposal of using admin virtqueue as a transport,
it's designed to be SIOV/IMS capable. And it's not hard to extend it
with the state/stop support etc.

>
> > >
> > > Next would be to program hundreds of statistics of the 64 VQs through a
> > > giant PCI config space register in some busy polling scheme.
> >
> > We don't need giant config space, and this method has been implemented
> > by some vDPA vendors.
> >
> There are tens of 64-bit counters per VQ. These need to be programmed on the destination side.
> Programming these via registers requires exposing them as registers.
> In one of the proposals, I see them being queried via CVQ from the device.

I didn't see a proposal like this. And I don't think querying general
virtio state like idx with a device specific CVQ is a good design.

>
> Programming them via cfg registers requires a large cfg space, or synchronous programming that waits for an ACK from the device.
> This means one entry at a time...
>
> Programming them via CVQ requires replicating and aligning cmd values etc. on all device types, which is all duplication and hard to maintain.
>
>
> > >
> > > I can clearly see how all these are inefficient for faster LM.
> > > We need an efficient AQ to proceed with at minimum.
> >
> > I'm fine with admin virtqueue, but the stop and state are orthogonal to that.
> > And using admin virtqueue for stop/state will be more natural if we use
> > admin virtqueue as a transport.
> Ok.
> We should have defined it a bit earlier so that all vendors can use it. :(

I agree.

Thanks
