linux-kernel - Re: [RFC PATCH] vdpa: mandate 1.0 device

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20210511061740-mutt-send-email-mst@kernel.org>
Date:   Wed, 12 May 2021 03:53:46 -0400
From:   "Michael S. Tsirkin" <mst@...hat.com>
To:     Jason Wang <jasowang@...hat.com>
Cc:     virtualization@...ts.linux-foundation.org,
        linux-kernel@...r.kernel.org, parav@...dia.com, elic@...dia.com,
        Lingshan Zhu <lingshan.zhu@...el.com>, mapfelba@...hat.com
Subject: Re: [RFC PATCH] vdpa: mandate 1.0 device

On Tue, May 11, 2021 at 04:43:13AM -0400, Jason Wang wrote:
> 
> 
> ----- 原始邮件 -----
> > 
> > 在 2021/4/21 下午4:03, Michael S. Tsirkin 写道:
> > > On Wed, Apr 21, 2021 at 03:41:36PM +0800, Jason Wang wrote:
> > >> 在 2021/4/12 下午5:23, Jason Wang 写道:
> > >>> 在 2021/4/12 下午5:09, Michael S. Tsirkin 写道:
> > >>>> On Mon, Apr 12, 2021 at 02:35:07PM +0800, Jason Wang wrote:
> > >>>>> 在 2021/4/10 上午12:04, Michael S. Tsirkin 写道:
> > >>>>>> On Fri, Apr 09, 2021 at 12:47:55PM +0800, Jason Wang wrote:
> > >>>>>>> 在 2021/4/8 下午11:59, Michael S. Tsirkin 写道:
> > >>>>>>>> On Thu, Apr 08, 2021 at 04:26:48PM +0800, Jason Wang wrote:
> > >>>>>>>>> This patch mandates 1.0 for vDPA devices. The goal is to have the
> > >>>>>>>>> semantic of normative statement in the virtio
> > >>>>>>>>> spec and eliminate the
> > >>>>>>>>> burden of transitional device for both vDPA bus and vDPA parent.
> > >>>>>>>>>
> > >>>>>>>>> uAPI seems fine since all the vDPA parent mandates
> > >>>>>>>>> VIRTIO_F_ACCESS_PLATFORM which implies 1.0 devices.
> > >>>>>>>>>
> > >>>>>>>>> For legacy guests, it can still work since Qemu will mediate when
> > >>>>>>>>> necessary (e.g doing the endian conversion).
> > >>>>>>>>>
> > >>>>>>>>> Signed-off-by: Jason Wang <jasowang@...hat.com>
> > >>>>>>>> Hmm. If we do this, don't we still have a problem with
> > >>>>>>>> legacy drivers which don't ack 1.0?
> > >>>>>>> Yes, but it's not something that is introduced in this
> > >>>>>>> commit. The legacy
> > >>>>>>> driver never work ...
> > >>>>>> My point is this neither fixes or prevents this.
> > >>>>>>
> > >>>>>> So my suggestion is to finally add ioctls along the lines
> > >>>>>> of PROTOCOL_FEATURES of vhost-user.
> > >>>>>>
> > >>>>>> Then that one can have bits for legacy le, legacy be and modern.
> > >>>>>>
> > >>>>>> BTW I looked at vhost-user and it does not look like that
> > >>>>>> has a solution for this problem either, right?
> > >>>>> Right.
> > >>>>>
> > >>>>>
> > >>>>>>>> Note 1.0 affects ring endianness which is not mediated in QEMU
> > >>>>>>>> so QEMU can't pretend to device guest is 1.0.
> > >>>>>>> Right, I plan to send patches to do mediation in the
> > >>>>>>> Qemu to unbreak legacy
> > >>>>>>> drivers.
> > >>>>>>>
> > >>>>>>> Thanks
> > >>>>>> I frankly think we'll need PROTOCOL_FEATURES anyway, it's
> > >>>>>> too useful ...
> > >>>>>> so why not teach drivers about it and be done with it? You
> > >>>>>> can't emulate
> > >>>>>> legacy on modern in a cross endian situation because of vring
> > >>>>>> endian-ness ...
> > >>>>> So the problem still. This can only work when the hardware can support
> > >>>>> legacy vring endian-ness.
> > >>>>>
> > >>>>> Consider:
> > >>>>>
> > >>>>> 1) the leagcy driver support is non-normative in the spec
> > >>>>> 2) support a transitional device in the kenrel may requires the
> > >>>>> hardware
> > >>>>> support and a burden of kernel codes
> > >>>>>
> > >>>>> I'd rather simply drop the legacy driver support
> > >>>> My point is this patch does not drop legacy support. It merely mandates
> > >>>> modern support.
> > >>>
> > >>> I am not sure I get here. This patch fails the set_feature if VERSION_1
> > >>> is not negotiated. This means:
> > >>>
> > >>> 1) vDPA presents a modern device instead of transitonal device
> > >>> 2) legacy driver can't be probed
> > >>>
> > >>> What I'm missing?
> > >>
> > >> Hi Michael:
> > >>
> > >> Do you agree to find the way to present modern device? We need a
> > >> conclusion
> > >> to make the netlink API work to move forward.
> > >>
> > >> Thanks
> > > I think we need a way to support legacy with no data path overhead. qemu
> > > setting VERSION_1 for a legacy guest affects the ring format so it does
> > > not really work. This seems to rule out emulating config space entirely
> > > in userspace.
> > 
> > 
> > So I'd rather drop the legacy support in this case. It never work for
> > vDPA in the past and virtio-vDPA doesn't even need that. Note that
> > ACCESS_PLATFORM is mandated for all the vDPA parents right now which
> > implies modern device and LE. I wonder what's the value for supporting
> > legacy in this case or do we really encourage vendors to ship card with
> > legacy support (e.g endian support in the hardware)?
> 
> Hi Michael:
> 
> Any thoughts on this approach?
> 
> My understanding is that dropping legacy support will simplify a lot of stuffs.
> 
> Thanks

So basically the main condition is that strong memory barriers aren't
needed for virtio, smp barriers are enough.
Are there architectures besides x86 (where it's kind of true - as long as
one does not use non-temporals) where that is true?
If all these architectures are LE then we don't need to worry
about endian support in the hardware.

In other words I guess yes we could have qemu limit things to x86 and
then just pretend to the card that it's virtio 1.
So endian-ness we can address.

Problem is virtio 1 has effects beyond this. things like header size
with mergeable buffers off for virtio net.

So I am inclined to say let us not do the "pretend it's virtio 1" game
in qemu.  Let us be honest to the card about what happens.
But if you want to limit things to x86 either in kernel or in qemu,
that's ok by me.

> 
> > 
> > 
> > >
> > > So I think we should add an ioctl along the lines of
> > > protocol features. Then I think we can reserve feature bits
> > > for config space format: legacy LE, legacy BE, modern.
> > 
> > 
> > We had VHOST_SET_VRING_ENDIAN but this will complicates both the vDPA
> > parent and management. What's more important, legacy behaviour is not
> > restrictied by the spec.
> > 
> > 
> > >
> > > Querying the feature bits will provide us with info about
> > > what does the device support. Acking them will tell device
> > > what does guest need.
> > 
> > 
> > I think this can work, but I wonder how much we can gain from such
> > complexitiy.
> > 
> > Thanks
> > 
> > 
> > >
> > >
> > >
> > >
> > >
> > >>>
> > >>>>> to have a simple and easy
> > >>>>> abstarction in the kenrel. For legacy driver in the guest,
> > >>>>> hypervisor is in
> > >>>>> charge of the mediation:
> > >>>>>
> > >>>>> 1) config space access endian conversion
> > >>>>> 2) using shadow virtqueue to change the endian in the vring
> > >>>>>
> > >>>>> Thanks
> > >>>> I'd like to avoid shadow virtqueue hacks if at all possible.
> > >>>> Last I checked performance wasn't much better than just emulating
> > >>>> virtio in software.
> > >>>
> > >>> I think the legacy driver support is just a nice to have. Or do you see
> > >>> any value to that? I guess for mellanox and intel, only modern device is
> > >>> supported in the hardware.
> > >>>
> > >>> Thanks
> > >>>
> > >>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>> ---
> > >>>>>>>>>      include/linux/vdpa.h | 6 ++++++
> > >>>>>>>>>      1 file changed, 6 insertions(+)
> > >>>>>>>>>
> > >>>>>>>>> diff --git a/include/linux/vdpa.h b/include/linux/vdpa.h
> > >>>>>>>>> index 0fefeb976877..cfde4ec999b4 100644
> > >>>>>>>>> --- a/include/linux/vdpa.h
> > >>>>>>>>> +++ b/include/linux/vdpa.h
> > >>>>>>>>> @@ -6,6 +6,7 @@
> > >>>>>>>>>      #include <linux/device.h>
> > >>>>>>>>>      #include <linux/interrupt.h>
> > >>>>>>>>>      #include <linux/vhost_iotlb.h>
> > >>>>>>>>> +#include <uapi/linux/virtio_config.h>
> > >>>>>>>>>      /**
> > >>>>>>>>>       * vDPA callback definition.
> > >>>>>>>>> @@ -317,6 +318,11 @@ static inline int
> > >>>>>>>>> vdpa_set_features(struct vdpa_device *vdev, u64
> > >>>>>>>>> features)
> > >>>>>>>>>      {
> > >>>>>>>>>              const struct vdpa_config_ops *ops = vdev->config;
> > >>>>>>>>> +        /* Mandating 1.0 to have semantics of
> > >>>>>>>>> normative statements in
> > >>>>>>>>> +         * the spec. */
> > >>>>>>>>> +        if (!(features & BIT_ULL(VIRTIO_F_VERSION_1)))
> > >>>>>>>>> +        return -EINVAL;
> > >>>>>>>>> +
> > >>>>>>>>>          vdev->features_valid = true;
> > >>>>>>>>>              return ops->set_features(vdev, features);
> > >>>>>>>>>      }
> > >>>>>>>>> --
> > >>>>>>>>> 2.25.1
> > 
> >