Message-ID: <4678fc51-a402-d3ea-e875-6eba175933ba@oracle.com>
Date: Sat, 20 Aug 2022 01:55:36 -0700
From: Si-Wei Liu <si-wei.liu@...cle.com>
To: Jason Wang <jasowang@...hat.com>
Cc: "Michael S. Tsirkin" <mst@...hat.com>,
"Zhu, Lingshan" <lingshan.zhu@...el.com>,
virtualization <virtualization@...ts.linux-foundation.org>,
netdev <netdev@...r.kernel.org>, kvm <kvm@...r.kernel.org>,
Parav Pandit <parav@...dia.com>,
Yongji Xie <xieyongji@...edance.com>,
"Dawar, Gautam" <gautam.dawar@....com>
Subject: Re: [PATCH 2/2] vDPA: conditionally read fields in virtio-net dev
On 8/18/2022 5:42 PM, Jason Wang wrote:
> On Fri, Aug 19, 2022 at 7:20 AM Si-Wei Liu <si-wei.liu@...cle.com> wrote:
>>
>>
>> On 8/17/2022 9:15 PM, Jason Wang wrote:
>>> On 2022/8/17 18:37, Michael S. Tsirkin wrote:
>>>> On Wed, Aug 17, 2022 at 05:43:22PM +0800, Zhu, Lingshan wrote:
>>>>> On 8/17/2022 5:39 PM, Michael S. Tsirkin wrote:
>>>>>> On Wed, Aug 17, 2022 at 05:13:59PM +0800, Zhu, Lingshan wrote:
>>>>>>> On 8/17/2022 4:55 PM, Michael S. Tsirkin wrote:
>>>>>>>> On Wed, Aug 17, 2022 at 10:14:26AM +0800, Zhu, Lingshan wrote:
>>>>>>>>> Yes it is a little messy, and we can not check _F_VERSION_1
>>>>>>>>> because of
>>>>>>>>> transitional devices, so maybe this is the best we can do for now
>>>>>>>> I think vhost generally needs an API to declare config space
>>>>>>>> endian-ness
>>>>>>>> to kernel. vdpa can reuse that too then.
>>>>>>> Yes, I remember you have mentioned some IOCTL to set the endian-ness,
>>>>>>> for vDPA, I think only the vendor driver knows the endian,
>>>>>>> so we may need a new function vdpa_ops->get_endian().
>>>>>>> In the last thread, we say maybe it's better to add a comment for
>>>>>>> now.
>>>>>>> But if you think we should add a vdpa_ops->get_endian(), I can work
>>>>>>> on it for sure!
>>>>>>>
>>>>>>> Thanks
>>>>>>> Zhu Lingshan
>>>>>> I think QEMU has to set endian-ness. No one else knows.
>>>>> Yes, for SW based vhost it is true. But for HW vDPA, only
>>>>> the device & driver knows the endian, I think we can not
>>>>> "set" a hardware's endian.
>>>> QEMU knows the guest endian-ness and it knows that
>>>> device is accessed through the legacy interface.
>>>> It can accordingly send endian-ness to the kernel and
>>>> kernel can propagate it to the driver.
>>>
>>> I wonder if we can simply force LE and then Qemu can do the endian
>>> conversion?
>> convert from LE for config space fields only, or QEMU has to forcefully
>> mediate and convert endianness for all device memory access including
>> even the datapath (fields in descriptor and avail/used rings)?
> Former. Actually, I want to force modern devices for vDPA when
> developing the vDPA framework. But then we see requirements for
> transitional or even legacy (e.g the Ali ENI parent). So it
> complicates things a lot.
>
> I think several ideas have been proposed:
>
> 1) Your proposal of having a vDPA specific way for
> modern/transitional/legacy awareness. This seems very clean since each
> transport should have the ability to do that but it still requires
> some kind of mediation for the case e.g running BE legacy guest on LE
> host.
In theory it seems so, though practically I wonder if we can just
forbid a BE legacy driver from running on a modern LE host. Those who
care about legacy BE guests most likely could, and should, talk to
their vendor to get native BE support to achieve hardware acceleration;
few of them would count on QEMU mediating or emulating the datapath
(otherwise I don't see the benefit of adopting vDPA?). I still feel that
not every hardware vendor has to offer backward compatibility
(a transitional device) with the legacy interface/behavior (BE being
just one aspect). This is unlike the situation with software virtio
devices, which have had legacy support since day one. I think we
discussed this before: for those vDPA vendors who don't offer legacy
guest support, maybe we should mandate some feature, e.g. VERSION_1, as
these devices really don't offer the functionality of the opposite side
(!VERSION_1) during negotiation.
Having said that, perhaps we should also allow a vendor device to
implement only partial support for legacy. We can define "reversed"
backend features to denote parts of the legacy interface/functionality
not implemented by the device. For instance,
VHOST_BACKEND_F_NO_BE_VRING, VHOST_BACKEND_F_NO_BE_CONFIG,
VHOST_BACKEND_F_NO_ALIGNED_VRING, VHOST_BACKEND_NET_F_NO_WRITEABLE_MAC,
et al. Not all of these missing legacy features would be easy for QEMU
to make up for, so QEMU can selectively emulate them on a best-effort
basis when necessary and applicable. In other words, this design
shouldn't prevent QEMU from making up for a vendor device's partial
legacy support.
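
As a rough illustration of how such "reversed" feature bits might be
consumed, here is a minimal sketch. The flag names come from the
paragraph above, but none of them exist in the vhost UAPI; the bit
positions, the enum and plan_legacy_be_guest() are all assumptions made
up for this example.

```c
#include <stdint.h>

/* Hypothetical bit positions: these flags are only a proposal. */
#define VHOST_BACKEND_F_NO_BE_VRING          (1ULL << 60)
#define VHOST_BACKEND_F_NO_BE_CONFIG         (1ULL << 61)
#define VHOST_BACKEND_F_NO_ALIGNED_VRING     (1ULL << 62)
#define VHOST_BACKEND_NET_F_NO_WRITEABLE_MAC (1ULL << 63)

/* What a VMM could decide to do for a legacy big-endian guest. */
enum legacy_plan {
    LEGACY_OFFLOAD,        /* device handles BE rings and BE config itself */
    LEGACY_EMULATE_CONFIG, /* device handles BE rings; VMM fixes up config */
    LEGACY_REJECT,         /* device cannot do BE rings; refuse the guest  */
};

/*
 * Hypothetical policy: a missing BE ring layout is treated as fatal,
 * because mediating the datapath would defeat the purpose of vDPA,
 * while a missing BE config space is cheap enough to emulate.
 */
static enum legacy_plan plan_legacy_be_guest(uint64_t backend_features)
{
    if (backend_features & VHOST_BACKEND_F_NO_BE_VRING)
        return LEGACY_REJECT;
    if (backend_features & VHOST_BACKEND_F_NO_BE_CONFIG)
        return LEGACY_EMULATE_CONFIG;
    return LEGACY_OFFLOAD;
}
```

Under these assumptions, a device advertising only
VHOST_BACKEND_F_NO_BE_CONFIG would still get full datapath offload,
with only its config space mediated.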
>
> 2) Michael suggests using VHOST_SET_VRING_ENDIAN where it means we
> need a new config ops for vDPA bus, but it doesn't solve the issue for
> config space (at least from its name). We probably need a new ioctl
> for both vring and config space.
Yep, adding a new ioctl makes things better, but I think the key is not
the new ioctl. It's whether or not we should require every vDPA vendor
driver to implement all transitional interfaces to be spec compliant. If
we allow them to reject the VHOST_SET_VRING_ENDIAN or
VHOST_SET_CONFIG_ENDIAN call, what could we do? We would still end up
in the same situation of either failing the guest or trying to
mediate/emulate, right?
Not to mention VHOST_SET_VRING_ENDIAN is rarely supported by vhost today
- few distro kernels have CONFIG_VHOST_CROSS_ENDIAN_LEGACY enabled, and
QEMU just ignores the result. It looks like vhost doesn't necessarily
depend on it to determine endianness.
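
The ring endianness decision under discussion can be modelled roughly
as below. This is a simplified user-space sketch, not the vhost code:
the function and parameter names are invented for illustration. It
assumes that VIRTIO_F_VERSION_1 (feature bit 32) always means
little-endian, and that a legacy ring uses the guest's native byte
order, which the backend only knows if it has been told.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_VERSION_1 32

/* Detect the byte order of the machine running this code. */
static bool host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/*
 * Hypothetical helper: decide the byte order of ring fields.
 *  - modern device (VERSION_1 acked): always little-endian
 *  - legacy, endianness explicitly set: follow what was set
 *  - legacy, nothing set: fall back to assuming the guest matches the host
 */
static bool ring_is_little_endian(uint64_t acked_features,
                                  bool endian_was_set, bool guest_is_be)
{
    if (acked_features & (1ULL << VIRTIO_F_VERSION_1))
        return true;
    if (endian_was_set)
        return !guest_is_be;
    return host_is_little_endian();
}

/* Convert one 16-bit ring field to host order, swapping only if needed. */
static uint16_t ring16_to_cpu(bool ring_le, uint16_t raw)
{
    if (ring_le == host_is_little_endian())
        return raw;
    return (uint16_t)((raw << 8) | (raw >> 8));
}
```

The last branch of ring_is_little_endian() is the case in question: if
the set-endian call is unavailable or rejected, a BE legacy guest on an
LE host is silently treated as LE.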
>
> or
>
> 3) revisit the idea of forcing modern only device which may simplify
> things a lot
I am not actually against forcing a modern-only config space, given that
it's not hard for either QEMU or an individual driver to mediate or
emulate, and for the most part it doesn't conflict with the goal of
offload or acceleration with vDPA. But forcing an LE ring layout would,
IMO, just kill off the potential of a very good use case. Currently, for
our use case, the priority of supporting 0.9.5 guests with vDPA is
slightly lower than live migration, but it is still on our TODO list.
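
To show why config space mediation is cheap, here is a minimal sketch
of converting one 16-bit field from an LE-only device config space into
the byte order a legacy guest expects. The helper names and the
two-buffer layout are assumptions for illustration and are not taken
from QEMU or any driver. Byte-array fields such as the MAC address need
no conversion at all.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Load a 16-bit little-endian value from raw config space bytes. */
static uint16_t le16_load(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Store a 16-bit value in the byte order the guest expects. */
static void guest16_store(uint8_t *p, uint16_t v, bool guest_is_be)
{
    if (guest_is_be) {
        p[0] = (uint8_t)(v >> 8);
        p[1] = (uint8_t)(v & 0xff);
    } else {
        p[0] = (uint8_t)(v & 0xff);
        p[1] = (uint8_t)(v >> 8);
    }
}

/*
 * Hypothetical mediation step: copy the 16-bit field at byte offset
 * 'off' from the device's LE config space into the guest-visible copy.
 */
static void mediate_config16(const uint8_t *dev_cfg, uint8_t *guest_cfg,
                             size_t off, bool guest_is_be)
{
    guest16_store(guest_cfg + off, le16_load(dev_cfg + off), guest_is_be);
}
```

This runs only on config accesses, which are rare, whereas the same
treatment of descriptors and avail/used rings would sit on every
datapath operation.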
Thanks,
-Siwei
>
> which way should we go?
>
>> I hope
>> it's not the latter, otherwise it loses the point to use vDPA for
>> datapath acceleration.
>>
>> Even if it's the former, it's a little weird for vendor device to
>> implement a LE config space with BE ring layout, although still possible...
> Right.
>
> Thanks
>
>> -Siwei
>>> Thanks
>>>
>>>
>>>>> So if you think we should add a vdpa_ops->get_endian(),
>>>>> I will drop these comments in the next version of
>>>>> series, and work on a new patch for get_endian().
>>>>>
>>>>> Thanks,
>>>>> Zhu Lingshan
>>>> Guests don't get endian-ness from devices so this seems pointless.
>>>>