netdev - Re: [virtio-dev] Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio

Message-ID: <CADGSJ22h0Ypn=TzHcxWHt5fF3Xi=J2H5AesYyCWFAiHr+vXEJA@mail.gmail.com>
Date:   Wed, 20 Jun 2018 12:59:26 -0700
From:   Siwei Liu <loseweigh@...il.com>
To:     Cornelia Huck <cohuck@...hat.com>
Cc:     "Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
        Alexander Duyck <alexander.h.duyck@...el.com>,
        virtio-dev@...ts.oasis-open.org, aaron.f.brown@...el.com,
        Jiri Pirko <jiri@...nulli.us>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Jakub Kicinski <kubakici@...pl>,
        Netdev <netdev@...r.kernel.org>, qemu-devel@...gnu.org,
        virtualization@...ts.linux-foundation.org, konrad.wilk@...cle.com,
        boris.ostrovsky@...cle.com,
        Joao Martins <joao.m.martins@...cle.com>,
        Venu Busireddy <venu.busireddy@...cle.com>,
        vijay.balakrishna@...cle.com
Subject: Re: [virtio-dev] Re: [Qemu-devel] [PATCH] qemu: Introduce
 VIRTIO_NET_F_STANDBY feature bit to virtio_net

On Wed, Jun 20, 2018 at 7:34 AM, Cornelia Huck <cohuck@...hat.com> wrote:
> On Tue, 19 Jun 2018 13:09:14 -0700
> Siwei Liu <loseweigh@...il.com> wrote:
>
>> On Tue, Jun 19, 2018 at 3:54 AM, Cornelia Huck <cohuck@...hat.com> wrote:
>> > On Fri, 15 Jun 2018 10:06:07 -0700
>> > Siwei Liu <loseweigh@...il.com> wrote:
>> >
>> >> On Fri, Jun 15, 2018 at 4:48 AM, Cornelia Huck <cohuck@...hat.com> wrote:
>> >> > On Thu, 14 Jun 2018 18:57:11 -0700
>> >> > Siwei Liu <loseweigh@...il.com> wrote:
>
>> >> > I'm a bit confused here. What, exactly, ties the two devices together?
>> >>
>> >> The group UUID. Since the QEMU VFIO device has no insight into the
>> >> MAC address (which it doesn't have to), the association between VFIO
>> >> passthrough and standby must be specified for QEMU to understand the
>> >> relationship with this model. Note, the standby feature is no longer
>> >> required to be exposed under this model.
>> >
>> > Isn't that a bit limiting, though?
>> >
>> > With this model, you can probably tie a vfio-pci device and a
>> > virtio-net-pci device together. But this will fail if you have
>> > different transports: Consider tying together a vfio-pci device and a
>> > virtio-net-ccw device on s390, for example. The standby feature bit is
>> > on the virtio-net level and should not have any dependency on the
>> > transport used.
>>
>> Probably we'd limit the support for grouping to virtio-net-pci device
>> and vfio-pci device only. For virtio-net-pci, as you might see with
>> Venu's patch, we store the group UUID on the config space of
>> virtio-pci, which is only applicable to PCI transport.
>>
>> If virtio-net-ccw needs to support the same, I think a similar grouping
>> interface should be defined on the VirtIO CCW transport. I think the
>> current implementation of the Linux failover driver assumes that it's
>> an SR-IOV VF with the same MAC address which the virtio-net-pci needs
>> to pair with, and that the PV path is on the same PF, without needing
>> to update the network about the port-MAC association change. If we need
>> to extend the grouping mechanism to virtio-net-ccw, it has to pass such
>> failover mode to the virtio driver specifically through some other
>> option I guess.
>
> Hm, I've just spent some time reading the Linux failover code and I did
> not really find much pci-related magic in there (other than checking
> for a pci device in net_failover_slave_pre_register). We also seem to
> look for a matching device by MAC only. What magic am I missing?

The existing assumptions around SR-IOV VFs, and thus PCI, are implicit.
A lot of simplifications are built on the fact that the passthrough
device is specifically an SR-IOV Virtual Function rather than anything
else: the MAC addresses of the coupled devices must be the same,
changing the MAC address is prohibited, programming VLAN filters is
challenging, and the datapath of virtio-net has to share the same
physical function that the VF belongs to. There's no handshake during
datapath switching at all to support a normal passthrough device at
this point. I'd imagine there is some work ahead around that, which
might be a bit more involved than just supporting a simplified model
for VF migration.

>
> Is the look-for-uuid handling supposed to happen in the host only?

The look-for-MAC matching scheme is not ideal in many respects. I don't
want to repeat those again, but once the group UUID is added to QEMU,
the failover driver is supposed to switch to the UUID-based matching
scheme in the guest.
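
To illustrate the difference between the two matching schemes, here is
a minimal sketch. It is not the net_failover driver code: the Dev type,
its fields, and both helper functions are hypothetical names made up
for this example, and it assumes the guest can read a group UUID for
each device.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Dev:
    """Hypothetical guest-visible device record (not a kernel structure)."""
    name: str
    mac: str
    group_uuid: Optional[str] = None  # assumed to be exposed by the hypervisor


def match_by_mac(standby: Dev, candidates: List[Dev]) -> List[Dev]:
    """Return every candidate that shares the standby's MAC address."""
    return [d for d in candidates if d.mac.lower() == standby.mac.lower()]


def match_by_uuid(standby: Dev, candidates: List[Dev]) -> List[Dev]:
    """Return candidates whose group UUID equals the standby's; MAC is ignored."""
    if standby.group_uuid is None:
        return []
    return [d for d in candidates if d.group_uuid == standby.group_uuid]


standby = Dev("virtio0", "52:54:00:12:34:56", "uuid-a")
vf = Dev("vf0", "52:54:00:12:34:56", "uuid-a")
stranger = Dev("eth9", "52:54:00:12:34:56")  # same MAC, unrelated device

# MAC matching cannot tell the paired VF from an unrelated device that
# happens to carry the same address; UUID matching picks only the VF.
print([d.name for d in match_by_mac(standby, [vf, stranger])])
print([d.name for d in match_by_uuid(standby, [vf, stranger])])
```

In this toy setup the MAC lookup returns both candidates while the UUID
lookup returns only the device that was explicitly grouped with the
standby.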

>
>> >> > If libvirt already has the knowledge that it should manage the two as a
>> >> > couple, why do we need the group id (or something else for other
>> >> > architectures)? (Maybe I'm simply missing something because I'm not
>> >> > that familiar with pci.)
>> >>
>> >> The idea is to have QEMU control the visibility and enumeration order
>> >> of the passthrough VFIO for the failover scenario. Hotplug can be one
>> >> way to achieve it, and perhaps there are other ways as well. The
>> >> group ID is not just for QEMU to couple devices, it's also helpful to
>> >> the guest, as grouping using MAC address is just not safe.
>> >
>> > Sorry about dragging mainframes into this, but this will only work for
>> > homogenous device coupling, not for heterogenous. Consider my vfio-pci
>> > + virtio-net-ccw example again: The guest cannot find out that the two
>> > belong together by checking some group ID, it has to either use the MAC
>> > or some needs-to-be-architectured property.
>> >
>> > Alternatively, we could propose that mechanism as pci-only, which means
>> > we can rely on mechanisms that won't necessarily work on non-pci
>> > transports. (FWIW, I don't see a use case for using vfio-ccw to pass
>> > through a network card anytime in the near future, due to the nature of
>> > network cards currently in use on s390.)
>>
>> Yes, let's do this just for PCI transport (homogenous) for now.
>
> But why? Using pci for passthrough to make things easier (and because
> there's not really a use case), sure. But I really don't want to
> restrict this to virtio-pci only.

Of course, technically it doesn't have to be virtio-pci only. The
group UUID can even be extended to non-PCI transports. However,
with the current focus of the driver support on SR-IOV VFs and the
limited use cases for non-PCI, I feel no immediate effort is needed on
that front.

>
>> >> In the model of (b), I think it essentially turns hotplug into one of
>> >> the mechanisms for QEMU to control the visibility. Libvirt can still
>> >> manage the hotplug of individual devices during live migration or in
>> >> normal situations to hot add/remove devices. Though the visibility of
>> >> the VFIO is under the control of QEMU, and it's possible that the hot
>> >> add/remove request does not involve actual hot plug activity in guest
>> >> at all.
>> >
>> > That depends on how you model visibility, I guess. You'll probably want
>> > to stop traffic flowing through one or the other of the cards; would
>> > link down or similar be enough for the virtio device?
>>
>> I'm not sure if it is a good idea. The guest user will see two devices
>> with the same MAC but one of them is down. Do you expect the user to
>> use it or not? And since the guest is going to be migrated, we need to
>> unplug a broken VF from the guest before migrating, so why do we bother
>> plugging in this useless VF in the first place?
>
> I was thinking about using hotunplugging only over migration and doing
> the link up only after feature negotiation has finished, but that is
> probably too complicated. Let's stick to hotplug for simplicity's sake.

OK. Thanks for the discussion, it's really useful.

Regards,
-Siwei