netdev - Re: [virtio-dev] Re: [Qemu-devel] [PATCH] qemu: Introduce VIRTIO_NET_F_STANDBY feature bit to virtio

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180615134815.6613620e.cohuck@redhat.com>
Date:   Fri, 15 Jun 2018 13:48:15 +0200
From:   Cornelia Huck <cohuck@...hat.com>
To:     Siwei Liu <loseweigh@...il.com>
Cc:     "Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
        Alexander Duyck <alexander.h.duyck@...el.com>,
        virtio-dev@...ts.oasis-open.org, aaron.f.brown@...el.com,
        Jiri Pirko <jiri@...nulli.us>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Jakub Kicinski <kubakici@...pl>,
        Netdev <netdev@...r.kernel.org>, qemu-devel@...gnu.org,
        virtualization@...ts.linux-foundation.org
Subject: Re: [virtio-dev] Re: [Qemu-devel] [PATCH] qemu: Introduce
 VIRTIO_NET_F_STANDBY feature bit to virtio_net

On Thu, 14 Jun 2018 18:57:11 -0700
Siwei Liu <loseweigh@...il.com> wrote:

> Thank you for sharing your thoughts, Cornelia. With questions below, I
> think you raised really good points, some of which I don't have answer
> yet and would also like to explore here.
> 
> First off, I don't want to push the discussion to the extreme at this
> point, or sell anything about having QEMU manage everything
> automatically. Don't get me wrong, it's not there yet. Let's don't
> assume we are tied to a specific or concerte solution. I think the key
> for our discussion might be to define or refine the boundary between
> VM and guest,  e.g. what each layer is expected to control and manage
> exactly.
> 
> In my view, there might be possibly 3 different options to represent
> the failover device conceipt to QEMU and libvirt (or any upper layer
> software):
> 
> a. Seperate device: in this model, virtio and passthough remains
> separate devices just as today. QEMU exposes the standby feature bit
> for virtio, and publish status/event around the negotiation process of
> this feature bit for libvirt to react upon. Since Libvirt has the
> pairing relationship itself, maybe through MAC address or something
> else, it can control the presence of primary by hot plugging or
> unplugging the passthrough device, although it has to work tightly
> with virtio's feature negotation process. Not just for migration but
> also various corner scenarios (driver/feature ok, device reset,
> reboot, legacy guest etc) along virtio's feature negotiation.

Yes, that one has obvious tie-ins to virtio's modus operandi.

> 
> b. Coupled device: in this model, virtio and passthough devices are
> weakly coupled using some group ID, i.e. QEMU match the passthough
> device for a standby virtio instance by comparing the group ID value
> present behind each device's bridge. Libvirt provides QEMU the group
> ID for both type of devices, and only deals with hot plug for
> migration, by checking some migration status exposed (e.g. the feature
> negotiation status on the virtio device) by QEMU. QEMU manages the
> visibility of the primary in guest along virtio's feature negotiation
> process.

I'm a bit confused here. What, exactly, ties the two devices together?
If libvirt already has the knowledge that it should manage the two as a
couple, why do we need the group id (or something else for other
architectures)? (Maybe I'm simply missing something because I'm not
that familiar with pci.)

> 
> c. Fully combined device: in this model, virtio and passthough devices
> are viewed as a single VM interface altogther. QEMU not just controls
> the visibility of the primary in guest, but can also manage the
> exposure of the passthrough for migratability. It can be like that
> libvirt supplies the group ID to QEMU. Or libvirt does not even have
> to provide group ID for grouping the two devices, if just one single
> combined device is exposed by QEMU. In either case, QEMU manages all
> aspect of such internal construct, including virtio feature
> negotiation, presence of the primary, and live migration.

Same question as above.

> 
> It looks like to me that, in your opinion, you seem to prefer go with
> (a). While I'm actually okay with either (b) or (c). Do I understand
> your point correctly?

I'm not yet preferring anything, as I'm still trying to understand how
this works :) I hope we can arrive at a model that covers the use case
and that is also flexible enough to be extended to other platforms.

> 
> The reason that I feel that (a) might not be ideal, just as Michael
> alluded to (quoting below), is that as management stack, it really
> doesn't need to care about the detailed process of feature negotiation
> (if we view the guest presence of the primary as part of feature
> negotiation at an extended level not just virtio). All it needs to be
> done is to hand in the required devices to QEMU and that's all. Why do
> we need to addd various hooks, events for whichever happens internally
> within the guest?
> 
> ''
> Primary device is added with a special "primary-failover" flag.
> A virtual machine is then initialized with just a standby virtio
> device. Primary is not yet added.
> 
> Later QEMU detects that guest driver device set DRIVER_OK.
> It then exposes the primary device to the guest, and triggers
> a device addition event (hot-plug event) for it.
> 
> If QEMU detects guest driver removal, it initiates a hot-unplug sequence
> to remove the primary driver.  In particular, if QEMU detects guest
> re-initialization (e.g. by detecting guest reset) it immediately removes
> the primary device.
> ''
> 
> and,
> 
> ''
> management just wants to give the primary to guest and later take it back,
> it really does not care about the details of the process,
> so I don't see what does pushing it up the stack buy you.
> 
> So I don't think it *needs* to be done in libvirt. It probably can if you
> add a bunch of hooks so it knows whenever vm reboots, driver binds and
> unbinds from device, and can check that backup flag was set.
> If you are pushing for a setup like that please get a buy-in
> from libvirt maintainers or better write a patch.
> ''

This actually seems to mean the opposite to me: We need to know what
the guest is doing and when, as it directly drives what we need to do
with the devices. If we switch to a visibility vs a hotplug model (see
the other mail), we might be able to handle that part within qemu.
However, I don't see how you get around needing libvirt to actually set
this up in the first place and to handle migration per se.