Date:   Tue, 20 Feb 2018 17:29:33 +0100
From:   Jiri Pirko <jiri@...nulli.us>
To:     Alexander Duyck <alexander.duyck@...il.com>
Cc:     Sridhar Samudrala <sridhar.samudrala@...el.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Stephen Hemminger <stephen@...workplumber.org>,
        David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>,
        virtualization@...ts.linux-foundation.org,
        virtio-dev@...ts.oasis-open.org,
        "Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
        "Duyck, Alexander H" <alexander.h.duyck@...el.com>,
        Jakub Kicinski <kubakici@...pl>,
        Jason Wang <jasowang@...hat.com>,
        Siwei Liu <loseweigh@...il.com>
Subject: Re: [RFC PATCH v3 0/3] Enable virtio_net to act as a backup for a
 passthru device

Tue, Feb 20, 2018 at 05:04:29PM CET, alexander.duyck@...il.com wrote:
>On Tue, Feb 20, 2018 at 2:42 AM, Jiri Pirko <jiri@...nulli.us> wrote:
>> Fri, Feb 16, 2018 at 07:11:19PM CET, sridhar.samudrala@...el.com wrote:
>>>Patch 1 introduces a new feature bit VIRTIO_NET_F_BACKUP that can be
>>>used by the hypervisor to indicate that the virtio_net interface should act as
>>>a backup for another device with the same MAC address.
>>>
>>>Patch 2 is in response to the community request for a 3-netdev
>>>solution.  However, it creates some issues we'll get into in a moment.
>>>It extends virtio_net to use an alternate datapath when available and
>>>registered. When the BACKUP feature is enabled, the virtio_net driver creates
>>>an additional 'bypass' netdev that acts as a master device and controls
>>>2 slave devices.  The original virtio_net netdev is registered as
>>>'backup' netdev and a passthru/vf device with the same MAC gets
>>>registered as 'active' netdev. Both 'bypass' and 'backup' netdevs are
>>>associated with the same 'pci' device.  The user accesses the network
>>>interface via 'bypass' netdev. The 'bypass' netdev chooses 'active' netdev
>>>as default for transmits when it is available with link up and running.
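
A minimal sketch of the datapath selection described in the quoted cover
letter above. All identifiers (virtnet_bypass_*, the active/backup fields)
and the feature bit value are illustrative assumptions, not taken from the
posted patches:

#include <linux/netdevice.h>

/* Patch 1: new feature bit; the value 62 is an assumption. */
#define VIRTIO_NET_F_BACKUP	62

struct virtnet_bypass {
	struct net_device __rcu *active;	/* passthru/VF slave */
	struct net_device __rcu *backup;	/* original virtio_net slave */
};

/* Transmit on the 'active' slave when it is present, running and has
 * link; otherwise fall back to the virtio_net 'backup' slave. */
static netdev_tx_t virtnet_bypass_start_xmit(struct sk_buff *skb,
					     struct net_device *dev)
{
	struct virtnet_bypass *vb = netdev_priv(dev);
	struct net_device *slave;

	rcu_read_lock();
	slave = rcu_dereference(vb->active);
	if (!slave || !netif_running(slave) || !netif_carrier_ok(slave))
		slave = rcu_dereference(vb->backup);
	if (slave) {
		skb->dev = slave;
		dev_queue_xmit(skb);
	} else {
		dev_kfree_skb_any(skb);
		dev->stats.tx_dropped++;
	}
	rcu_read_unlock();
	return NETDEV_TX_OK;
}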
>>
>> Sorry, but this is ridiculous. You are apparently re-implementing part
>> of the bonding driver as part of a NIC driver. The bond and team drivers
>> are mature solutions, well tested, broadly used, with lots of issues
>> resolved in the past. What you are trying to introduce is a weird shortcut
>> that already has a couple of issues, as you mentioned, and will certainly
>> have many more. Also, I'm pretty sure that in the future someone will come
>> up with ideas like multiple VFs, LACP and similar bonding things.
>
>The problem with the bond and team drivers is that they are too large and
>have too many interfaces available for configuration, so as a result
>they can really screw this interface up.

What? Too large in which sense? Why is "too many interfaces" a problem?
Also, team has only one interface to userspace: team generic netlink.


>
>Essentially this is meant to be a bond that is more-or-less managed by
>the host, not the guest. We want the host to be able to configure it

How is it managed by the host? In your use case the guest has 2 netdevs:
virtio_net and a PCI VF.
I don't see how the host can do any managing of that, other than the
obvious. But still, the active/backup decision is made in the guest. This is
a simple bond/team use case. As I said, something needs to be
implemented in userspace to handle the re-appearance of the VF netdev,
but that should be fairly easy to do in teamd.


>and have it automatically kick in on the guest. For now we want to
>avoid adding too much complexity as this is meant to be just the first

That's what I fear: "for now"...


>step. Trying to go in and implement the whole solution right from the
>start based on existing drivers is going to be a massive time sink and
>will likely never get completed, because there will always be some
>other thing that interferes.

"implement the whole solution right from the start based on existing
drivers" - what solution are you talking about? I don't understand this
paragraph.


>
>My personal hope is that we can look at doing a virtio-bond sort of
>device that will handle all this as well as providing a communication
>channel, but that is much further down the road. For now we only have
>a single bit, so the goal is to keep this as simple as possible.

Oh. So there really is an intention to re-implement bonding in virtio.
That is plain wrong, in my opinion.

Could you just use bond/team, please, and not reinvent the wheel with
this abomination?


>
>> What is the reason for this abomination? According to:
>> https://marc.info/?l=linux-virtualization&m=151189725224231&w=2
>> The reason is quite weak.
>> The user in the VM sees 2 (or more) netdevices, puts them in a bond/team,
>> and that's it. This works now! If the VM lacks some userspace features,
>> let's fix it there! For example, the MAC changes are something that could
>> easily be handled in the teamd userspace daemon.
>
>I think you might have missed the point of this. This is meant to be a
>simple interface, so the guest should not be able to change the MAC
>address, and it shouldn't require any userspace daemon to set up or
>tear down. Ideally, with this solution the virtio bypass will come up
>and be assigned the name of the original virtio, and the "backup"
>interface will come up and be assigned the name of the original virtio
>with an additional "nbackup" tacked on via the phys_port_name. Then,
>whenever a VF is added, it will automatically be enslaved by the
>bypass interface, and it will be removed when the VF is hotplugged
>out.
>
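A hedged sketch of the naming and auto-enslavement behaviour described
above; the function names, the single-instance bypass_dev lookup and the
omitted helper bodies are all assumptions for illustration, not code from
the posted patches:

#include <linux/netdevice.h>
#include <linux/etherdevice.h>
#include <linux/notifier.h>

/* Assumed single 'bypass' master instance, for illustration. */
static struct net_device *bypass_dev;

/* Hypothetical helpers; bodies omitted. */
static int virtnet_bypass_enslave(struct net_device *master,
				  struct net_device *slave);
static void virtnet_bypass_release(struct net_device *master,
				   struct net_device *slave);

/* The "nbackup" suffix is reported through phys_port_name so that
 * udev/systemd can derive the backup slave's name from the original
 * virtio_net name plus this suffix. */
static int virtnet_backup_get_phys_port_name(struct net_device *dev,
					     char *name, size_t len)
{
	if (snprintf(name, len, "nbackup") >= len)
		return -EOPNOTSUPP;
	return 0;
}

/* Auto-enslavement: when a netdev with the same MAC registers (e.g. a
 * hotplugged VF) it is attached as the 'active' slave, and detached
 * again when it unregisters. (A real implementation would also have
 * to skip the virtio_net backup slave itself, which shares the MAC.) */
static int virtnet_bypass_notify(struct notifier_block *nb,
				 unsigned long event, void *ptr)
{
	struct net_device *slave = netdev_notifier_info_to_dev(ptr);

	if (!bypass_dev || slave == bypass_dev ||
	    !ether_addr_equal(slave->dev_addr, bypass_dev->dev_addr))
		return NOTIFY_DONE;

	switch (event) {
	case NETDEV_REGISTER:
		virtnet_bypass_enslave(bypass_dev, slave);
		break;
	case NETDEV_UNREGISTER:
		virtnet_bypass_release(bypass_dev, slave);
		break;
	}
	return NOTIFY_OK;
}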
>In my mind the difference between this and bond or team is where the
>configuration interface lies. In the case of bond it is in the kernel.
>If my understanding is correct, team is mostly in user space. With this
>the configuration interface is really down in the hypervisor and
>requests are communicated up to the guest. I would prefer not to make
>virtio_net dependent on the bonding or team drivers, or worse yet a
>userspace daemon in the guest. For now I would argue we should keep
>this as simple as possible just to support basic live migration. There
>have already been discussions of refactoring this after it is in, so
>that we can start to combine the functionality here with what is there
>in bonding/team, but the differences in configuration interface and
>the size of the code bases will make it challenging to outright merge
>this into something like that.
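
As a sketch of that split, under the assumption that the feature bit is the
only thing the hypervisor exposes to the guest, the virtio_net probe path
would merely check the bit before creating the bypass master (the helper
name and feature bit value are hypothetical):

#include <linux/virtio.h>
#include <linux/virtio_config.h>
#include <linux/netdevice.h>

#define VIRTIO_NET_F_BACKUP	62	/* illustrative value, as in the sketch above */

/* Hypothetical helper that registers the 'bypass' master and enslaves
 * the original virtio_net netdev as its 'backup' slave; body omitted. */
static int virtnet_bypass_create(struct virtio_device *vdev,
				 struct net_device *virtio_netdev);

static int virtnet_maybe_create_bypass(struct virtio_device *vdev,
				       struct net_device *virtio_netdev)
{
	/* Hypervisor did not negotiate BACKUP: behave as plain virtio_net. */
	if (!virtio_has_feature(vdev, VIRTIO_NET_F_BACKUP))
		return 0;

	return virtnet_bypass_create(vdev, virtio_netdev);
}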
