netdev - Re: [virtio-dev] Re: [RFC PATCH net-next v2 2/2] virtio_net: Extend virtio to use VF datapath when available

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAKgT0UfEHc6T4MbAwFjx4c2TK1hL-_eGDpSiBEY_ryN_4UaPoQ@mail.gmail.com>
Date:   Sun, 28 Jan 2018 20:26:53 -0800
From:   Alexander Duyck <alexander.duyck@...il.com>
To:     Stephen Hemminger <stephen@...workplumber.org>
Cc:     Jakub Kicinski <kubakici@...pl>,
        "Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
        "Michael S. Tsirkin" <mst@...hat.com>,
        Siwei Liu <loseweigh@...il.com>,
        David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>,
        virtualization@...ts.linux-foundation.org,
        virtio-dev@...ts.oasis-open.org,
        "Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
        Alexander Duyck <alexander.h.duyck@...el.com>
Subject: Re: [virtio-dev] Re: [RFC PATCH net-next v2 2/2] virtio_net: Extend
 virtio to use VF datapath when available

On Sun, Jan 28, 2018 at 3:02 PM, Stephen Hemminger
<stephen@...workplumber.org> wrote:
> On Fri, 26 Jan 2018 18:30:03 -0800
> Jakub Kicinski <kubakici@...pl> wrote:
>
>> On Fri, 26 Jan 2018 15:30:35 -0800, Samudrala, Sridhar wrote:
>> > On 1/26/2018 2:47 PM, Jakub Kicinski wrote:
>> > > On Sat, 27 Jan 2018 00:14:20 +0200, Michael S. Tsirkin wrote:
>> > >> On Fri, Jan 26, 2018 at 01:46:42PM -0800, Siwei Liu wrote:
>> > >>>> and the VM is not expected to do any tuning/optimizations on the VF driver
>> > >>>> directly,
>> > >>>> i think the current patch that follows the netvsc model of 2 netdevs(virtio
>> > >>>> and vf) should
>> > >>>> work fine.
>> > >>> OK. For your use case that's fine. But that's too specific scenario
>> > >>> with lots of restrictions IMHO, perhaps very few users will benefit
>> > >>> from it, I'm not sure. If you're unwilling to move towards it, we'd
>> > >>> take this one and come back with a generic solution that is able to
>> > >>> address general use cases for VF/PT live migration .
>> > >> I think that's a fine approach. Scratch your own itch!  I imagine a very
>> > >> generic virtio-switchdev providing host routing info to guests could
>> > >> address lots of usecases. A driver could bind to that one and enslave
>> > >> arbitrary other devices.  Sounds reasonable.
>> > >>
>> > >> But given the fundamental idea of a failover was floated at least as
>> > >> early as 2013, and made 0 progress since precisely because it kept
>> > >> trying to address more and more features, and given netvsc is already
>> > >> using the basic solution with some success, I'm not inclined to block
>> > >> this specific effort waiting for the generic one.
>> > > I think there is an agreement that the extra netdev will be useful for
>> > > more advanced use cases, and is generally preferable.  What is the
>> > > argument for not doing that from the start?  If it was made I must have
>> > > missed it.  Is it just unwillingness to write the extra 300 lines of
>> > > code?  Sounds like a pretty weak argument when adding kernel ABI is at
>> > > stake...
>> >
>> > I am still not clear on the need for the extra netdev created by
>> > virtio_net. The only advantage i can see is that the stats can be
>> > broken between VF and virtio datapaths compared to the aggregrated
>> > stats on virtio netdev as seen with the 2 netdev approach.
>>
>> Maybe you're not convinced but multiple arguments were made.
>>
>> > With 2 netdev model, any VM image that has a working network
>> > configuration will transparently get VF based acceleration without
>> > any changes.
>>
>> Nothing happens transparently.  Things may happen automatically.  The
>> VF netdev doesn't disappear with netvsc.  The PV netdev transforms into
>> something it did not use to be.  And configures and reports some
>> information from the PV (e.g. speed) but PV doesn't pass traffic any
>> longer.
>>
>> > 3 netdev model breaks this configuration starting with the creation
>> > and naming of the 2 devices to udev needing to be aware of master and
>> > slave virtio-net devices.
>>
>> I don't understand this comment.  There is one virtio-net device and
>> one "virtio-bond" netdev.  And user space has to be aware of the special
>> automatic arrangement anyway, because it can't touch the VF.  It
>> doesn't make any difference whether it ignores the VF or PV and VF.
>> It simply can't touch the slaves, no matter how many there are.
>>
>> > Also, from a user experience point of view, loading a virtio-net with
>> > BACKUP feature enabled will now show 2 virtio-net netdevs.
>>
>> One virtio-net and one virtio-bond, which represents what's happening.
>>
>> > For live migration with advanced usecases that Siwei is suggesting, i
>> > think we need a new driver with a new device type that can track the
>> > VF specific feature settings even when the VF driver is unloaded.
>
> I see no added value of the 3 netdev model, there is no need for a bond
> device.

I agree a full-blown bond isn't what is needed. However, just forking
traffic out from virtio to a VF doesn't really solve things either.

One of the issues as I see it is the fact that the qdisc model with
the merged device gets to be pretty ugly. There is the fact that every
packet that goes to the VF has to pass through the qdisc code twice.
There is the dual nature of the 2 netdev solution that also introduces
the potential for head-of-line blocking since the virtio could put
back pressure on the upper qdisc layer which could stall the VF
traffic when switching over. I hope we could avoid issues like that by
maintaining qdiscs per device queue instead of on an upper device that
is half software interface and half not. Ideally the virtio-bond
device could operate without a qdisc and without needing any
additional locks so there shouldn't be head of line blocking occurring
between the two interfaces and overhead could be kept minimal.

Also in the case of virtio there is support for in-driver XDP. As
Sridhar stated, when using the 2 netdev model "we cannot support XDP
in this model and it needs to be disabled". That sounds like a step
backwards instead of forwards. I would much rather leave the XDP
enabled at the lower dev level, and then if we want we can use the
generic XDP at the virtio-bond level to capture traffic on both
interfaces at the same time.

In the case of netvsc you have control of both sides of a given link
so you can match up the RSS tables, queue configuration and everything
is somewhat symmetric since you are running the PF and all the HyperV
subchannels. Most of the complexity is pushed down into the host and
your subchannel management is synchronized there if I am not mistaken.
We don't have this in the case of this virtio-bond setup. Instead a
single bit is set indicating "backup" without indicating what that
means to topology other than the fact that this virtio interface is
the backup for some other interface. We are essentially blind other
than having the link status for the VF and virtio and knowing that the
virtio is intended to be the backup.

We might be able to get to a 2 or maybe even a 1 netdev solution at
some point in the future, but  I don't think that time is now. For now
a virtio-bond type solution would allow us to address most of the use
cases with minimal modification to the existing virtio and can deal
with feature and/or resource asymmetry.

- Alex