lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20180307221001-mutt-send-email-mst@kernel.org>
Date:   Wed, 7 Mar 2018 22:11:52 +0200
From:   "Michael S. Tsirkin" <mst@...hat.com>
To:     Stephen Hemminger <stephen@...workplumber.org>
Cc:     Alexander Duyck <alexander.duyck@...il.com>,
        Jiri Pirko <jiri@...nulli.us>,
        Sridhar Samudrala <sridhar.samudrala@...el.com>,
        David Miller <davem@...emloft.net>,
        Netdev <netdev@...r.kernel.org>, virtio-dev@...ts.oasis-open.org,
        "Brandeburg, Jesse" <jesse.brandeburg@...el.com>,
        "Duyck, Alexander H" <alexander.h.duyck@...el.com>,
        Jakub Kicinski <kubakici@...pl>
Subject: Re: [PATCH v4 2/2] virtio_net: Extend virtio to use VF datapath when
 available

On Wed, Mar 07, 2018 at 10:06:30AM -0800, Stephen Hemminger wrote:
> On Wed, 7 Mar 2018 09:50:50 -0800
> Alexander Duyck <alexander.duyck@...il.com> wrote:
> 
> > On Tue, Mar 6, 2018 at 6:38 PM, Michael S. Tsirkin <mst@...hat.com> wrote:
> > > On Tue, Mar 06, 2018 at 03:27:46PM -0800, Alexander Duyck wrote:  
> > >> > I definitelly vote for a separate common shared code for both netvsc and
> > >> > virtio_net - even if you use 2 and 3 netdev model, you could share the
> > >> > common code. Strict checks and limitation should be in place.  
> > >>
> > >> Noted. But as I also mentioned there isn't that much "common" code
> > >> between the two models. I think if anything we could probably look at
> > >> peeling out a few bits such as "get_<iface>_bymac" which really would
> > >> become dev_get_by_mac_and_ops in order to find the device for the
> > >> notifiers. I probably wouldn't even put that in our driver and would
> > >> instead put it in the core code since it almost makes more sense
> > >> there. Beyond that sharing becomes much more challenging due to the
> > >> differences in the Rx and Tx paths that build out of the difference
> > >> between the 2 driver and 3 driver models.  
> > >
> > > At this point it might be worth it to articulate the advantages
> > > of the 3 netdev model.  
> > 
> > The main advantages are performance on the lower devices. Specifically
> > we can run LLTX and IFF_NOQUEUE on the upper device, meaning the VF
> > doesn't take a performance hit when we start transmitting through it.
> > In addition the paravirtual device is able to fully expose it's
> > features, so for example virtio_net can maintain in-driver XDP, and we
> > can provide generic XDP on the upper device. We can also have an
> > asymmetric configuration where the number of queues or feature sets
> > can be different between the ports.
> > 
> > Basically what it comes down to is in the 3 netdev model we never have
> > to deal with the issues of going from being a standard netdev to being
> > a master netdev. The role of each netdev is clearly defined.
> > 
> > As far as doing a generic solution it is the preferred way to go as
> > well since we don't have to rewrite functionality of the lower device.
> > Currently the only changes really needed are to add a phys_port_name
> > operation so that you can distinguish between the bypass interface and
> > the original paravirtual one. As such you have no confusion about what
> > you are running.
> > 
> > > If they are compelling, why wouldn't netvsc users want them?  
> > 
> > I believe part of the logic for Stephen's choice is that netvsc
> > doesn't have a "BACKUP" bit like what we have virtio_net. As a result
> > they have no way of knowing if a VF will ever show up or not. That
> > makes the 3 netdev solution less appealing as they always end up in
> > the bonding mode. In addition they have intentionally limited
> > themselves to avoid features such as XDP that would cause issues with
> > the 2 netdev model.
> > 
> > > Alex, I think you were one of the strongest proponents of this model,
> > > you should be well placed to provide a summary.  
> > 
> > Hopefully the information provided helps. In my mind the issue is that
> > netvsc was not designed correctly, and as such it is using the 2
> > netdev model as a bit of a crutch to handle the case where they wanted
> > to add a VF bypass. At this point it has become a part of their
> > architecture so it would be confusing for their user base to deal with
> > having 2 netdevs spawn every time their driver sees a new device. In
> > the case of virtio we only spawn 2 netdevs if the backup bit is set
> > which implies a specific use case. When the bit isn't set we are only
> > spawning one device, and as a result we can get much better
> > performance out of the resultant combination of devices, and can
> > expose functionality such as in-driver XDP in virtio.
> 
> 
> I have experimented with toggling IFF_NOQUEUE and/or LLTX on the netvsc
> device when doing passthrough. It didn't help performance much and there
> are corner cases that make it painful, like what if qdisc was placed
> by user on the netvsc device?  Should the qdisc then be moved back
> and forth to the VF device when it arrives or is removed?
> 
> As far as XDP support, it is on the plan to support XDP on the netvsc
> device since the receive buffers do have to be copied. But there has
> been no customer demand for it; and the VF model on Azure has limitations
> which make a transparent XDP model pretty much impossible.


Jiri, does that answer your question? Even though there is some
similarity, the needs of netvsc and virtio appear significantly
different.

We do not yet know whether other PV devices will be virtio-like
or netvsc-like.

Do you agree?

-- 
MST

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ