Message-ID: <20180608161801.64afda65@xeon-e3>
Date:   Fri, 8 Jun 2018 16:18:01 -0700
From:   Stephen Hemminger <stephen@...workplumber.org>
To:     Siwei Liu <loseweigh@...il.com>
Cc:     "Michael S. Tsirkin" <mst@...hat.com>,
        Jiri Pirko <jiri@...nulli.us>, kys@...rosoft.com,
        haiyangz@...rosoft.com, David Miller <davem@...emloft.net>,
        "Samudrala, Sridhar" <sridhar.samudrala@...el.com>,
        Netdev <netdev@...r.kernel.org>,
        Stephen Hemminger <sthemmin@...rosoft.com>
Subject: Re: [PATCH net] failover: eliminate callback hell

On Fri, 8 Jun 2018 15:25:59 -0700
Siwei Liu <loseweigh@...il.com> wrote:

> On Wed, Jun 6, 2018 at 2:24 PM, Stephen Hemminger
> <stephen@...workplumber.org> wrote:
> > On Wed, 6 Jun 2018 15:30:27 +0300
> > "Michael S. Tsirkin" <mst@...hat.com> wrote:
> >  
> >> On Wed, Jun 06, 2018 at 09:25:12AM +0200, Jiri Pirko wrote:  
> >> > Tue, Jun 05, 2018 at 05:42:31AM CEST, stephen@...workplumber.org wrote:  
> >> > >The net failover should be a simple library, not a virtual
> >> > >object with function callbacks (see callback hell).  
> >> >
> >> > Why just a library? It should do a common things. I think it should be a
> >> > virtual object. Looks like your patch again splits the common
> >> > functionality into multiple drivers. That is kind of backwards attitude.
> >> > I don't get it. We should rather focus on fixing the mess the
> >> > introduction of netvsc-bonding caused and switch netvsc to 3-netdev
> >> > model.  
> >>
> >> So it seems that at least one benefit for netvsc would be better
> >> handling of renames.
> >>
> >> Question is how can this change to 3-netdev happen?  Stephen is
> >> concerned about risk of breaking some userspace.
> >>
> >> Stephen, this seems to be the usecase that IFF_HIDDEN was trying to
> >> address, and you said then "why not use existing network namespaces
> >> rather than inventing a new abstraction". So how about it then? Do you
> >> want to find a way to use namespaces to hide the PV device for netvsc
> >> compatibility?
> >>  
> >
> > Netvsc can't work with 3 dev model. MS has worked with enough distro's and
> > startups that all demand eth0 always be present. And VF may come and go.
> > After this history, there is a strong motivation not to change how kernel
> > behaves. Switching to 3 device model would be perceived as breaking
> > existing userspace.
> >
> > With virtio you can  work it out with the distro's yourself.
> > There is no pre-existing semantics to deal with.
> >
> > For the virtio, I don't see the need for IFF_HIDDEN.  
> 
> I have a somewhat different view regarding IFF_HIDDEN. The purpose of
> that flag, as well as the 1-netdev model, is to have a means to
> inherit the interface name from the VF, and to eliminate playing hacks
> around renaming devices, customizing udev rules and et al. Why
> inheriting VF's name important? To allow existing config/setup around
> VF continues to work across kernel feature upgrade. Most of network
> config files in all distros are based on interface names. Few are MAC
> address based but making lower slaves hidden would cover the rest. And
> most importantly, preserving the same level of user experience as
> using raw VF interface once getting all ndo_ops and ethtool_ops
> exposed. This is essential to realize transparent live migration that
> users dont have to learn and be aware of the undertaken.

Inheriting the VF name will fail in the migration scenario.
It is perfectly reasonable to migrate a guest to another machine where
the VF PCI address is different. And since the current udev/systemd model
bases the network device name on the PCI address, the device will change
name when the guest is migrated.

On Azure, the VF may be removed (by the host) at any time and then later
reattached. There is no guarantee that the VF will show back up at
the same synthetic PCI address. It will likely have a different
PCI domain value.
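
To illustrate the naming problem, here is a minimal sketch of how a
PCI-path-based name follows the PCI address. It only approximates the
systemd predictable-name pattern (en[P<domain>]p<bus>s<slot>[f<function>]);
the helper name pci_path_name and the sample addresses are made up for
illustration, and multifunction detection, port suffixes, and
onboard/slot-based names are deliberately ignored.

```python
import re

# Hypothetical helper, not part of udev/systemd: a simplified model of the
# PCI-path portion of predictable network interface names.
_PCI_ADDR = re.compile(
    r"^([0-9a-fA-F]{4,}):([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def pci_path_name(pci_addr: str) -> str:
    """Derive an 'en...' style name from a domain:bus:slot.function address."""
    match = _PCI_ADDR.match(pci_addr)
    if match is None:
        raise ValueError(f"not a PCI address: {pci_addr!r}")
    # The address fields are hexadecimal; the name uses decimal numbers.
    domain, bus, slot, func = (int(field, 16) for field in match.groups())

    name = "en"
    if domain > 0:
        # The domain only appears in the name when it is non-zero.
        name += f"P{domain}"
    name += f"p{bus}s{slot}"
    if func > 0:
        # Simplification: real udev also adds this for multifunction devices.
        name += f"f{func}"
    return name


# Illustrative addresses only: the same VF seen before and after a move.
before_migration = pci_path_name("0000:02:00.0")
after_migration = pci_path_name("0000:05:00.0")
before_reattach = pci_path_name("0001:00:02.0")
after_reattach = pci_path_name("0002:00:02.0")
```

Under these assumptions, a VF that moves from bus 2 to bus 5 goes from
enp2s0 to enp5s0, and a VF that comes back in a different PCI domain
goes from enP1p0s2 to enP2p0s2, so any configuration keyed on the old
name no longer matches.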
