Message-ID: <20191231093614.75da9bea@hermes.lan>
Date:   Tue, 31 Dec 2019 09:36:14 -0800
From:   Stephen Hemminger <stephen@...workplumber.org>
To:     Haiyang Zhang <haiyangz@...rosoft.com>
Cc:     Roman Kagan <rkagan@...tuozzo.com>,
        "sashal@...nel.org" <sashal@...nel.org>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        KY Srinivasan <kys@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        "olaf@...fle.de" <olaf@...fle.de>, vkuznets <vkuznets@...hat.com>,
        "davem@...emloft.net" <davem@...emloft.net>
Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
 offer sequence and use async probe

On Tue, 31 Dec 2019 16:12:36 +0000
Haiyang Zhang <haiyangz@...rosoft.com> wrote:

> > -----Original Message-----
> > From: Roman Kagan <rkagan@...tuozzo.com>
> > Sent: Tuesday, December 31, 2019 6:35 AM
> > To: Haiyang Zhang <haiyangz@...rosoft.com>
> > Cc: sashal@...nel.org; linux-hyperv@...r.kernel.org; netdev@...r.kernel.org;
> > KY Srinivasan <kys@...rosoft.com>; Stephen Hemminger
> > <sthemmin@...rosoft.com>; olaf@...fle.de; vkuznets
> > <vkuznets@...hat.com>; davem@...emloft.net; linux-kernel@...r.kernel.org
> > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> > offer sequence and use async probe
> > 
> > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:  
> > > The dev_num field in vmbus channel structure is assigned to the first
> > > available number when the channel is offered. So netvsc driver uses it
> > > for NIC naming based on channel offer sequence. Now re-enable the
> > > async probing mode for faster probing.
> > >
> > > Signed-off-by: Haiyang Zhang <haiyangz@...rosoft.com>
> > > ---
> > >  drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> > >  1 file changed, 15 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/net/hyperv/netvsc_drv.c
> > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644
> > > --- a/drivers/net/hyperv/netvsc_drv.c
> > > +++ b/drivers/net/hyperv/netvsc_drv.c
> > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> > >  	struct net_device_context *net_device_ctx;
> > >  	struct netvsc_device_info *device_info = NULL;
> > >  	struct netvsc_device *nvdev;
> > > +	char name[IFNAMSIZ];
> > >  	int ret = -ENOMEM;
> > >
> > > -	net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > > -				VRSS_CHANNEL_MAX);
> > > +	snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);  
> > 
> > How is this supposed to work when there are other ethernet device types on the
> > system, which may claim the same device names?
> >   
> > > +	net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > > +			       NET_NAME_ENUM, ether_setup,
> > > +			       VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > > +
> > >  	if (!net)
> > >  		goto no_net;
> > >
> > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> > >  		net->max_mtu = ETH_DATA_LEN;
> > >
> > >  	ret = register_netdevice(net);
> > > +
> > > +	if (ret == -EEXIST) {
> > > +		pr_info("NIC name %s exists, request another name.\n",
> > > +			net->name);
> > > +		strlcpy(net->name, "eth%d", IFNAMSIZ);
> > > +		ret = register_netdevice(net);
> > > +	}  
> > 
> > IOW you want the device naming to be predictable, but don't guarantee this?
> > 
> > I think the problem this patchset is trying to solve is much better solved with a
> > udev rule, similar to how it's done for PCI net devices.
> > And IMO the primary channel number, being a device's "hardware"
> > property, is more suited to be used in the device name, than this completely
> > ephemeral device number.  
> 
> The vmbus number can be affected by other types of devices and/or subchannel
> offerings. They are not stable either. That's why this patch set keeps track of the 
> offering sequence within the same device type in a new variable "dev_num".
> 
> As in my earlier email, to avoid impact by other types of NICs, we should put them
> into different naming formats, like "vf*", "enP*", etc. And yes, these can be done in
> udev.
> 
> But for netvsc (synthetic) NICs, we still want the default naming format "eth*". And
> the variable "dev_num" gives them the basis for stable naming with Async probing.
> 
> Thanks,
> - Haiyang
> 

The primary requirements for network naming are:
  1. Network names must be repeatable on each boot. This was the original problem
     that PCI devices ran into years ago when parallel probing was enabled.
  2. Network names must be predictable. If a new VM is created, its names should
     match those of a VM with a similar config.
  3. Names must be persistent. If a NIC is added or deleted, the existing names
     must not change.

The other things which are important (and this proposal breaks):
  1. Don't break it principle: an existing VM should not suddenly get interfaces
     renamed if the kernel is upgraded. A corollary is that a lot of current
     userspace code expects eth0. It doesn't look like the first interface would
     be guaranteed to be eth0.

  2. No snowflakes principle: a device driver should follow the current practice
     of other devices. For netvsc, this means VMBus should act like PCI as much
     as possible. Is there another driver doing this already?

  3. Userspace policy principle: Every distribution has its own policy by now.
     The solution must make netvsc work reliably on Red Hat (udev), Ubuntu (netplan),
     and SUSE (YaST). Doing something in the kernel violates #2.

My recommendation would be to take a multi-phase approach:
  1. Expose a persistent value in sysfs now.
  2. Work with udev/netplan/... to use that value.
  3. Make parallel VMBus probing an option, so that once distributions have picked
     up the udev changes they can enable parallel probe. Some will be quick to adopt
     and the enterprise laggards can get to it when they feel the heat.
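
To illustrate steps 1 and 2, here is a minimal userspace sketch, not a proposed
implementation. It assumes the kernel exposes some stable per-device number;
persistent_id, struct nic, name_from_persistent_id() and name_all() are all
hypothetical names made up for this example. The point is only that the name is
a pure function of the persistent value, so the order in which devices are
probed does not matter.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Local stand-in for IFNAMSIZ (16 in <net/if.h>). */
#define SKETCH_IFNAMSIZ 16

/*
 * Hypothetical record. persistent_id stands in for whatever stable value
 * the kernel would expose in sysfs; it is an assumption of this sketch.
 */
struct nic {
	unsigned int persistent_id;
	char name[SKETCH_IFNAMSIZ];
};

/*
 * Build "<prefix><id>" into buf.
 * Returns 0 on success, -1 if the result would not fit in len bytes.
 */
static int name_from_persistent_id(char *buf, size_t len,
				   const char *prefix, unsigned int id)
{
	int n;

	assert(buf != NULL && prefix != NULL);
	n = snprintf(buf, len, "%s%u", prefix, id);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Name every NIC in the array, in whatever order the array happens to be.
 * The array order plays the role of the (unpredictable) probe order.
 */
static int name_all(struct nic *nics, size_t count, const char *prefix)
{
	size_t i;

	for (i = 0; i < count; i++) {
		if (name_from_persistent_id(nics[i].name,
					    sizeof(nics[i].name), prefix,
					    nics[i].persistent_id) != 0)
			return -1;
	}
	return 0;
}
```

Calling name_all() on the same set of devices in two different orders gives each
device the same name both times, which is the property that makes parallel
probing safe to enable. Whether the policy lives in udev, netplan or elsewhere
is up to the distribution.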

Long term wish list (requires host side changes):
   1. The interface index could be a host side property; the host networking
      already has the virtual device table and it is persistent.
   2. The Azure NIC name should be visible as a property in the guest.
      Then userspace could rename the interface based on that property.
      Having multiple disconnected names leads to confusion.
