[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <MN2PR21MB1375505C4C5E132B161FB4BFCA260@MN2PR21MB1375.namprd21.prod.outlook.com>
Date: Tue, 31 Dec 2019 17:51:43 +0000
From: Haiyang Zhang <haiyangz@...rosoft.com>
To: Stephen Hemminger <stephen@...workplumber.org>
CC: Roman Kagan <rkagan@...tuozzo.com>,
"sashal@...nel.org" <sashal@...nel.org>,
"linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
"netdev@...r.kernel.org" <netdev@...r.kernel.org>,
KY Srinivasan <kys@...rosoft.com>,
Stephen Hemminger <sthemmin@...rosoft.com>,
"olaf@...fle.de" <olaf@...fle.de>, vkuznets <vkuznets@...hat.com>,
"davem@...emloft.net" <davem@...emloft.net>
Subject: RE: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
offer sequence and use async probe
> -----Original Message-----
> From: linux-hyperv-owner@...r.kernel.org <linux-hyperv-
> owner@...r.kernel.org> On Behalf Of Stephen Hemminger
> Sent: Tuesday, December 31, 2019 12:36 PM
> To: Haiyang Zhang <haiyangz@...rosoft.com>
> Cc: Roman Kagan <rkagan@...tuozzo.com>; sashal@...nel.org; linux-
> hyperv@...r.kernel.org; netdev@...r.kernel.org; KY Srinivasan
> <kys@...rosoft.com>; Stephen Hemminger <sthemmin@...rosoft.com>;
> olaf@...fle.de; vkuznets <vkuznets@...hat.com>; davem@...emloft.net
> Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on vmbus
> offer sequence and use async probe
>
> On Tue, 31 Dec 2019 16:12:36 +0000
> Haiyang Zhang <haiyangz@...rosoft.com> wrote:
>
> > > -----Original Message-----
> > > From: Roman Kagan <rkagan@...tuozzo.com>
> > > Sent: Tuesday, December 31, 2019 6:35 AM
> > > To: Haiyang Zhang <haiyangz@...rosoft.com>
> > > Cc: sashal@...nel.org; linux-hyperv@...r.kernel.org;
> netdev@...r.kernel.org;
> > > KY Srinivasan <kys@...rosoft.com>; Stephen Hemminger
> > > <sthemmin@...rosoft.com>; olaf@...fle.de; vkuznets
> > > <vkuznets@...hat.com>; davem@...emloft.net; linux-
> kernel@...r.kernel.org
> > > Subject: Re: [PATCH V2,net-next, 3/3] hv_netvsc: Name NICs based on
> vmbus
> > > offer sequence and use async probe
> > >
> > > On Mon, Dec 30, 2019 at 12:13:34PM -0800, Haiyang Zhang wrote:
> > > > The dev_num field in vmbus channel structure is assigned to the first
> > > > available number when the channel is offered. So netvsc driver uses it
> > > > for NIC naming based on channel offer sequence. Now re-enable the
> > > > async probing mode for faster probing.
> > > >
> > > > Signed-off-by: Haiyang Zhang <haiyangz@...rosoft.com>
> > > > ---
> > > > drivers/net/hyperv/netvsc_drv.c | 18 +++++++++++++++---
> > > > 1 file changed, 15 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/drivers/net/hyperv/netvsc_drv.c
> > > > b/drivers/net/hyperv/netvsc_drv.c index f3f9eb8..39c412f 100644
> > > > --- a/drivers/net/hyperv/netvsc_drv.c
> > > > +++ b/drivers/net/hyperv/netvsc_drv.c
> > > > @@ -2267,10 +2267,14 @@ static int netvsc_probe(struct hv_device *dev,
> > > > struct net_device_context *net_device_ctx;
> > > > struct netvsc_device_info *device_info = NULL;
> > > > struct netvsc_device *nvdev;
> > > > + char name[IFNAMSIZ];
> > > > int ret = -ENOMEM;
> > > >
> > > > - net = alloc_etherdev_mq(sizeof(struct net_device_context),
> > > > - VRSS_CHANNEL_MAX);
> > > > + snprintf(name, IFNAMSIZ, "eth%d", dev->channel->dev_num);
> > >
> > > How is this supposed to work when there are other ethernet device types on
> the
> > > system, which may claim the same device names?
> > >
> > > > + net = alloc_netdev_mqs(sizeof(struct net_device_context), name,
> > > > + NET_NAME_ENUM, ether_setup,
> > > > + VRSS_CHANNEL_MAX, VRSS_CHANNEL_MAX);
> > > > +
> > > > if (!net)
> > > > goto no_net;
> > > >
> > > > @@ -2355,6 +2359,14 @@ static int netvsc_probe(struct hv_device *dev,
> > > > net->max_mtu = ETH_DATA_LEN;
> > > >
> > > > ret = register_netdevice(net);
> > > > +
> > > > + if (ret == -EEXIST) {
> > > > + pr_info("NIC name %s exists, request another name.\n",
> > > > + net->name);
> > > > + strlcpy(net->name, "eth%d", IFNAMSIZ);
> > > > + ret = register_netdevice(net);
> > > > + }
> > >
> > > IOW you want the device naming to be predictable, but don't guarantee this?
> > >
> > > I think the problem this patchset is trying to solve is much better solved with
> a
> > > udev rule, similar to how it's done for PCI net devices.
> > > And IMO the primary channel number, being a device's "hardware"
> > > property, is more suited to be used in the device name, than this completely
> > > ephemeral device number.
> >
> > The vmbus number can be affected by other types of devices and/or
> subchannel
> > offerings. They are not stable either. That's why this patch set keeps track of
> the
> > offering sequence within the same device type in a new variable "dev_num".
> >
> > As in my earlier email, to avoid impact by other types of NICs, we should put
> them
> > into different naming formats, like "vf*", "enP*", etc. And yes, these can be
> done in
> > udev.
> >
> > But for netvsc (synthetic) NICs, we still want the default naming format "eth*".
> And
> > the variable "dev_num" gives them the basis for stable naming with Async
> probing.
> >
> > Thanks,
> > - Haiyang
> >
>
> The primary requirements for network naming are:
> 1. Network names must be repeatable on each boot. This was the original
> problem
> that PCI devices discovered back years ago when parallel probing was
> enabled.
> 2. Network names must be predictable. If new VM is created, the names
> should
> match a similar VM config.
> 3. Names must be persistent. If a NIC is added or deleted, the existing names
> must not change.
>
> The other things which are important (and this proposal breaks):
> 1. Don't break it principle: an existing VM should not suddenly get interfaces
> renamed if kernel is upgraded. A corrallary is that a lot of current userspace
> code expects eth0. It doesn't look like first interface would be guaranteed
> to be eth0.
>
> 2. No snowflakes principle: a device driver should follow the current practice
> of other devices. For netvsc, this means VMBus should act like PCI as much
> as possible. Is there another driver doing this already?
>
> 3. Userspace policy principle: Every distribution has its own policy by now.
> The solution must make netvsc work reliably on Redhat (udev), Ubuntu
> (netplan), SuSE (Yast)
> doing something in the kernel violates #2.
>
> My recommendation would be to take a multi-phase approach:
> 1. Expose persistent value in sysfs now.
> 2. Work with udev/netplan/... to use that value.
> 3. Make parallel VMBus probing an option. So that when distributions have
> picked up
> the udev changes they can enable parallel probe. Some will be quick to
> adopt
> and the enterprise laggards can get to it when they feel the heat.
>
> Long term wish list (requires host side changes):
> 1. The interface index could be a host side property; the host networking
> already has the virtual device table and it is persistent.
> 2. The Azure NIC name should be visible as a property in guest.
> Then userspace could do rename based on that property.
> Having multiple disconnected names is leads to confusion.
Thank you for the detailed recommendations!
I will do the step 1. Expose persistent value in sysfs now.
And work on other changes in the future.
Thanks,
- Haiyang
Powered by blists - more mailing lists