Message-ID: <KU1P153MB012097D6AA971EC957D854B2BF2B0@KU1P153MB0120.APCP153.PROD.OUTLOOK.COM>
Date:   Sun, 6 Sep 2020 03:05:48 +0000
From:   Dexuan Cui <decui@...rosoft.com>
To:     Jakub Kicinski <kuba@...nel.org>
CC:     "wei.liu@...nel.org" <wei.liu@...nel.org>,
        KY Srinivasan <kys@...rosoft.com>,
        Haiyang Zhang <haiyangz@...rosoft.com>,
        Stephen Hemminger <sthemmin@...rosoft.com>,
        "davem@...emloft.net" <davem@...emloft.net>,
        "linux-hyperv@...r.kernel.org" <linux-hyperv@...r.kernel.org>,
        "netdev@...r.kernel.org" <netdev@...r.kernel.org>,
        "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
        Michael Kelley <mikelley@...rosoft.com>
Subject: RE: [PATCH net] hv_netvsc: Fix hibernation for mlx5 VF driver

> From: Jakub Kicinski <kuba@...nel.org>
> Sent: Saturday, September 5, 2020 4:27 PM
> [...]
> On Fri,  4 Sep 2020 19:52:18 -0700 Dexuan Cui wrote:
> > mlx5_suspend()/resume() keep the network interface, so during hibernation
> > netvsc_unregister_vf() and netvsc_register_vf() are not called, and hence
> > netvsc_resume() should call netvsc_vf_changed() to switch the data path
> > back to the VF after hibernation.
> 
> Does suspending the system automatically switch back to the synthetic
> datapath? 
Yes. 

For mlx4, since the VF network interface is explicitly destroyed and re-created
during hibernation (i.e. suspend + resume), hv_netvsc explicitly switches the
data path from and to the VF.

For mlx5, the VF network interface persists across hibernation, so there is no
explicit switch-over, but after we close and re-open the vmbus channel of
the netvsc NIC in netvsc_suspend() and netvsc_resume(), the data path is
implicitly switched to the netvsc NIC, and with this patch netvsc_resume() ->
netvsc_vf_changed() switches the data path back to the mlx5 NIC.
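
To illustrate the mlx5 case, here is a toy userspace model of the data path
across hibernation. It is only a sketch and not driver code: the toy_* names,
the datapath enum and the "patched" flag are made up for illustration, and
toy_vf_changed() merely stands in for netvsc_vf_changed().

```c
#include <stdbool.h>

/* Toy userspace model; none of these names are the driver's real API. */
enum datapath { DP_SYNTHETIC, DP_VF };

struct toy_netvsc {
	bool has_vf;		/* a VF netdev is attached (mlx5: it persists) */
	enum datapath dp;	/* which NIC currently carries traffic */
};

/* Closing the vmbus channel implicitly drops back to the synthetic path. */
static void toy_suspend(struct toy_netvsc *n)
{
	n->dp = DP_SYNTHETIC;
}

/* Stand-in for netvsc_vf_changed(): returns true on success. */
static bool toy_vf_changed(struct toy_netvsc *n)
{
	if (!n->has_vf)
		return false;
	n->dp = DP_VF;
	return true;
}

/*
 * Re-open the channel; with the patch applied, also switch back to the VF.
 * Returns 0 on success, -1 (standing in for -EINVAL) if the switch fails.
 */
static int toy_resume(struct toy_netvsc *n, bool patched)
{
	n->dp = DP_SYNTHETIC;	/* re-opened channel starts on the synthetic path */
	if (patched && n->has_vf && !toy_vf_changed(n))
		return -1;
	return 0;
}
```

In this model, without the patch the data path stays on the synthetic NIC
after resume even though the VF is still present; with the patch it ends up
back on the VF.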

> Please clarify this in the commit message and/or add a code
> comment.
I will clarify this in the commit message and add a code comment.
 
> > @@ -2587,7 +2587,7 @@ static int netvsc_remove(struct hv_device *dev)
> >  static int netvsc_suspend(struct hv_device *dev)
> >  {
> >  	struct net_device_context *ndev_ctx;
> > -	struct net_device *vf_netdev, *net;
> > +	struct net_device *net;
> >  	struct netvsc_device *nvdev;
> >  	int ret;
> 
> Please keep reverse xmas tree variable ordering.

Will do.
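
For reference, a sketch of what that ordering could look like for the locals
left in netvsc_suspend() once vf_netdev is removed: longest declaration line
first, shortest last. The struct names come from the quoted hunk; the wrapper
function and the initializers are hypothetical and exist only so that the
fragment compiles on its own.

```c
/* Opaque stand-ins for the kernel types; only pointers are declared. */
struct net_device_context;
struct netvsc_device;
struct net_device;

/* Hypothetical helper, not part of the patch: shows declaration order only. */
static int suspend_decl_order_demo(void)
{
	struct net_device_context *ndev_ctx = 0;	/* longest line first */
	struct netvsc_device *nvdev = 0;
	struct net_device *net = 0;
	int ret = 0;					/* shortest line last */

	(void)ndev_ctx;
	(void)nvdev;
	(void)net;
	return ret;
}
```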

> > @@ -2635,6 +2632,10 @@ static int netvsc_resume(struct hv_device *dev)
> >  	netvsc_devinfo_put(device_info);
> >  	net_device_ctx->saved_netvsc_dev_info = NULL;
> >
> > +	vf_netdev = rtnl_dereference(net_device_ctx->vf_netdev);
> > +	if (vf_netdev && netvsc_vf_changed(vf_netdev) != NOTIFY_OK)
> > +		ret = -EINVAL;
> 
> Should you perhaps remove the VF in case of the failure?
IMO this failure actually should not happen since we're resuming the netvsc
NIC, so we're sure we have a valid pointer to the netvsc net device, and
netvsc_vf_changed() should be able to find the netvsc pointer and return
NOTIFY_OK. In case of a failure, something really bad must be happening,
and I'm not sure if it's safe to simply remove the VF, so I just return
-EINVAL for simplicity, since I believe the failure should not happen in practice.

I would rather keep the code as-is, but I'm OK with adding a WARN_ON(1) if you
think that's necessary.

Thanks,
-- Dexuan
