lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20211109064921.6bd65f3c@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date:   Tue, 9 Nov 2021 06:49:21 -0800
From:   Jakub Kicinski <kuba@...nel.org>
To:     Leon Romanovsky <leon@...nel.org>
Cc:     Ido Schimmel <idosch@...sch.org>, Jiri Pirko <jiri@...nulli.us>,
        "David S . Miller" <davem@...emloft.net>,
        Jiri Pirko <jiri@...dia.com>, linux-kernel@...r.kernel.org,
        netdev@...r.kernel.org, edwin.peer@...adcom.com
Subject: Re: [PATCH net-next] devlink: Require devlink lock during device
 reload

On Tue, 9 Nov 2021 16:30:16 +0200 Leon Romanovsky wrote:
> On Tue, Nov 09, 2021 at 06:17:29AM -0800, Jakub Kicinski wrote:
> > On Tue, 9 Nov 2021 16:12:33 +0200 Leon Romanovsky wrote:  
> > > RDMA subsystem supports two net namespace aware scenarios.
> > > 
> > > We need global netdev_notifier for shared mode. This is legacy mode where
> > > we listen to all namespaces. We must support this mode otherwise we break
> > > whole RDMA world.
> > > 
> > > See commit below:
> > > de641d74fb00 ("Revert "RDMA/mlx5: Fix devlink deadlock on net namespace deletion"")  
> > 
> > But why re-reg? To take advantage of clean event replay?
> > 
> > IIUC the problem is that the un-reg is called from the reload_down path.  
> 
> This is how devlink reload was explained to me by Jiri. It suppose to
> unload and load whole driver again. In our case, it unloads mlx5_core,
> which destroys all ULPs (VDPA, RDMA, e.t.c) and these ULPs are not aware
> of devlink at all. Especially they are not aware of the reason of
> devlink reload.

Anything registering the global notifier outside of init_net needs
scrutiny. If the devlink instance was moved into namespace A then
paying attention to anything outside namespace A weakens the namespace
abstraction. init_net is legacy/special and it can't disappear so the
problem at hand won't happen (as it is triggered on namespace cleanup).

Maybe you can install the global notifier only if instance is in
init_net? That's my thinking with the scant information at hand.

> This is why I asked you. It is not related to devlink locking directly,
> but indirectly I didn't want to work on anything related to locking
> without making sure that devlink.c is correct.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ