Message-ID: <20220325144839.7110fc8d@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Fri, 25 Mar 2022 14:48:39 -0700
From: Jakub Kicinski <kuba@...nel.org>
To: Johannes Berg <johannes@...solutions.net>
Cc: William McVicker <willmcvicker@...gle.com>,
linux-wireless@...r.kernel.org,
Marek Szyprowski <m.szyprowski@...sung.com>,
Kalle Valo <kvalo@...eaurora.org>,
"David S. Miller" <davem@...emloft.net>, netdev@...r.kernel.org,
Amitkumar Karwar <amitkarwar@...il.com>,
Xinming Hu <huxinming820@...il.com>, kernel-team@...roid.com,
Paolo Abeni <pabeni@...hat.com>,
Eric Dumazet <edumazet@...gle.com>,
Cong Wang <cwang@...pensource.com>,
Cong Wang <xiyou.wangcong@...il.com>,
"Eric W. Biederman" <ebiederm@...ssion.com>
Subject: Re: [BUG] deadlock in nl80211_vendor_cmd

On Fri, 25 Mar 2022 22:25:05 +0100 Johannes Berg wrote:
> > > With some suitable commentary, that might also be a reasonable thing?
> > > __rtnl_unlock() is actually rather pretty rare, and not exported.
> >
> > The main use for it seems to be re-locking before loading a module,
> > which TBH I have no idea why, is it just a cargo cult or a historical
> > thing :S I don't see how letting netdevs leave before _loading_
> > a module makes any difference whatsoever.
>
> Indeed.
>
> > The WARN_ON() you suggested up front make perfect sense to me.
> > You can also take the definition of net_unlink_todo() out of
> > netdevice.h while at it because o_0
>
> Heh indeed, what?
To be clear - I just meant that it's declaring a static variable in
a header, so I doubt that it'll do the right thing unless it's only
called from one compilation unit.
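
A minimal sketch of that concern, not the actual netdevice.h code: all names here are hypothetical, and the macro only imitates two .c files including the same header, so that the effect can be shown in a single file. Each "unit" ends up with its own private copy of the static variable, so entries queued through one copy are invisible through the other.

```c
#include <stdio.h>

/* Hypothetical stand-in for a header that defines a file-scope static
 * next to an inline helper. Expanding the macro once per "unit" models
 * one .c file including that header. */
#define PRETEND_INCLUDE(unit)                       \
    static int unit##_todo_len;                     \
    static inline void unit##_unlink_todo(void)     \
    {                                               \
        unit##_todo_len++;                          \
    }

PRETEND_INCLUDE(unit_a)
PRETEND_INCLUDE(unit_b)

/* "unit_a" queues two entries through its copy of the helper. */
void unit_a_queue_work(void)
{
    unit_a_unlink_todo();
    unit_a_unlink_todo();
}

/* "unit_b" inspects what it believes is the same list. */
int unit_b_pending(void)
{
    return unit_b_todo_len;
}

/* Print both views; they disagree once unit_a has queued something. */
void report(void)
{
    printf("unit_a sees %d, unit_b sees %d\n",
           unit_a_todo_len, unit_b_pending());
}
```

Calling unit_a_queue_work() and then report() shows the two counters diverging, which is why such a helper can only behave as intended when a single compilation unit uses it.
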
> But (and now I'll CC even more people...) if we can actually have an
> invariant that while RTNL is unlocked the todo list is empty, then we
> also don't need rtnl_lock_unregistering_all(), and can remove the
> netdev_unregistering_wq, etc., no?
>
> IOW, I'm not sure why we needed commit 50624c934db1 ("net: Delay
> default_device_exit_batch until no devices are unregistering v2"), but I
> also have little doubt that we did.
>
> Ah, no. This isn't about locking in this case, it's literally about
> ensuring that free_netdev() has been called in netdev_run_todo()?
Yup, multiple contexts sitting independently in netdev_run_todo() and
chewing on netdevs is slightly different. Destructors of those netdevs
could be pointing at memory of a module being unloaded.
> Which we don't care about in cfg80211 - we just care about the list
> being empty so there's no chance we'll reacquire the RTNL.