Message-ID: <20220211140620.6a4fc338@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
Date: Fri, 11 Feb 2022 14:06:20 -0800
From: Jakub Kicinski <kuba@...nel.org>
To: Xin Long <lucien.xin@...il.com>
Cc: Eric Dumazet <edumazet@...gle.com>,
network dev <netdev@...r.kernel.org>,
davem <davem@...emloft.net>,
Ziyang Xuan <william.xuanziyang@...wei.com>
Subject: Re: [PATCH net 2/2] vlan: move dev_put into vlan_dev_uninit
On Fri, 11 Feb 2022 15:58:56 +0800 Xin Long wrote:
> On Thu, Feb 10, 2022 at 1:59 PM Jakub Kicinski <kuba@...nel.org> wrote:
> > > This is doable, and risky ;)
> > >
> > > BTW, I have a plan to generalize blackhole_netdev for IPv6,
> > > meaning we could perhaps get rid of the dependency on the
> > > loopback dev being the last device in a netns to be dismantled.
> >
> > Oh, I see..
> >
> > I have no great ideas then, we may need to go back to zeroing
> > vlan->real_dev and making sure the caller can deal with that.
> > At least for the time being. Xin this was discussed at some
> > length in response to the patch under Fixes.
>
> What if vlan->real_dev is freed and zeroed *after* vlan_dev_real_dev()
> returns? This issue can still be triggered, right? I don't see any lock
> protecting this.
Maybe the suggestion in the old thread was to NULL the pointer out
before unregister is called. Which seems like a bad idea, as the
netdev would already be impaired when unregister is called.
> > Feels like sooner or later we'll run into a scenario when reversing will
> > cause a problem. Or some data structure will stop preserving the order.
> I was checking a few places doing such batch device freeing, and noticed that:
> In rtnl_group_dellink() and __rtnl_kill_links(), it's using for_each_netdev(),
> while in default_device_exit_batch(), it's using for_each_netdev_reverse().
>
> Shouldn't all these places use the same order? If so, which one is the
> right one to use?
I don't know. Maybe this will work, maybe it will cause a circular
dependency with something else.
Honestly, I don't have a simple solution to offer. Jann Horn pointed
out recently that our per-CPU netdev refs are themselves racy...