netdev - Re: [PATCH v1 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CANn89iJzav0za=tJq7MvXpEXYNFY_+1D6==w2jbKd0-0mhKO4g@mail.gmail.com>
Date: Fri, 31 Jan 2025 05:33:49 +0100
From: Eric Dumazet <edumazet@...gle.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>
Cc: "David S. Miller" <davem@...emloft.net>, Jakub Kicinski <kuba@...nel.org>, 
	Paolo Abeni <pabeni@...hat.com>, Simon Horman <horms@...nel.org>, 
	Kuniyuki Iwashima <kuni1840@...il.com>, netdev@...r.kernel.org, 
	Yael Chemla <ychemla@...dia.com>
Subject: Re: [PATCH v1 net 1/2] net: Fix dev_net(dev) race in unregister_netdevice_notifier_dev_net().

On Fri, Jan 31, 2025 at 12:25 AM Kuniyuki Iwashima <kuniyu@...zon.com> wrote:
>
> After the cited commit, dev_net(dev) is fetched before holding RTNL
> and passed to __unregister_netdevice_notifier_net().
>
> However, dev_net(dev) might be different after holding RTNL.
>
> In the reported case [0], while removing a VF device, its netns was
> being dismantled and the VF was moved to init_net.
>
> So the following sequence is basically illegal when dev was fetched
> without lookup:
>
>   net = dev_net(dev);
>   rtnl_net_lock(net);
>
> Let's use a new helper rtnl_net_dev_lock() to fix the race.
>
> It calls maybe_get_net() for dev_net(dev) and checks dev_net(dev)
> before/after rtnl_net_lock().
>
> The dev_net(dev) pointer itself is valid, thanks to RCU API, but the

I am pretty sure dev_net(net) is not always called under rcu_read_lock().

And RTNL would not help in the future either.

> netns might be being dismantled.  maybe_get_net() is to avoid the race.
> This can be done by holding pernet_ops_rwsem, but it will be overkill.
>
> [0]:

> Fixes: 7fb1073300a2 ("net: Hold rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net().")
> Reported-by: Yael Chemla <ychemla@...dia.com>
> Closes: https://lore.kernel.org/netdev/146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com/
> Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> Tested-by: Yael Chemla <ychemla@...dia.com>
> ---
>  net/core/dev.c | 59 +++++++++++++++++++++++++++++++++++++++-----------
>  1 file changed, 46 insertions(+), 13 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index c0021cbd28fc..f91ddb7f8bdf 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -2070,6 +2070,47 @@ static void __move_netdevice_notifier_net(struct net *src_net,
>         __register_netdevice_notifier_net(dst_net, nb, true);
>  }
>
> +static bool from_cleanup_net(void)
> +{
> +#ifdef CONFIG_NET_NS
> +       return current == cleanup_net_task;
> +#else
> +       return false;
> +#endif
> +}
> +
> +static void rtnl_net_dev_lock(struct net_device *dev)
> +{
> +       struct net *net;
> +
> +       DEBUG_NET_WARN_ON_ONCE(from_cleanup_net());
> +again:
> +       /* netns might be being dismantled. */

rcu_read_lock();

> +       net = maybe_get_net(dev_net(dev));

rcu_read_unlock();



> +       if (!net) {
> +               cond_resched();

cond_resched() can be a NOP on some kernel builds.

This loop might burn a lot of cpu cycles.

Perhaps msleep(1) instead.

> +               goto again;
> +       }
> +


> +       rtnl_net_lock(net);
> +
> +       /* dev might have been moved to another netns. */
> +       if (!net_eq(net, dev_net(dev))) {
> +               rtnl_net_unlock(net);
> +               put_net(net);
> +               cond_resched();

      This cond_resched() seems unnecessary, the net change can not
occur more than once per rcu grace period (an eternity)

> +               goto again;
> +       }
> +}