[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <146eabfe-123c-4970-901e-e961b4c09bc3@nvidia.com>
Date: Thu, 16 Jan 2025 00:16:27 +0200
From: Yael Chemla <ychemla@...dia.com>
To: Kuniyuki Iwashima <kuniyu@...zon.com>,
"David S. Miller" <davem@...emloft.net>, Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Simon Horman <horms@...nel.org>
Cc: Kuniyuki Iwashima <kuni1840@...il.com>, netdev@...r.kernel.org
Subject: Re: [PATCH v1 net-next 4/4] net: Hold rtnl_net_lock() in
(un)?register_netdevice_notifier_dev_net().
we observed in our regression tests the following issue:
BUG: KASAN: slab-use-after-free in notifier_call_chain+0x22c/0x280
kasan_report+0xbd/0xf0
RIP: 0033:0x7f70839018b7
kasan_save_stack+0x1c/0x40
kasan_save_track+0x10/0x30
__kasan_kmalloc+0x83/0x90
kasan_save_stack+0x1c/0x40
kasan_save_track+0x10/0x30
kasan_save_free_info+0x37/0x50
__kasan_slab_free+0x33/0x40
page dumped because: kasan: bad access detected
BUG: KASAN: slab-use-after-free in notifier_call_chain+0x222/0x280
kasan_report+0xbd/0xf0
RIP: 0033:0x7f70839018b7
kasan_save_stack+0x1c/0x40
kasan_save_track+0x10/0x30
__kasan_kmalloc+0x83/0x90
kasan_save_stack+0x1c/0x40
kasan_save_track+0x10/0x30
kasan_save_free_info+0x37/0x50
__kasan_slab_free+0x33/0x40
page dumped because: kasan: bad access detected
and there are many more of that kind.
it happens after applying commit 7fb1073300a2 ("net: Hold
rtnl_net_lock() in (un)?register_netdevice_notifier_dev_net()")
test scenario includes configuration and traffic over two namespaces
associated with two different VFs.
On 04/01/2025 8:37, Kuniyuki Iwashima wrote:
> (un)?register_netdevice_notifier_dev_net() hold RTNL before triggering
> the notifier for all netdev in the netns.
>
> Let's convert the RTNL to rtnl_net_lock().
>
> Note that move_netdevice_notifiers_dev_net() is assumed to be (but not
> yet) protected by per-netns RTNL of both src and dst netns; we need to
> convert wireless and hyperv drivers that call dev_change_net_namespace().
>
> Signed-off-by: Kuniyuki Iwashima <kuniyu@...zon.com>
> ---
> net/core/dev.c | 16 ++++++++++------
> 1 file changed, 10 insertions(+), 6 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index f6c6559e2548..a0dd34463901 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -1943,15 +1943,17 @@ int register_netdevice_notifier_dev_net(struct net_device *dev,
> struct notifier_block *nb,
> struct netdev_net_notifier *nn)
> {
> + struct net *net = dev_net(dev);
it seems to happen since the net pointer is acquired here without a lock.
Note that KASAN issue is not triggered when executing with rtnl_lock()
taken before this line. and our kernel .config expands
rtnl_net_lock(net) to rtnl_lock() (CONFIG_DEBUG_NET_SMALL_RTNL is not set).
> int err;
>
> - rtnl_lock();
> - err = __register_netdevice_notifier_net(dev_net(dev), nb, false);
> + rtnl_net_lock(net);
> + err = __register_netdevice_notifier_net(net, nb, false);
> if (!err) {
> nn->nb = nb;
> list_add(&nn->list, &dev->net_notifier_list);
> }
> - rtnl_unlock();
> + rtnl_net_unlock(net);
> +
> return err;
> }
> EXPORT_SYMBOL(register_netdevice_notifier_dev_net);
> @@ -1960,12 +1962,14 @@ int unregister_netdevice_notifier_dev_net(struct net_device *dev,
> struct notifier_block *nb,
> struct netdev_net_notifier *nn)
> {
> + struct net *net = dev_net(dev);
> int err;
>
> - rtnl_lock();
> + rtnl_net_lock(net);
> list_del(&nn->list);
> - err = __unregister_netdevice_notifier_net(dev_net(dev), nb);
> - rtnl_unlock();
> + err = __unregister_netdevice_notifier_net(net, nb);
> + rtnl_net_unlock(net);
> +
> return err;
> }
> EXPORT_SYMBOL(unregister_netdevice_notifier_dev_net);
Powered by blists - more mailing lists