[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <232327.1715777067@vermin>
Date: Wed, 15 May 2024 14:44:27 +0200
From: Jay Vosburgh <jay.vosburgh@...onical.com>
To: Tony Battersby <tonyb@...ernetics.com>
cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>, Jakub Kicinski <kuba@...nel.org>,
Paolo Abeni <pabeni@...hat.com>,
Andy Gospodarek <andy@...yhouse.net>,
Zhengchao Shao <shaozhengchao@...wei.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net] bonding: fix oops during rmmod
Tony Battersby <tonyb@...ernetics.com> wrote:
>"rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding:
>remove redundant NULL check in debugfs function"). Here are the relevant
>functions being called:
>
>bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
> bonding_debug_root = NULL; <--------- SET TO NULL HERE
> bond_netlink_fini()
> rtnl_link_unregister()
> __rtnl_link_unregister()
> unregister_netdevice_many_notify()
> bond_uninit()
> bond_debug_unregister()
> (commit removed check for bonding_debug_root == NULL)
> debugfs_remove()
> simple_recursive_removal()
> down_write() -> OOPS
>
>However, reverting the bad commit does not solve the problem completely
>because the original code contains a race that could cause the same
>oops, although it was much less likely to be triggered unintentionally:
>
>CPU1
> rmmod bonding
> bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
>
>CPU2
> echo -bond0 > /sys/class/net/bonding_masters
> bond_uninit()
> bond_debug_unregister()
> if (!bonding_debug_root)
>
>CPU1
> bonding_debug_root = NULL;
>
>So do NOT revert the bad commit (since the removed checks were racy
>anyway), and instead change the order of actions taken during module
>removal. The same oops can also happen if there is an error during
>module init, so apply the same fix there.
>
>Fixes: cc317ea3d927 ("bonding: remove redundant NULL check in debugfs function")
>Cc: stable@...r.kernel.org
>Signed-off-by: Tony Battersby <tonyb@...ernetics.com>
Acked-by: Jay Vosburgh <jay.vosburgh@...onical.com>
>---
> drivers/net/bonding/bond_main.c | 13 +++++++------
> 1 file changed, 7 insertions(+), 6 deletions(-)
>
>diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
>index 2c5ed0a7cb18..bceda85f0dcf 100644
>--- a/drivers/net/bonding/bond_main.c
>+++ b/drivers/net/bonding/bond_main.c
>@@ -6477,16 +6477,16 @@ static int __init bonding_init(void)
> if (res)
> goto out;
>
>+ bond_create_debugfs();
>+
> res = register_pernet_subsys(&bond_net_ops);
> if (res)
>- goto out;
>+ goto err_net_ops;
>
> res = bond_netlink_init();
> if (res)
> goto err_link;
>
>- bond_create_debugfs();
>-
> for (i = 0; i < max_bonds; i++) {
> res = bond_create(&init_net, NULL);
> if (res)
>@@ -6501,10 +6501,11 @@ static int __init bonding_init(void)
> out:
> return res;
> err:
>- bond_destroy_debugfs();
> bond_netlink_fini();
> err_link:
> unregister_pernet_subsys(&bond_net_ops);
>+err_net_ops:
>+ bond_destroy_debugfs();
> goto out;
>
> }
>@@ -6513,11 +6514,11 @@ static void __exit bonding_exit(void)
> {
> unregister_netdevice_notifier(&bond_netdev_notifier);
>
>- bond_destroy_debugfs();
>-
> bond_netlink_fini();
> unregister_pernet_subsys(&bond_net_ops);
>
>+ bond_destroy_debugfs();
>+
> #ifdef CONFIG_NET_POLL_CONTROLLER
> /* Make sure we don't have an imbalance on our netpoll blocking */
> WARN_ON(atomic_read(&netpoll_block_tx));
>
>base-commit: a38297e3fb012ddfa7ce0321a7e5a8daeb1872b6
>--
>2.25.1
>
>
Powered by blists - more mailing lists