[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240515114426.GJ154012@kernel.org>
Date: Wed, 15 May 2024 12:44:26 +0100
From: Simon Horman <horms@...nel.org>
To: Tony Battersby <tonyb@...ernetics.com>
Cc: "David S. Miller" <davem@...emloft.net>,
Eric Dumazet <edumazet@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Paolo Abeni <pabeni@...hat.com>,
Jay Vosburgh <j.vosburgh@...il.com>,
Andy Gospodarek <andy@...yhouse.net>,
Zhengchao Shao <shaozhengchao@...wei.com>, netdev@...r.kernel.org
Subject: Re: [PATCH net] bonding: fix oops during rmmod
On Tue, May 14, 2024 at 03:57:29PM -0400, Tony Battersby wrote:
> "rmmod bonding" causes an oops ever since commit cc317ea3d927 ("bonding:
> remove redundant NULL check in debugfs function"). Here are the relevant
> functions being called:
>
> bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
> bonding_debug_root = NULL; <--------- SET TO NULL HERE
> bond_netlink_fini()
> rtnl_link_unregister()
> __rtnl_link_unregister()
> unregister_netdevice_many_notify()
> bond_uninit()
> bond_debug_unregister()
> (commit removed check for bonding_debug_root == NULL)
> debugfs_remove()
> simple_recursive_removal()
> down_write() -> OOPS
>
> However, reverting the bad commit does not solve the problem completely
> because the original code contains a race that could cause the same
> oops, although it was much less likely to be triggered unintentionally:
>
> CPU1
> rmmod bonding
> bonding_exit()
> bond_destroy_debugfs()
> debugfs_remove_recursive(bonding_debug_root);
>
> CPU2
> echo -bond0 > /sys/class/net/bonding_masters
> bond_uninit()
> bond_debug_unregister()
> if (!bonding_debug_root)
>
> CPU1
> bonding_debug_root = NULL;
>
> So do NOT revert the bad commit (since the removed checks were racy
> anyway), and instead change the order of actions taken during module
> removal. The same oops can also happen if there is an error during
> module init, so apply the same fix there.
>
> Fixes: cc317ea3d927 ("bonding: remove redundant NULL check in debugfs function")
> Cc: stable@...r.kernel.org
> Signed-off-by: Tony Battersby <tonyb@...ernetics.com>
Reviewed-by: Simon Horman <horms@...nel.org>
Powered by blists - more mailing lists