Message-ID: <ebf7bce8-0856-2a07-0d29-edbcd1b76942@gmail.com>
Date: Wed, 13 Sep 2023 10:25:18 -0700
From: Florian Fainelli <f.fainelli@...il.com>
To: Guenter Roeck <linux@...ck-us.net>,
Johannes Berg <johannes@...solutions.net>,
Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>,
Heyi Guo <guoheyi@...ux.alibaba.com>
Cc: Lai Jiangshan <jiangshanlai@...il.com>, Tejun Heo <tj@...nel.org>,
Hillf Danton <hdanton@...a.com>,
LKML <linux-kernel@...r.kernel.org>,
Heyi Guo <guoheyi@...ux.alibaba.com>, netdev@...r.kernel.org
Subject: Re: [PATCH v3] workqueue: don't skip lockdep work dependency in
cancel_work_sync()
On 9/13/23 08:59, Guenter Roeck wrote:
> On 9/13/23 07:41, Johannes Berg wrote:
>> Hi Guenter,
>>
>>> This patch results in the attached lockdep splat when running the
>>> ast2600-evb emulation in qemu with aspeed_g5_defconfig and lock
>>> debugging
>>> enabled. Reverting this patch fixes the problem.
>>
>> Umm ... That's only true if you think the problem is the lockdep splat,
>> rather than the actual potential deadlock?!
>>
>
> It was hard for me to say because the workqueue lock doesn't exist
> in the first place if lockdep debugging is not enabled.
>
>>> [ 9.809960] ======================================================
>>> [ 9.810053] WARNING: possible circular locking dependency detected
>>> [ 9.810196] 6.6.0-rc1-00004-g6faca50f629f #1 Tainted: G N
>>
>> I don't have this exact tree, but on 6.6-rc1,
>>
>
> Meh, I just included a couple of bug fixes not yet available in 6.6-rc1.
>
>>> [ 9.810327] ------------------------------------------------------
>>> [ 9.810406] ip/357 is trying to acquire lock:
>>> [ 9.810501] 83af6c40 ((work_completion)(&(&dev->state_queue)->work)){+.+.}-{0:0}, at: __flush_work+0x40/0x550
>>> [ 9.811052]
>>> [ 9.811052] but task is already holding lock:
>>> [ 9.811133] 81639924 (rtnl_mutex){+.+.}-{3:3}, at: rtnetlink_rcv_msg+0x124/0x514
>>> [ 9.811264]
>>> [ 9.811264] which lock already depends on the new lock.
>>> [ 9.811264]
>>> [ 9.811361]
>>> [ 9.811361] the existing dependency chain (in reverse order) is:
>>> [ 9.811466]
>>> [ 9.811466] -> #1 (rtnl_mutex){+.+.}-{3:3}:
>>> [ 9.811616] lock_acquire+0xfc/0x368
>>> [ 9.811717] __mutex_lock+0x90/0xf00
>>> [ 9.811782] mutex_lock_nested+0x24/0x2c
>>> [ 9.811845] ftgmac100_reset+0x1c/0x1dc
>>
>>
>> This does indeed take the RTNL:
>>
>> static void ftgmac100_reset(struct ftgmac100 *priv)
>> {
>> 	struct net_device *netdev = priv->netdev;
>> 	int err;
>>
>> 	netdev_dbg(netdev, "Resetting NIC...\n");
>>
>> 	/* Lock the world */
>> 	rtnl_lock();
>>
>> and is called from
>>
>>> [ 9.811907] ftgmac100_adjust_link+0xc0/0x13c
>>> [ 9.811972] phy_link_change+0x30/0x5c
>>> [ 9.812035] phy_check_link_status+0x9c/0x11c
>>> [ 9.812100] phy_state_machine+0x1c0/0x2c0
>>
>> this work (phy_state_machine is the function), which
>>
>>> [ 9.812405] -> #0 ((work_completion)(&(&dev->state_queue)->work)){+.+.}-{0:0}:
>>> [ 9.812531] check_prev_add+0x128/0x15ec
>>> [ 9.812594] __lock_acquire+0x16ec/0x20cc
>>> [ 9.812656] lock_acquire+0xfc/0x368
>>> [ 9.812712] __flush_work+0x70/0x550
>>> [ 9.812769] __cancel_work_timer+0x1e4/0x264
>>> [ 9.812833] phy_stop+0x78/0x128
>>
>> is cancelled by phy_stop() in phy_stop_machine():
>>
>> void phy_stop_machine(struct phy_device *phydev)
>> {
>> 	cancel_delayed_work_sync(&phydev->state_queue);
>>
>> but of course that's called by the driver under RTNL:
>>
>>> [ 9.812889] ftgmac100_stop+0x5c/0xac
>>> [ 9.812949] __dev_close_many+0xb8/0x140
>>
>> (__dev_close_many requires RTNL)
>>
>>
>> So you have a potential deadlock in this driver. Yes, workqueue items
>> and RTNL are basically incompatible. Don't do that. Now this bug was
>> _probably_ added by commit 1baf2e50e48f ("drivers/net/ftgmac100: fix
>> DHCP potential failure with systemd"), which added a call to
>> ftgmac100_reset() in ftgmac100_adjust_link(), which is the thing called
>> from the PHY state machine in the first place.
>>
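>> Spelled out as a minimal sketch (names made up, not the actual driver
>> code, just the shape of the problem):
>>
>> 	/* the work item: */
>> 	static void state_work_fn(struct work_struct *work)
>> 	{
>> 		rtnl_lock();	/* (A) the work waits for RTNL */
>> 		/* ... */
>> 		rtnl_unlock();
>> 	}
>>
>> 	/* elsewhere, with RTNL already held: */
>> 	cancel_delayed_work_sync(&state_dwork);	/* (B) waits for the work */
>>
>> If the running work is blocked at (A) while the RTNL holder is blocked
>> at (B), neither side can make progress.
>>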
>> Should that be reverted? I don't know ... maybe it can be fixed
>> differently.
>>
>>
>> But anyway ... as far as lockdep/workqueue stuff is concerned I'd
>> definitely call it a win rather than a bug! Yay for making lockdep
>> useful - it found a deadlock situation for you! :-) No need to blame
>> lockdep for that :P
>>
>
> So you are saying that anything running in a workqueue must not
> acquire rtnl_lock because cancel_[delayed_]work_sync() may be called
> under rtnl_lock.
>
> Fair point, but is that documented somewhere? If not, how is anyone
> supposed to know? If it is not documented, I might as well argue that
> cancel_[delayed_]work_sync() should not be called with rtnl_lock held
> because some worker might hold that lock.
>
> FWIW, it would be nice if the lockdep code would generate some other
> message in this situation. Complaining about a deadlock involving a
> lock that doesn't exist if lock debugging isn't enabled is not really
> helpful and, yes, may lead reporters to falsely assume that this
> lock is responsible for the potential deadlock.
>
> Reverting 1baf2e50e48f does fix the problem as well.
I would refrain from reverting without giving the author a fighting
chance to address it. It seems a bit strange that we do this locking
dance when it might have been simpler to introduce
ftgmac100_reset_unlocked() and ftgmac100_reset() and call each at the
intended places, something like the completely untested patch attached,
maybe?
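
Completely untested sketch of the split I have in mind (the actual
reset body elided, names illustrative):

static void ftgmac100_reset_unlocked(struct ftgmac100 *priv)
{
	struct net_device *netdev = priv->netdev;

	/* Caller is expected to hold RTNL */
	ASSERT_RTNL();

	netdev_dbg(netdev, "Resetting NIC...\n");

	/* ... existing reset body, without taking RTNL itself ... */
}

static void ftgmac100_reset(struct ftgmac100 *priv)
{
	rtnl_lock();
	ftgmac100_reset_unlocked(priv);
	rtnl_unlock();
}
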
--
Florian
View attachment "1.patch" of type "text/x-patch" (1905 bytes)