[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aEFPRWErB4QkbMkt@hovoldconsulting.com>
Date: Thu, 5 Jun 2025 10:03:17 +0200
From: Johan Hovold <johan@...nel.org>
To: Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>
Cc: Vinod Koul <vkoul@...nel.org>,
Kishon Vijay Abraham I <kishon@...nel.org>,
Abel Vesa <abel.vesa@...aro.org>,
Neil Armstrong <neil.armstrong@...aro.org>,
linux-arm-msm@...r.kernel.org, linux-phy@...ts.infradead.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] phy: use per-PHY lockdep keys
On Fri, May 30, 2025 at 07:08:28PM +0300, Dmitry Baryshkov wrote:
> If the PHY driver uses another PHY internally (e.g. in case of eUSB2,
> repeaters are represented as PHYs), then it would trigger the following
> lockdep splat because all PHYs use a single static lockdep key and thus
> lockdep can not identify whether there is a dependency or not and
> reports a false positive.
>
> Make PHY subsystem use dynamic lockdep keys, assigning each driver a
> separate key. This way lockdep can correctly identify dependency graph
> between mutexes.
>
> ============================================
> WARNING: possible recursive locking detected
> 6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 Not tainted
> --------------------------------------------
> kworker/u51:0/78 is trying to acquire lock:
> ffff0008116554f0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> but task is already holding lock:
> ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> other info that might help us debug this:
> Possible unsafe locking scenario:
>
> CPU0
> ----
> lock(&phy->mutex);
> lock(&phy->mutex);
>
> *** DEADLOCK ***
>
> May be due to missing lock nesting notation
>
> 4 locks held by kworker/u51:0/78:
> #0: ffff000800010948 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work+0x18c/0x5ec
> #1: ffff80008036bdb0 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work+0x1b4/0x5ec
> #2: ffff0008094ac8f8 (&dev->mutex){....}-{4:4}, at: __device_attach+0x38/0x188
> #3: ffff000813c10cf0 (&phy->mutex){+.+.}-{4:4}, at: phy_init+0x4c/0x12c
>
> stack backtrace:
> CPU: 0 UID: 0 PID: 78 Comm: kworker/u51:0 Not tainted 6.15.0-rc7-next-20250522-12896-g3932f283970c #3455 PREEMPT
> Hardware name: Qualcomm CRD, BIOS 6.0.240904.BOOT.MXF.2.4-00528.1-HAMOA-1 09/ 4/2024
> Workqueue: events_unbound deferred_probe_work_func
> Call trace:
> show_stack+0x18/0x24 (C)
> dump_stack_lvl+0x90/0xd0
> dump_stack+0x18/0x24
> print_deadlock_bug+0x258/0x348
> __lock_acquire+0x10fc/0x1f84
> lock_acquire+0x1c8/0x338
> __mutex_lock+0xb8/0x59c
> mutex_lock_nested+0x24/0x30
> phy_init+0x4c/0x12c
> snps_eusb2_hsphy_init+0x54/0x1a0
> phy_init+0xe0/0x12c
> dwc3_core_init+0x450/0x10b4
> dwc3_core_probe+0xce4/0x15fc
> dwc3_probe+0x64/0xb0
> platform_probe+0x68/0xc4
> really_probe+0xbc/0x298
> __driver_probe_device+0x78/0x12c
> driver_probe_device+0x3c/0x160
> __device_attach_driver+0xb8/0x138
> bus_for_each_drv+0x84/0xe0
> __device_attach+0x9c/0x188
> device_initial_probe+0x14/0x20
> bus_probe_device+0xac/0xb0
> deferred_probe_work_func+0x8c/0xc8
> process_one_work+0x208/0x5ec
> worker_thread+0x1c0/0x368
> kthread+0x14c/0x20c
> ret_from_fork+0x10/0x20
Nit: This last bit of the stack trace adds little value and can be
dropped.
> Fixes: 3584f6392f09 ("phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater")
> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>
> ---
> Note: I've used a Fixes tag pointing to the commit which actually
> started using nested PHYs. If you think that it's incorrect, I'm fine
> with dropping it.
I think it's warranted. And if there were further users before this one
as Neil suggested you could just list them all as each has been
introducing a new splat.
> Note2: I've tried using mutex_lock_nested, however that didn't play
> well. We can not store nest level in the struct phy (as it can be used
> by different drivers), so using mutex_lock_nested() would require us to
> change and wrap all PHY APIs which take a lock internally. Using dynamic
> lockdep keys looks like a more ellegant solution, especially granted
> that there is no extra impact if lockdep is disabled.
Thanks for fixing this. I've been using a local hack based on
mutex_lock_nested() too but dynamic keys looks like the right way to go.
Perhaps you can add:
Reported-by: Johan Hovold <johan+linaro@...nel.org>
Link: https://lore.kernel.org/lkml/ZnpoAVGJMG4Zu-Jw@hovoldconsulting.com/
Works fine on the T14s:
Reviewed-by: Johan Hovold <johan+linaro@...nel.org>
Tested-by: Johan Hovold <johan+linaro@...nel.org>
Johan
Powered by blists - more mailing lists