Message-ID: <97f0e27f-3128-4821-bc09-2acde1ebf81a@kernel.org>
Date: Wed, 30 Jul 2025 21:43:02 +0200
From: Krzysztof Kozlowski <krzk@...nel.org>
To: "Russell King (Oracle)" <linux@...linux.org.uk>,
Chanwoo Choi <cw00.choi@...sung.com>
Cc: Alexandre Belloni <alexandre.belloni@...tlin.com>,
Linus Walleij <linus.walleij@...aro.org>, Bartosz Golaszewski
<brgl@...ev.pl>, linux-gpio@...r.kernel.org, linux-rtc@...r.kernel.org,
linux-kernel@...r.kernel.org, Thierry Reding <treding@...dia.com>
Subject: Re: [BUG] 6.16-rc7: lockdep failure with max77620-gpio/max77686-rtc
On 30/07/2025 19:58, Russell King (Oracle) wrote:
> Hi,
>
> First, I'm not sure who is responsible for the max77620-gpio driver
The max77620 is used only on NVIDIA platforms, and NVIDIA upstreamed it,
although it shares the RTC driver with max77686. You should Cc the
NVIDIA SoC maintainers; maybe Thierry has someone who could
investigate it.
> (it's not in MAINTAINERS) but this bug points towards a problem with
> one or other of these drivers.
>
> Here is /proc/interrupts which may help debug this:
>
> CPU0 CPU1 CPU2 CPU3 CPU4 CPU5
> 94: 1 0 0 0 0 0 max77620-top 4 Edge max77686-rtc
> 95: 1 0 0 0 0 0 max77686-rtc 1 Edge rtc-alarm1
>
> While running 6.16-rc7 on the Jetson Xavier NX platform, upon suspend,
> I receive the following lockdep splat. I've added some instrumentation
> into irq_set_irq_wake() which appears twice in the calltrace to print
> the IRQ number and the "on" parameter to locate which interrupts are
> involved in this splat. This splat is 100% reproducible.
>
> [ 46.721367] irq_set_irq_wake: irq=95 on=1
> [ 46.722067] irq_set_irq_wake: irq=94 on=1
> [ 46.722181] ============================================
> [ 46.722578] WARNING: possible recursive locking detected
> [ 46.722852] 6.16.0-rc7-net-next+ #432 Not tainted
> [ 46.722965] --------------------------------------------
> [ 46.723127] rtcwake/3984 is trying to acquire lock:
> [ 46.723235] ffff0000813b2c68 (&d->lock){+.+.}-{4:4}, at: regmap_irq_lock+0x18/0x24
> [ 46.723452]
> but task is already holding lock:
> [ 46.723556] ffff00008504dc68 (&d->lock){+.+.}-{4:4}, at: regmap_irq_lock+0x18/0x24
> [ 46.723780]
> other info that might help us debug this:
> [ 46.723903] Possible unsafe locking scenario:
>
> [ 46.724015] CPU0
> [ 46.724067] ----
> [ 46.724119] lock(&d->lock);
> [ 46.724212] lock(&d->lock);
> [ 46.724282]
> *** DEADLOCK ***
>
> [ 46.724348] May be due to missing lock nesting notation
>
> [ 46.724492] 6 locks held by rtcwake/3984:
> [ 46.724576] #0: ffff0000825693f8 (sb_writers#3){.+.+}-{0:0}, at: vfs_write+0x184/0x350
> [ 46.724902] #1: ffff00008fd7fa88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x104/0x1c8
> [ 46.725258] #2: ffff000080a64588 (kn->active#87){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x10c/0x1c8
> [ 46.725609] #3: ffff8000815d4fb8 (system_transition_mutex){+.+.}-{4:4}, at: pm_suspend+0x220/0x300
> [ 46.725897] #4: ffff00008500a8f8 (&dev->mutex){....}-{4:4}, at: device_suspend+0x1d8/0x630
> [ 46.726173] #5: ffff00008504dc68 (&d->lock){+.+.}-{4:4}, at: regmap_irq_lock+0x18/0x24
max77686 only disables/enables interrupts in its suspend path, whereas
max77620 also does I2C transfers there. However, the lock above is
regmap_irq_lock, not the regmap lock. Maybe this is not really related
to max77620/max77686 at all? Neither driver does anything unusual (or
different from the last 5 years), so the missing nesting annotation
could be the result of changes in other parts...
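To illustrate the "missing lock nesting notation" hint: lockdep validates lock classes rather than individual lock instances, which is why the two different &d->lock addresses in the splat are reported as the same lock being taken twice. The following is a toy user-space Python model of that check, not kernel code. ToyLockdep, its acquire()/release() methods and the subclass numbers are hypothetical names made up for this sketch; only the addresses and the class name come from the splat.

```python
class ToyLockdep:
    """Toy model: track held locks by (class, subclass), not by address."""

    def __init__(self):
        self.held = []  # list of (address, lock_class, subclass)

    def acquire(self, address, lock_class, subclass=0):
        """Return a warning string if the same class/subclass is already
        held by this task, else record the lock and return None."""
        for held_addr, held_class, held_sub in self.held:
            if (held_class, held_sub) == (lock_class, subclass):
                return (f"possible recursive locking: {address} vs "
                        f"{held_addr} ({lock_class})")
        self.held.append((address, lock_class, subclass))
        return None

    def release(self, address):
        """Drop a previously recorded lock."""
        self.held = [h for h in self.held if h[0] != address]


# Addresses and class name are copied from the splat above.
ld = ToyLockdep()
print(ld.acquire("ffff00008504dc68", "&d->lock"))  # first chip's lock
print(ld.acquire("ffff0000813b2c68", "&d->lock"))  # second chip, same class

# Hypothetical annotation: the inner lock gets a distinct subclass.
annotated = ToyLockdep()
print(annotated.acquire("ffff00008504dc68", "&d->lock", subclass=0))
print(annotated.acquire("ffff0000813b2c68", "&d->lock", subclass=1))
```

In the kernel, the comparable annotations are mutex_lock_nested() with a non-zero subclass or a separate lock class key set via lockdep_set_class(). Whether either is the right fix here, or whether the locking order itself changed somewhere, is not established in this thread.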
> [ 46.732435]
> stack backtrace:
> [ 46.734019] CPU: 3 UID: 0 PID: 3984 Comm: rtcwake Not tainted 6.16.0-rc7-net-next+ #432 PREEMPT
> [ 46.734029] Hardware name: NVIDIA NVIDIA Jetson Xavier NX Developer Kit/Jetson, BIOS 6.0-37391689 08/28/2024
> [ 46.734033] Call trace:
> [ 46.734036] show_stack+0x18/0x24 (C)
> [ 46.734070] dump_stack_lvl+0x90/0xd0
> [ 46.734080] dump_stack+0x18/0x24
> [ 46.734107] print_deadlock_bug+0x260/0x350
> [ 46.734114] __lock_acquire+0xf28/0x2088
> [ 46.734120] lock_acquire+0x19c/0x33c
> [ 46.734126] __mutex_lock+0x84/0x530
> [ 46.734135] mutex_lock_nested+0x24/0x30
> [ 46.734155] regmap_irq_lock+0x18/0x24
> [ 46.734161] __irq_get_desc_lock+0x8c/0x9c
> [ 46.734170] irq_set_irq_wake+0x5c/0x1ac <== I guess IRQ 94
...like changes in irqchip.
> [ 46.734176] regmap_irq_sync_unlock+0x314/0x4f4
> [ 46.734182] __irq_put_desc_unlock+0x48/0x4c
> [ 46.734190] irq_set_irq_wake+0x88/0x1ac <== I guess IRQ 95
> [ 46.734195] max77686_rtc_suspend+0x34/0x74
Because the part above has really been virtually unchanged for 10 years,
except for my commit d8f090dbeafdcc3d30761aa0062f19d1adf9ef08 (you can
try reverting it... but it could still be correct/needed, with irqchip
having just changed something around locking).
Best regards,
Krzysztof