lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <97f0e27f-3128-4821-bc09-2acde1ebf81a@kernel.org>
Date: Wed, 30 Jul 2025 21:43:02 +0200
From: Krzysztof Kozlowski <krzk@...nel.org>
To: "Russell King (Oracle)" <linux@...linux.org.uk>,
 Chanwoo Choi <cw00.choi@...sung.com>
Cc: Alexandre Belloni <alexandre.belloni@...tlin.com>,
 Linus Walleij <linus.walleij@...aro.org>, Bartosz Golaszewski
 <brgl@...ev.pl>, linux-gpio@...r.kernel.org, linux-rtc@...r.kernel.org,
 linux-kernel@...r.kernel.org, Thierry Reding <treding@...dia.com>
Subject: Re: [BUG] 6.16-rc7: lockdep failure with max77620-gpio/max77686-rtc

On 30/07/2025 19:58, Russell King (Oracle) wrote:
> Hi,
> 
> First, I'm not sure who is responsible for the max77620-gpio driver

77620 is only for nvidia platforms and nvidia was upstreaming it,
although it shares the RTC driver part with max77686. You should Cc
nvidia SoC maintainers, maybe Thierry has someone around who could
investigate it.

> (it's not in MAINTAINERS) but this bug points towards a problem with
> one or other of these drivers.
> 
> Here is /proc/interrupts which may help debug this:
> 
>            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5
>  94:          1          0          0          0          0          0 max77620-
> top   4 Edge      max77686-rtc
>  95:          1          0          0          0          0          0 max77686-rtc   1 Edge      rtc-alarm1
> 
> While running 6.16-rc7 on the Jetson Xavier NX platform, upon suspend,
> I receive the following lockdep splat. I've added some instrumentation
> into irq_set_irq_wake() which appears twice in the calltrace to print
> the IRQ number and the "on" parameter to locate which interrupts are
> involved in this splat. This splat is 100% reproducable.
> 
> [   46.721367] irq_set_irq_wake: irq=95 on=1
> [   46.722067] irq_set_irq_wake: irq=94 on=1
> [   46.722181] ============================================
> [   46.722578] WARNING: possible recursive locking detected
> [   46.722852] 6.16.0-rc7-net-next+ #432 Not tainted
> [   46.722965] --------------------------------------------
> [   46.723127] rtcwake/3984 is trying to acquire lock:
> [   46.723235] ffff0000813b2c68 (&d->lock){+.+.}-{4:4}, at: regmap_irq_lock+0x18/0x24
> [   46.723452]
>                but task is already holding lock:
> [   46.723556] ffff00008504dc68 (&d->lock){+.+.}-{4:4}, at: regmap_irq_lock+0x18/0x24
> [   46.723780]
>                other info that might help us debug this:
> [   46.723903]  Possible unsafe locking scenario:
> 
> [   46.724015]        CPU0
> [   46.724067]        ----
> [   46.724119]   lock(&d->lock);
> [   46.724212]   lock(&d->lock);
> [   46.724282]
>                 *** DEADLOCK ***
> 
> [   46.724348]  May be due to missing lock nesting notation
> 
> [   46.724492] 6 locks held by rtcwake/3984:
> [   46.724576]  #0: ffff0000825693f8 (sb_writers#3){.+.+}-{0:0}, at: vfs_write+0x184/0x350
> [   46.724902]  #1: ffff00008fd7fa88 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x104/0x1c8
> [   46.725258]  #2: ffff000080a64588 (kn->active#87){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x10c/0x1c8
> [   46.725609]  #3: ffff8000815d4fb8 (system_transition_mutex){+.+.}-{4:4}, at: pm_suspend+0x220/0x300
> [   46.725897]  #4: ffff00008500a8f8 (&dev->mutex){....}-{4:4}, at: device_suspend+0x1d8/0x630
> [   46.726173]  #5: ffff00008504dc68 (&d->lock){+.+.}-{4:4}, at: regmap_irq_lock+0x18/0x24


max77686 only disables/enables interrupts in suspend path, but max77620
is doing also I2C transfers, but above is regmap_irq_lock, not regmap
lock. Maybe this is not really max77620/77686 related at all? None of
these do anything weird (or different than last 5 years), so missing
nesting could be result of changes in other parts...


> [   46.732435]
>                stack backtrace:
> [   46.734019] CPU: 3 UID: 0 PID: 3984 Comm: rtcwake Not tainted 6.16.0-rc7-net-next+ #432 PREEMPT
> [   46.734029] Hardware name: NVIDIA NVIDIA Jetson Xavier NX Developer Kit/Jetson, BIOS 6.0-37391689 08/28/2024
> [   46.734033] Call trace:
> [   46.734036]  show_stack+0x18/0x24 (C)
> [   46.734070]  dump_stack_lvl+0x90/0xd0
> [   46.734080]  dump_stack+0x18/0x24
> [   46.734107]  print_deadlock_bug+0x260/0x350
> [   46.734114]  __lock_acquire+0xf28/0x2088
> [   46.734120]  lock_acquire+0x19c/0x33c
> [   46.734126]  __mutex_lock+0x84/0x530
> [   46.734135]  mutex_lock_nested+0x24/0x30
> [   46.734155]  regmap_irq_lock+0x18/0x24
> [   46.734161]  __irq_get_desc_lock+0x8c/0x9c
> [   46.734170]  irq_set_irq_wake+0x5c/0x1ac	<== I guess IRQ 94

...like changes in irqchip.

> [   46.734176]  regmap_irq_sync_unlock+0x314/0x4f4
> [   46.734182]  __irq_put_desc_unlock+0x48/0x4c
> [   46.734190]  irq_set_irq_wake+0x88/0x1ac	<== I guess IRQ 95
> [   46.734195]  max77686_rtc_suspend+0x34/0x74


Because really above part is virtually unchanged since 10 years, except
my commit d8f090dbeafdcc3d30761aa0062f19d1adf9ef08 (you can try
reverting it... but it still could be correct/needed and just irqchip
changed something around locking).

Best regards,
Krzysztof

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ