Message-ID: <aMl9Fbuyq7hdXvQC@trex>
Date: Tue, 16 Sep 2025 17:07:01 +0200
From: Jorge Ramirez <jorge.ramirez@....qualcomm.com>
To: Jorge Ramirez <jorge.ramirez@....qualcomm.com>
Cc: Praveen Talari <praveen.talari@....qualcomm.com>,
Alexey Klimov <alexey.klimov@...aro.org>,
Praveen Talari <quic_ptalari@...cinc.com>,
Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
Jiri Slaby <jirislaby@...nel.org>,
Bryan O'Donoghue <bryan.odonoghue@...aro.org>,
linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-serial@...r.kernel.org, psodagud@...cinc.com, djaggi@...cinc.com,
quic_msavaliy@...cinc.com, quic_vtanuku@...cinc.com,
quic_arandive@...cinc.com, quic_shazhuss@...cinc.com, krzk@...nel.org
Subject: Re: [PATCH v1] serial: qcom-geni: Fix pinctrl deadlock on runtime
resume
On 16/09/25 16:39:00, Jorge Ramirez wrote:
> On 16/09/25 12:20:25, Praveen Talari wrote:
> > Hi Alexey
> >
> > Thank you for your support.
> >
> > On 9/15/2025 7:55 PM, Praveen Talari wrote:
> > > Hi Alexey,
> > >
> > > On 9/15/2025 3:09 PM, Alexey Klimov wrote:
> > > > (removing <quic_mnaresh@...cinc.com> from c/c -- too many mail not
> > > > delivered)
> > > >
> > > > Hi Praveen,
> > > >
> > > > On Mon Sep 15, 2025 at 7:58 AM BST, Praveen Talari wrote:
> > > > > Hi Alexey,
> > > > >
> > > > > Really appreciate you waiting!
> > > > >
> > > > > On 9/11/2025 2:30 PM, Alexey Klimov wrote:
> > > > > > Hi Praveen,
> > > > > >
> > > > > > On Thu Sep 11, 2025 at 9:34 AM BST, Praveen Talari wrote:
> > > > > > > Hi Alexy,
> > > > > > >
> > > > > > > Thank you for update.
> > > > > > >
> > > > > > > On 9/10/2025 1:35 AM, Alexey Klimov wrote:
> > > > > > > >
> > > > > > > > (adding Krzysztof to c/c)
> > > > > > > >
> > > > > > > > On Mon Sep 8, 2025 at 6:43 PM BST, Alexey Klimov wrote:
> > > > > > > > > On Mon Sep 8, 2025 at 5:45 PM BST, Praveen Talari wrote:
> > > > > > > > > > A deadlock is observed in the
> > > > > > > > > > qcom_geni_serial driver during runtime
> > > > > > > > > > resume. This occurs when the pinctrl
> > > > > > > > > > subsystem reconfigures device pins
> > > > > > > > > > via msm_pinmux_set_mux() while the serial device's interrupt is an
> > > > > > > > > > active wakeup source. msm_pinmux_set_mux() calls disable_irq() or
> > > > > > > > > > __synchronize_irq(), conflicting with the active wakeup state and
> > > > > > > > > > causing the IRQ thread to enter an uninterruptible (D-state) sleep,
> > > > > > > > > > leading to system instability.
> > > > > > > > > >
> > > > > > > > > > The critical call trace leading to the deadlock is:
> > > > > > > > > >
> > > > > > > > > > Call trace:
> > > > > > > > > > __switch_to+0xe0/0x120
> > > > > > > > > > __schedule+0x39c/0x978
> > > > > > > > > > schedule+0x5c/0xf8
> > > > > > > > > > __synchronize_irq+0x88/0xb4
> > > > > > > > > > disable_irq+0x3c/0x4c
> > > > > > > > > > msm_pinmux_set_mux+0x508/0x644
> > > > > > > > > > pinmux_enable_setting+0x190/0x2dc
> > > > > > > > > > pinctrl_commit_state+0x13c/0x208
> > > > > > > > > > pinctrl_pm_select_default_state+0x4c/0xa4
> > > > > > > > > > geni_se_resources_on+0xe8/0x154
> > > > > > > > > > qcom_geni_serial_runtime_resume+0x4c/0x88
> > > > > > > > > > pm_generic_runtime_resume+0x2c/0x44
> > > > > > > > > > __genpd_runtime_resume+0x30/0x80
> > > > > > > > > > genpd_runtime_resume+0x114/0x29c
> > > > > > > > > > __rpm_callback+0x48/0x1d8
> > > > > > > > > > rpm_callback+0x6c/0x78
> > > > > > > > > > rpm_resume+0x530/0x750
> > > > > > > > > > __pm_runtime_resume+0x50/0x94
> > > > > > > > > > handle_threaded_wake_irq+0x30/0x94
> > > > > > > > > > > irq_thread_fn+0x2c/0xa8
> > > > > > > > > > > irq_thread+0x160/0x248
> > > > > > > > > > > kthread+0x110/0x114
> > > > > > > > > > > ret_from_fork+0x10/0x20
> > > > > > > > > >
> > > > > > > > > > To resolve this, explicitly manage the wakeup IRQ state within the
> > > > > > > > > > runtime suspend/resume callbacks. In the
> > > > > > > > > > runtime resume callback, call
> > > > > > > > > > disable_irq_wake() before enabling resources. This preemptively
> > > > > > > > > > removes the "wakeup" capability from the IRQ, allowing subsequent
> > > > > > > > > > interrupt management calls to proceed
> > > > > > > > > > without conflict. An error path
> > > > > > > > > > re-enables the wakeup IRQ if resource enablement fails.
> > > > > > > > > >
> > > > > > > > > > Conversely, in runtime suspend, call
> > > > > > > > > > enable_irq_wake() after resources
> > > > > > > > > > are disabled. This ensures the interrupt is configured as a wakeup
> > > > > > > > > > source only once the device has fully
> > > > > > > > > > entered its low-power state. An
> > > > > > > > > > error path handles disabling the wakeup IRQ
> > > > > > > > > > if the suspend operation
> > > > > > > > > > fails.
> > > > > > > > > >
> > > > > > > > > > Fixes: 1afa70632c39 ("serial: qcom-geni:
> > > > > > > > > > Enable PM runtime for serial driver")
> > > > > > > > > > Signed-off-by: Praveen Talari <praveen.talari@....qualcomm.com>
> > > > > > > > >
> > > > > > > > > You forgot:
> > > > > > > > >
> > > > > > > > > Reported-by: Alexey Klimov <alexey.klimov@...aro.org>
> > > > > > > > >
> > > > > > > > > Also, not sure where this change will go, via
> > > > > > > > > Greg or Jiri, but ideally
> > > > > > > > > this should be picked for current -rc cycle since regression is
> > > > > > > > > introduced during latest merge window.
> > > > > > > > >
> > > > > > > > > I also would like to test it on qrb2210 rb1 where this regression is
> > > > > > > > > reproduciable.
> >
> > Since I don't have this board, could you kindly validate the new change and
> > run a quick test on your end?
> >
> > diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c
> > b/drivers/pinctrl/qcom/pinctrl-msm.c
> > index 83eb075b6bfa..3d6601dc6fcc 100644
> > --- a/drivers/pinctrl/qcom/pinctrl-msm.c
> > +++ b/drivers/pinctrl/qcom/pinctrl-msm.c
> > @@ -215,7 +215,7 @@ static int msm_pinmux_set_mux(struct pinctrl_dev
> > *pctldev,
> > */
> > if (d && i != gpio_func &&
> > !test_and_set_bit(d->hwirq, pctrl->disabled_for_mux))
> > - disable_irq(irq);
> > + disable_irq_nosync(irq);
> >
> > raw_spin_lock_irqsave(&pctrl->lock, flags);
>
>
> sorry Praveen, didnt see this proposal. testing on my end as well.
>
Just tested on my end and all modules load. It deadlocked before this
update, so there is progress (now we can load the network driver).
However, I can still see irq/92 (threaded) stuck in D-state inside runtime PM:
root@...2210-rb1-core-kit:~# echo w > /proc/sysrq-trigger
[ 498.247349] sysrq: Show Blocked State
[ 498.251190] task:irq/92-4a8c000. state:D stack:0 pid:80 tgid:80 ppid:2 task_flags:0x208040 flags:0x00000010
[ 498.262334] Call trace:
[ 498.264812] __switch_to+0xf0/0x1c0 (T)
[ 498.268777] __schedule+0x110/0x9bc
with irq92 being:
92: 199870 0 0 0 msmgpio 11 Level 4a8c000.serial:wakeup
The log changes over time, but it is always irq/92:
root@...2210-rb1-core-kit:~# echo w > /proc/sysrq-trigger
[ 613.019101] sysrq: Show Blocked State
[ 613.023055] task:irq/92-4a8c000. state:D stack:0 pid:80 tgid:80 ppid:2 task_flags:0x208040 flags:0x00000010
[ 613.034189] Call trace:
[ 613.036770] __switch_to+0xf0/0x1c0 (T)
[ 613.040779] __schedule+0x35c/0x9bc
[ 613.044412] schedule+0x34/0x110
[ 613.047782] rpm_resume+0x17c/0x690
[ 613.051359] __pm_runtime_resume+0x4c/0x98
[ 613.055556] handle_threaded_wake_irq+0x30/0x80
[ 613.060168] irq_thread_fn+0x28/0xa8
[ 613.063864] irq_thread+0x178/0x338
[ 613.067434] kthread+0x12c/0x210
[ 613.070735] ret_from_fork+0x10/0x20
root@...2210-rb1-core-kit:~#
root@...2210-rb1-core-kit:~# echo w > /proc/sysrq-trigger
[ 617.586960] sysrq: Show Blocked State
[ 617.590771] task:irq/92-4a8c000. state:D stack:0 pid:80 tgid:80 ppid:2 task_flags:0x208040 flags:0x00000010
[ 617.601906] Call trace:
[ 617.604442] __switch_to+0xf0/0x1c0 (T)
[ 617.608408] __schedule+0x35c/0x9bc
[ 617.612074] 0x766c7362
root@...2210-rb1-core-kit:~#
root@...2210-rb1-core-kit:~#
root@...2210-rb1-core-kit:~# echo w > /proc/sysrq-trigger
[ 619.656937] sysrq: Show Blocked State
[ 619.660847] task:irq/92-4a8c000. state:D stack:0 pid:80 tgid:80 ppid:2 task_flags:0x208040 flags:0x00000010
[ 619.672009] Call trace:
[ 619.674531] __switch_to+0xf0/0x1c0 (T)
[ 619.678508] __schedule+0x35c/0x9bc
[ 619.682102] schedule+0x34/0x110
[ 619.685488] schedule_timeout+0x80/0x104
root@...2210-rb1-core-kit:~#
root@...2210-rb1-core-kit:~#
root@...2210-rb1-core-kit:~# echo w > /proc/sysrq-trigger
[ 624.786811] sysrq: Show Blocked State
root@...2210-rb1-core-kit:~#
root@...2210-rb1-core-kit:~#
root@...2210-rb1-core-kit:~# echo w > /proc/sysrq-trigger
[ 630.546744] sysrq: Show Blocked State
[ 630.550593] task:irq/92-4a8c000. state:D stack:0 pid:80 tgid:80 ppid:2 task_flags:0x208040 flags:0x00000010
[ 630.561724] Call trace:
[ 630.564219] __switch_to+0xf0/0x1c0 (T)
[ 630.568138] __schedule+0x35c/0x9bc
[ 630.571729] 0x766c7362
root@...2210-rb1-core-kit:~#