Message-ID: <0d644b94-c674-429b-9ed8-64cb89f168f8@kernel.org>
Date: Wed, 17 Sep 2025 09:05:13 +0900
From: Krzysztof Kozlowski <krzk@...nel.org>
To: Alexey Klimov <alexey.klimov@...aro.org>,
 Praveen Talari <praveen.talari@....qualcomm.com>,
 Praveen Talari <quic_ptalari@...cinc.com>
Cc: Jorge Ramirez <jorge.ramirez@....qualcomm.com>,
 Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
 Jiri Slaby <jirislaby@...nel.org>,
 Bryan O'Donoghue <bryan.odonoghue@...aro.org>,
 linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org,
 linux-serial@...r.kernel.org, psodagud@...cinc.com, djaggi@...cinc.com,
 quic_msavaliy@...cinc.com, quic_vtanuku@...cinc.com,
 quic_arandive@...cinc.com, quic_shazhuss@...cinc.com
Subject: Re: [PATCH v1] serial: qcom-geni: Fix pinctrl deadlock on runtime
 resume

On 17/09/2025 19:12, Alexey Klimov wrote:
> Hi Praveen,
> 
> On Tue Sep 16, 2025 at 4:07 PM BST, Jorge Ramirez wrote:
>> On 16/09/25 16:39:00, Jorge Ramirez wrote:
>>> On 16/09/25 12:20:25, Praveen Talari wrote:
>>>> Hi Alexey
>>>>
>>>> Thank you for your support.
>>>>
>>>> On 9/15/2025 7:55 PM, Praveen Talari wrote:
>>>>> Hi Alexey,
>>>>>
>>>>> On 9/15/2025 3:09 PM, Alexey Klimov wrote:
>>>>>> (removing <quic_mnaresh@...cinc.com> from c/c -- too many mail not
>>>>>> delivered)
>>>>>>
>>>>>> Hi Praveen,
>>>>>>
>>>>>> On Mon Sep 15, 2025 at 7:58 AM BST, Praveen Talari wrote:
>>>>>>> Hi Alexey,
>>>>>>>
>>>>>>> Really appreciate you waiting!
>>>>>>>
>>>>>>> On 9/11/2025 2:30 PM, Alexey Klimov wrote:
>>>>>>>> Hi Praveen,
>>>>>>>>
>>>>>>>> On Thu Sep 11, 2025 at 9:34 AM BST, Praveen Talari wrote:
>>>>>>>>> Hi Alexey,
>>>>>>>>>
>>>>>>>>> Thank you for the update.
>>>>>>>>>
>>>>>>>>> On 9/10/2025 1:35 AM, Alexey Klimov wrote:
>>>>>>>>>>
>>>>>>>>>> (adding Krzysztof to c/c)
>>>>>>>>>>
>>>>>>>>>> On Mon Sep 8, 2025 at 6:43 PM BST, Alexey Klimov wrote:
>>>>>>>>>>> On Mon Sep 8, 2025 at 5:45 PM BST, Praveen Talari wrote:
>>>>>>>>>>>> A deadlock is observed in the qcom_geni_serial driver during runtime
>>>>>>>>>>>> resume. This occurs when the pinctrl subsystem reconfigures device pins
>>>>>>>>>>>> via msm_pinmux_set_mux() while the serial device's interrupt is an
>>>>>>>>>>>> active wakeup source. msm_pinmux_set_mux() calls disable_irq() or
>>>>>>>>>>>> __synchronize_irq(), conflicting with the active wakeup state and
>>>>>>>>>>>> causing the IRQ thread to enter an uninterruptible (D-state) sleep,
>>>>>>>>>>>> leading to system instability.
>>>>>>>>>>>>
>>>>>>>>>>>> The critical call trace leading to the deadlock is:
>>>>>>>>>>>>
>>>>>>>>>>>>        Call trace:
>>>>>>>>>>>>        __switch_to+0xe0/0x120
>>>>>>>>>>>>        __schedule+0x39c/0x978
>>>>>>>>>>>>        schedule+0x5c/0xf8
>>>>>>>>>>>>        __synchronize_irq+0x88/0xb4
>>>>>>>>>>>>        disable_irq+0x3c/0x4c
>>>>>>>>>>>>        msm_pinmux_set_mux+0x508/0x644
>>>>>>>>>>>>        pinmux_enable_setting+0x190/0x2dc
>>>>>>>>>>>>        pinctrl_commit_state+0x13c/0x208
>>>>>>>>>>>>        pinctrl_pm_select_default_state+0x4c/0xa4
>>>>>>>>>>>>        geni_se_resources_on+0xe8/0x154
>>>>>>>>>>>>        qcom_geni_serial_runtime_resume+0x4c/0x88
>>>>>>>>>>>>        pm_generic_runtime_resume+0x2c/0x44
>>>>>>>>>>>>        __genpd_runtime_resume+0x30/0x80
>>>>>>>>>>>>        genpd_runtime_resume+0x114/0x29c
>>>>>>>>>>>>        __rpm_callback+0x48/0x1d8
>>>>>>>>>>>>        rpm_callback+0x6c/0x78
>>>>>>>>>>>>        rpm_resume+0x530/0x750
>>>>>>>>>>>>        __pm_runtime_resume+0x50/0x94
>>>>>>>>>>>>        handle_threaded_wake_irq+0x30/0x94
>>>>>>>>>>>>        irq_thread_fn+0x2c/0xa8
>>>>>>>>>>>>        irq_thread+0x160/0x248
>>>>>>>>>>>>        kthread+0x110/0x114
>>>>>>>>>>>>        ret_from_fork+0x10/0x20
>>>>>>>>>>>>
>>>>>>>>>>>> To resolve this, explicitly manage the wakeup IRQ state within the
>>>>>>>>>>>> runtime suspend/resume callbacks. In the runtime resume callback, call
>>>>>>>>>>>> disable_irq_wake() before enabling resources. This preemptively
>>>>>>>>>>>> removes the "wakeup" capability from the IRQ, allowing subsequent
>>>>>>>>>>>> interrupt management calls to proceed without conflict. An error path
>>>>>>>>>>>> re-enables the wakeup IRQ if resource enablement fails.
>>>>>>>>>>>>
>>>>>>>>>>>> Conversely, in runtime suspend, call enable_irq_wake() after resources
>>>>>>>>>>>> are disabled. This ensures the interrupt is configured as a wakeup
>>>>>>>>>>>> source only once the device has fully entered its low-power state. An
>>>>>>>>>>>> error path handles disabling the wakeup IRQ if the suspend operation
>>>>>>>>>>>> fails.
>>>>>>>>>>>>
>>>>>>>>>>>> Fixes: 1afa70632c39 ("serial: qcom-geni: Enable PM runtime for serial driver")
>>>>>>>>>>>> Signed-off-by: Praveen Talari <praveen.talari@....qualcomm.com>
>>>>>>>>>>>
>>>>>>>>>>> You forgot:
>>>>>>>>>>>
>>>>>>>>>>> Reported-by: Alexey Klimov <alexey.klimov@...aro.org>
>>>>>>>>>>>
>>>>>>>>>>> Also, not sure where this change will go, via Greg or Jiri, but ideally
>>>>>>>>>>> this should be picked for the current -rc cycle since the regression was
>>>>>>>>>>> introduced during the latest merge window.
>>>>>>>>>>>
>>>>>>>>>>> I also would like to test it on qrb2210 rb1 where this regression is
>>>>>>>>>>> reproducible.
>>>>
>>>> Since I don't have this board, could you kindly validate the new change and
>>>> run a quick test on your end?
>>>>
>>>> diff --git a/drivers/pinctrl/qcom/pinctrl-msm.c b/drivers/pinctrl/qcom/pinctrl-msm.c
>>>> index 83eb075b6bfa..3d6601dc6fcc 100644
>>>> --- a/drivers/pinctrl/qcom/pinctrl-msm.c
>>>> +++ b/drivers/pinctrl/qcom/pinctrl-msm.c
>>>> @@ -215,7 +215,7 @@ static int msm_pinmux_set_mux(struct pinctrl_dev *pctldev,
>>>>          */
>>>>         if (d && i != gpio_func &&
>>>>             !test_and_set_bit(d->hwirq, pctrl->disabled_for_mux))
>>>> -               disable_irq(irq);
>>>> +               disable_irq_nosync(irq);
>>>>
>>>>         raw_spin_lock_irqsave(&pctrl->lock, flags);
>>>
>>>
>>> Sorry Praveen, didn't see this proposal. Testing on my end as well.
>>>
>>
>> just tested on my end and all modules load - deadlocked before this
>> update so there is progress (now we can load the network driver)
> 
> Is it supposed to be the original patch here plus disable_irq_nosync()?
> Meaning changes for qcom_geni_serial_runtime_{suspend,resume}
> + disable_irq_nosync() in msm_pinmux_set_mux()?
> 
> It seems to work here but let me do a few more runs.


So this bug, after 5 weeks, is still not fixed?!?

This should have been reverted a long time ago.

Best regards,
Krzysztof
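
For readers following the call trace quoted above, the hang can be sketched
in userspace C. This is a simplified model under stated assumptions, not
kernel code: struct irq_model and the model_* functions are hypothetical
names, and the only behavior modeled is that disable_irq() waits for running
threaded handlers to finish while disable_irq_nosync() returns immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of one interrupt line. Not kernel code. */
struct irq_model {
    int disable_depth;   /* how many times the line has been masked */
    int threads_active;  /* threaded handlers currently running */
};

/* Models disable_irq_nosync(): mask the line and return at once. */
static void model_disable_irq_nosync(struct irq_model *irq)
{
    assert(irq != NULL);
    irq->disable_depth++;
}

/*
 * Models disable_irq(): mask the line, then wait until no threaded handler
 * is running. A real wait would sleep; this model only reports whether the
 * wait could ever finish. If the caller is itself the running handler
 * thread, threads_active cannot reach zero, so the wait never completes.
 */
static bool model_disable_irq(struct irq_model *irq, bool caller_is_handler)
{
    model_disable_irq_nosync(irq);
    return !(caller_is_handler && irq->threads_active > 0);
}

/*
 * Models the resume path from the call trace: the threaded wake handler
 * runs, and the pinmux step masks the same line from inside that thread.
 * Returns true if the handler can run to completion.
 */
static bool model_wake_handler(struct irq_model *irq, bool use_nosync)
{
    bool completed = true;

    irq->threads_active++;              /* handler thread starts */
    if (use_nosync)
        model_disable_irq_nosync(irq);
    else
        completed = model_disable_irq(irq, true);
    if (completed)
        irq->threads_active--;          /* handler thread finishes */
    return completed;
}
```

Calling model_wake_handler() with use_nosync set to false reports that the
handler cannot finish, which corresponds to the D-state IRQ thread in the
trace. With use_nosync set to true the handler completes and the line is
left masked once. The model says nothing about whether skipping the wait is
safe for the pinmux code; it only shows why the synchronous variant cannot
return when called from the handler's own thread.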
