Message-ID: <81dcbf72-92bb-093a-da48-89a73ead820e@quicinc.com>
Date: Thu, 25 Aug 2022 19:22:43 +0530
From: Krishna Chaitanya Chundru <quic_krichai@...cinc.com>
To: Stephen Boyd <swboyd@...omium.org>, <helgaas@...nel.org>
CC: <linux-pci@...r.kernel.org>, <linux-arm-msm@...r.kernel.org>,
<linux-kernel@...r.kernel.org>, <mka@...omium.org>,
<quic_vbadigan@...cinc.com>, <quic_hemantk@...cinc.com>,
<quic_nitegupt@...cinc.com>, <quic_skananth@...cinc.com>,
<quic_ramkri@...cinc.com>, <manivannan.sadhasivam@...aro.org>,
<dmitry.baryshkov@...aro.org>, Jingoo Han <jingoohan1@...il.com>,
"Gustavo Pimentel" <gustavo.pimentel@...opsys.com>,
Lorenzo Pieralisi <lpieralisi@...nel.org>,
Rob Herring <robh@...nel.org>,
Krzysztof Wilczyński <kw@...ux.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Andy Gross <agross@...nel.org>,
Bjorn Andersson <bjorn.andersson@...aro.org>,
Stanimir Varbanov <svarbanov@...sol.com>
Subject: Re: [PATCH v5 2/3] PCI: qcom: Restrict pci transactions after pci
suspend
On 8/24/2022 10:50 PM, Stephen Boyd wrote:
> Quoting Krishna Chaitanya Chundru (2022-08-23 20:37:59)
>> On 8/9/2022 12:42 AM, Stephen Boyd wrote:
>>> Quoting Krishna chaitanya chundru (2022-08-03 04:28:53)
>>>> If the endpoint device state is D0 and its IRQs are not freed, then the
>>>> kernel tries to mask interrupts in the system suspend path by writing
>>>> into the vector table (for MSI-X interrupts) and config space (for MSIs).
>>>>
>>>> These transactions are initiated in pm suspend after the pcie clocks have
>>>> been disabled as part of the platform driver pm suspend call. As a result,
>>>> these transactions become un-clocked accesses and eventually cause crashes.
>>> Why are the platform driver pm suspend calls disabling clks that early?
>>> Can they disable clks in noirq phase, or even later, so that we don't
>>> have to check if the device is clocking in the irq poking functions?
>>> It's best to keep irq operations fast, so that irq control is fast given
>>> that these functions are called from irq flow handlers.
>> We are registering the pcie pm suspend ops as noirq ops only. This MSI-X
>> and config access comes at a later point in time; that is the reason we
>> added that check.
>>
> What is accessing msix and config? Can you dump_stack() after noirq ops
> are called and figure out what is trying to access the bus when it is
> powered down?
The MSI-X table and config space are being accessed to mask interrupts. The
access comes at the end of suspend, close to CPU disable. We tried to dump
the stack there, but the call stack is not printed because it is so close to
CPU disable.
We did get a dump at resume, though; please have a look at it:
[ 54.946268] Enabling non-boot CPUs ...
[ 54.951182] CPU: 1 PID: 21 Comm: cpuhp/1 Not tainted 5.15.41 #105 43491e4414b1db8a6f59d56b617b520d92a9498e
[ 54.961122] Hardware name: Qualcomm Technologies, Inc. sc7280 IDP SKU2 platform (DT)
[ 54.969088] Call trace:
[ 54.971612] dump_backtrace+0x0/0x200
[ 54.975399] show_stack+0x20/0x2c
[ 54.978826] dump_stack_lvl+0x6c/0x90
[ 54.982614] dump_stack+0x18/0x38
[ 54.986043] dw_msi_unmask_irq+0x2c/0x58
[ 54.990096] irq_enable+0x58/0x90
[ 54.993522] __irq_startup+0x68/0x94
[ 54.997216] irq_startup+0xf4/0x140
[ 55.000820] irq_affinity_online_cpu+0xc8/0x154
[ 55.005491] cpuhp_invoke_callback+0x19c/0x6e4
[ 55.010077] cpuhp_thread_fun+0x11c/0x188
[ 55.014216] smpboot_thread_fn+0x1ac/0x30c
[ 55.018445] kthread+0x140/0x30c
[ 55.021788] ret_from_fork+0x10/0x20
[ 55.028243] CPU1 is up
So the corresponding call stack should be hit in the suspend path while the
CPUs are being disabled.
If there is any other way to avoid these calls, could you please point us
to it?
Thanks & Regards,
Krishna Chaitanya