linux-kernel - Re: [PATCH v5 2/3] PCI: qcom: Restrict pci transactions after pci suspend


Message-ID: <81dcbf72-92bb-093a-da48-89a73ead820e@quicinc.com>
Date:   Thu, 25 Aug 2022 19:22:43 +0530
From:   Krishna Chaitanya Chundru <quic_krichai@...cinc.com>
To:     Stephen Boyd <swboyd@...omium.org>, <helgaas@...nel.org>
CC:     <linux-pci@...r.kernel.org>, <linux-arm-msm@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>, <mka@...omium.org>,
        <quic_vbadigan@...cinc.com>, <quic_hemantk@...cinc.com>,
        <quic_nitegupt@...cinc.com>, <quic_skananth@...cinc.com>,
        <quic_ramkri@...cinc.com>, <manivannan.sadhasivam@...aro.org>,
        <dmitry.baryshkov@...aro.org>, Jingoo Han <jingoohan1@...il.com>,
        "Gustavo Pimentel" <gustavo.pimentel@...opsys.com>,
        Lorenzo Pieralisi <lpieralisi@...nel.org>,
        Rob Herring <robh@...nel.org>,
        Krzysztof Wilczyński <kw@...ux.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Andy Gross <agross@...nel.org>,
        Bjorn Andersson <bjorn.andersson@...aro.org>,
        Stanimir Varbanov <svarbanov@...sol.com>
Subject: Re: [PATCH v5 2/3] PCI: qcom: Restrict pci transactions after pci
 suspend


On 8/24/2022 10:50 PM, Stephen Boyd wrote:
> Quoting Krishna Chaitanya Chundru (2022-08-23 20:37:59)
>> On 8/9/2022 12:42 AM, Stephen Boyd wrote:
>>> Quoting Krishna chaitanya chundru (2022-08-03 04:28:53)
>>>> If the endpoint device state is D0 and irq's are not freed, then
>>>> kernel try to mask interrupts in system suspend path by writing
>>>> in to the vector table (for MSIX interrupts) and config space (for MSI's).
>>>>
>>>> These transactions are initiated in the pm suspend after pcie clocks got
>>>> disabled as part of platform driver pm  suspend call. Due to it, these
>>>> transactions are resulting in un-clocked access and eventually to crashes.
>>> Why are the platform driver pm suspend calls disabling clks that early?
>>> Can they disable clks in noirq phase, or even later, so that we don't
>>> have to check if the device is clocking in the irq poking functions?
>>> It's best to keep irq operations fast, so that irq control is fast given
>>> that these functions are called from irq flow handlers.
>> We are registering the pcie pm suspend ops as noirq ops only. And this
>> msix and config access is coming at the later point of time that is
>> reason we added that check.
>>
> What is accessing msix and config? Can you dump_stack() after noirq ops
> are called and figure out what is trying to access the bus when it is
> powered down?

The MSI-X table and config space are being accessed to mask interrupts.
The access comes at the end of suspend, near CPU disable. We tried to
dump the stack there, but the call stack is not printed because it is so
close to CPU disable.

However, we did get a dump at resume; please have a look at it:

[   54.946268] Enabling non-boot CPUs ...
[   54.951182] CPU: 1 PID: 21 Comm: cpuhp/1 Not tainted 5.15.41 #105 43491e4414b1db8a6f59d56b617b520d92a9498e
[   54.961122] Hardware name: Qualcomm Technologies, Inc. sc7280 IDP SKU2 platform (DT)
[   54.969088] Call trace:
[   54.971612]  dump_backtrace+0x0/0x200
[   54.975399]  show_stack+0x20/0x2c
[   54.978826]  dump_stack_lvl+0x6c/0x90
[   54.982614]  dump_stack+0x18/0x38
[   54.986043]  dw_msi_unmask_irq+0x2c/0x58
[   54.990096]  irq_enable+0x58/0x90
[   54.993522]  __irq_startup+0x68/0x94
[   54.997216]  irq_startup+0xf4/0x140
[   55.000820]  irq_affinity_online_cpu+0xc8/0x154
[   55.005491]  cpuhp_invoke_callback+0x19c/0x6e4
[   55.010077]  cpuhp_thread_fun+0x11c/0x188
[   55.014216]  smpboot_thread_fn+0x1ac/0x30c
[   55.018445]  kthread+0x140/0x30c
[   55.021788]  ret_from_fork+0x10/0x20
[   55.028243] CPU1 is up

So the same call chain should run on the suspend side while the CPU is being disabled.
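
To make the check under discussion concrete, here is a minimal user-space sketch in C. It is not the driver code: `fake_pcie`, `fake_suspend_noirq`, `fake_resume_noirq` and `fake_msi_set_mask` are hypothetical names invented for illustration, and a plain variable stands in for the MSI mask register. It only models the idea that a mask/unmask request arriving after the clocks are off is skipped instead of touching the hardware.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model; none of these names come from the qcom/dwc drivers. */
struct fake_pcie {
    bool clocks_on;        /* cleared by the modelled noirq suspend */
    uint32_t msi_mask;     /* stands in for an MSI mask register */
    unsigned int skipped;  /* requests refused because clocks were off */
};

/* Models the platform driver's noirq suspend: clocks go away here. */
static void fake_suspend_noirq(struct fake_pcie *p)
{
    p->clocks_on = false;
}

/* Models the matching resume step: clocks are back. */
static void fake_resume_noirq(struct fake_pcie *p)
{
    p->clocks_on = true;
}

/*
 * Mask or unmask one vector. Returns true only if the "register" was
 * written. With clocks off the request is counted and dropped, which is
 * the guard being debated in this thread.
 */
static bool fake_msi_set_mask(struct fake_pcie *p, unsigned int vec, bool masked)
{
    if (vec >= 32)
        return false;          /* out of range for this 32-bit model */

    if (!p->clocks_on) {
        p->skipped++;          /* would be an un-clocked access */
        return false;
    }

    if (masked)
        p->msi_mask |= UINT32_C(1) << vec;
    else
        p->msi_mask &= ~(UINT32_C(1) << vec);

    return true;
}
```

In this model, a call to `fake_msi_set_mask()` made between `fake_suspend_noirq()` and `fake_resume_noirq()` leaves `msi_mask` unchanged and increments `skipped`. The cost is one extra branch on every mask/unmask call, which is the overhead Stephen raised for the irq paths.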

If there is any other way to avoid these calls, could you please point
us to it?

Thanks & Regards,
Krishna Chaitanya