Message-ID: <20240821214312.GA270533@bhelgaas>
Date: Wed, 21 Aug 2024 16:43:12 -0500
From: Bjorn Helgaas <helgaas@...nel.org>
To: Krzysztof Wilczyński <kw@...ux.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
	lpieralisi@...nel.org, robh@...nel.org, bhelgaas@...gle.com,
	linux-arm-msm@...r.kernel.org, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI: qcom-ep: Move controller cleanups to
 qcom_pcie_perst_deassert()

On Wed, Aug 14, 2024 at 05:28:37AM +0900, Krzysztof Wilczyński wrote:
> > Currently, the endpoint cleanup function dw_pcie_ep_cleanup() and EPF
> > deinit notify function pci_epc_deinit_notify() are called during the
> > execution of qcom_pcie_perst_assert() i.e., when the host has asserted
> > PERST#. But quickly after this step, refclk will also be disabled by the
> > host.
> > 
> > All of the Qcom endpoint SoCs supported as of now depend on the refclk from
> > the host for keeping the controller operational. Due to this limitation,
> > any access to the hardware registers in the absence of refclk will result
> > in a whole endpoint crash. Unfortunately, most of the controller cleanups
> > require accessing the hardware registers (like eDMA cleanup performed in
> > dw_pcie_ep_cleanup(), powering down MHI EPF, etc.). So these cleanup
> > functions are currently causing the crash in the endpoint SoC once the
> > host asserts PERST#.
> > 
> > One way to address this issue is by generating the refclk in the endpoint
> > itself and not depending on the host. But that is not always possible as
> > some of the endpoint designs do require the endpoint to consume refclk from
> > the host (as I was told by the Qcom engineers).
> > 
> > So let's fix this crash by moving the controller cleanups to the start of
> > the qcom_pcie_perst_deassert() function. qcom_pcie_perst_deassert() is
> > called whenever the host has deasserted PERST# and it is guaranteed that
> > the refclk would be active at this point. So at the start of this function,
> > the controller cleanup can be performed. Once finished, the rest of the
> > code execution for PERST# deassert can continue as usual.
> 
> Applied to controller/qcom, thank you!
> 
> [1/1] PCI: qcom-ep: Move controller cleanups to qcom_pcie_perst_deassert()
>       https://git.kernel.org/pci/pci/c/6960cdc1ef97

I dropped this for now, looking for a new simpler version without
"cleanup_pending" and a similar change for tegra194 (separate patch).

I think it's still an open question whether both
pci_epc_deinit_notify() and pci_epc_init_notify() are needed, but that
should be separate and I don't think that would fix a crash.

You said this was not strictly v6.11 material, but it does fix a
crash, and it only touches the endpoint driver, so ... it seems like a
possible candidate, especially if we can identify a recent commit that
caused the crash.

Bjorn
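
For illustration only, the ordering described in the quoted patch
description can be modelled in a few lines of standalone C. This is a
sketch, not the driver code: struct ep_model and every model_*() function
are hypothetical stand-ins, the refclk is reduced to a boolean, and a
"crash" is just a counter incremented when a register access happens with
the refclk off. It assumes the cleanup runs unconditionally at the start
of the deassert path, which is only one possible shape for the change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the endpoint; not the real struct qcom_pcie_ep. */
struct ep_model {
	bool refclk_on;   /* refclk supplied by the host */
	bool initialized; /* controller state from the previous link-up */
	int crashes;      /* register accesses attempted without refclk */
	int cleanups;     /* number of cleanup passes that ran */
};

/* Any register access without refclk is counted as a crash. */
static void model_reg_access(struct ep_model *ep)
{
	if (!ep->refclk_on)
		ep->crashes++;
}

/* Stands in for dw_pcie_ep_cleanup() and pci_epc_deinit_notify(). */
static void model_cleanup(struct ep_model *ep)
{
	model_reg_access(ep);
	ep->initialized = false;
	ep->cleanups++;
}

/*
 * Old ordering: cleanup in the PERST# assert path. The model assumes the
 * host has already stopped refclk by the time the cleanup touches registers.
 */
static void model_perst_assert_old(struct ep_model *ep)
{
	ep->refclk_on = false;
	model_cleanup(ep);
}

/* New ordering: the assert path no longer touches controller registers. */
static void model_perst_assert(struct ep_model *ep)
{
	ep->refclk_on = false;
}

/* Host deasserts PERST#: refclk is active, so clean up first, then init. */
static void model_perst_deassert(struct ep_model *ep)
{
	ep->refclk_on = true;
	model_cleanup(ep);    /* deferred cleanup, safe with refclk on */
	model_reg_access(ep); /* rest of the usual deassert sequence */
	ep->initialized = true;
}
```

Running deassert, assert, deassert on a zero-initialized struct ep_model
leaves the crash counter at zero with the new ordering, while swapping in
model_perst_assert_old() increments it once.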
