Message-ID: <20240829053720.gmblrai2hkd73el3@thinkpad>
Date: Thu, 29 Aug 2024 11:07:20 +0530
From: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: lpieralisi@...nel.org, kw@...ux.com, bhelgaas@...gle.com,
	robh@...nel.org, linux-pci@...r.kernel.org,
	linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
	Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
Subject: Re: [PATCH v2] PCI: qcom-ep: Enable controller resources like PHY
 only after refclk is available

On Wed, Aug 28, 2024 at 03:59:45PM -0500, Bjorn Helgaas wrote:
> On Wed, Aug 28, 2024 at 07:31:08PM +0530, Manivannan Sadhasivam wrote:
> > qcom_pcie_enable_resources() is called by qcom_pcie_ep_probe() and it
> > enables the controller resources like clocks, regulator, PHY. On one of the
> > new unreleased Qcom SoC, PHY enablement depends on the active refclk. And
> > on all of the supported Qcom endpoint SoCs, refclk comes from the host
> > (RC). So calling qcom_pcie_enable_resources() without refclk causes the
> > whole SoC crash on the new SoC.
> > 
> > qcom_pcie_enable_resources() is already called by
> > qcom_pcie_perst_deassert() when PERST# is deasserted, and refclk is
> > available at that time.
> > 
> > Hence, remove the unnecessary call to qcom_pcie_enable_resources() from
> > qcom_pcie_ep_probe() to prevent the crash.
> > 
> > Fixes: 869bc5253406 ("PCI: dwc: ep: Fix DBI access failure for drivers requiring refclk from host")
> > Tested-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
> > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
> > ---
> > 
> > Changes in v2:
> > 
> > - Changed the patch description to mention the crash clearly as suggested by
> >   Bjorn
> 
> Clearly mentioning the crash as rationale for the change is *part* of
> what I was looking for.
> 
> The rest, just as important, is information about what sort of crash
> this is, because I hope and suspect the crash is recoverable, and we
> *should* recover from it because PERST# may occur at arbitrary times,
> so trying to avoid it is never going to be reliable.
> 

I did mention 'whole SoC crash', which typically means an unrecoverable state,
since the whole SoC crashes (not just the driver). On Qcom SoCs, this also
causes the SoC to boot into EDL (Emergency Download) mode so that users can
collect crash dumps.

As I mentioned in the earlier thread, I don't know how to avoid this crash
entirely (the host asserting PERST# at random times) while still depending on
refclk from the host. The best we can do is, at the time of PERST# assert,
notify the EPF driver to cancel all work and not touch any registers that
require an active refclk, which is what the driver currently does.
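
To illustrate the idea, here is a minimal userspace sketch of that gating. It
is not the qcom-ep driver code: every name in it (ep_model, model_perst_assert,
model_reg_write, and so on) is made up for illustration, and it assumes the
refclk state can be tracked with a simple flag. It also leaves a window between
the check and the access, so it only narrows the problem rather than removing
it.

```c
#include <stdbool.h>

/* Hypothetical model; none of these names exist in the qcom-ep driver. */
struct ep_model {
	bool refclk_active;	/* true only between PERST# deassert and assert */
	bool work_pending;	/* stands in for queued EPF work */
	unsigned int reg;	/* stands in for a register that needs refclk */
};

/* PERST# deassert: the host is driving refclk, so accesses are allowed. */
static void model_perst_deassert(struct ep_model *ep)
{
	ep->refclk_active = true;
}

/* PERST# assert: drop pending work first, then mark refclk as gone. */
static void model_perst_assert(struct ep_model *ep)
{
	ep->work_pending = false;
	ep->refclk_active = false;
}

/* Guarded access: refuse to touch the register when refclk is absent. */
static int model_reg_write(struct ep_model *ep, unsigned int val)
{
	if (!ep->refclk_active)
		return -1;

	ep->reg = val;
	return 0;
}
```

In this model, a write attempted before PERST# deassert or after PERST# assert
returns -1 and leaves the register untouched.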

I'm also working on SRIS support, which will allow the endpoint to generate
its own refclk, and I plan to make that the default working mode. Users could
still opt for non-SRIS mode (the current mode, which requires refclk from the
host) through DT.

- Mani

-- 
மணிவண்ணன் சதாசிவம்
