Message-ID: <20240829164455.ts2j46dfxwp3pa2f@thinkpad>
Date: Thu, 29 Aug 2024 22:14:55 +0530
From: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
To: Bjorn Helgaas <helgaas@...nel.org>
Cc: lpieralisi@...nel.org, kw@...ux.com, bhelgaas@...gle.com,
robh@...nel.org, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-arm-msm@...r.kernel.org,
Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
Subject: Re: [PATCH v2] PCI: qcom-ep: Enable controller resources like PHY
only after refclk is available
On Thu, Aug 29, 2024 at 07:38:08AM -0500, Bjorn Helgaas wrote:
> On Thu, Aug 29, 2024 at 11:07:20AM +0530, Manivannan Sadhasivam wrote:
> > On Wed, Aug 28, 2024 at 03:59:45PM -0500, Bjorn Helgaas wrote:
> > > On Wed, Aug 28, 2024 at 07:31:08PM +0530, Manivannan Sadhasivam wrote:
> > > > qcom_pcie_enable_resources() is called by qcom_pcie_ep_probe() and it
> > > > enables the controller resources like clocks, regulator, PHY. On one of the
> > > > new unreleased Qcom SoC, PHY enablement depends on the active refclk. And
> > > > on all of the supported Qcom endpoint SoCs, refclk comes from the host
> > > > (RC). So calling qcom_pcie_enable_resources() without refclk causes the
> > > > whole SoC crash on the new SoC.
> > > >
> > > > qcom_pcie_enable_resources() is already called by
> > > > qcom_pcie_perst_deassert() when PERST# is deasserted, and refclk is
> > > > available at that time.
> > > >
> > > > Hence, remove the unnecessary call to qcom_pcie_enable_resources() from
> > > > qcom_pcie_ep_probe() to prevent the crash.
> > > >
> > > > Fixes: 869bc5253406 ("PCI: dwc: ep: Fix DBI access failure for drivers requiring refclk from host")
> > > > Tested-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
> > > > Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
> > > > ---
> > > >
> > > > Changes in v2:
> > > >
> > > > - Changed the patch description to mention the crash clearly as suggested by
> > > > Bjorn
> > >
> > > Clearly mentioning the crash as rationale for the change is *part* of
> > > what I was looking for.
> > >
> > > The rest, just as important, is information about what sort of crash
> > > this is, because I hope and suspect the crash is recoverable, and we
> > > *should* recover from it because PERST# may occur at arbitrary times,
> > > so trying to avoid it is never going to be reliable.
> >
> > I did mention 'whole SoC crash' which typically means unrecoverable
> > state as the SoC would crash (not just the driver). On Qcom SoCs,
> > this will also lead the SoC to boot into EDL (Emergency Download)
> > mode so that the users can collect dumps on the crash.
>
> IIUC we're talking about an access to a PHY register, and the access
> requires Refclk from the host. I assume the SoC accesses the register
> by doing an MMIO load. If nothing responds, I assume the SoC would
> take a machine check or similar because there's no data to complete
> the load instruction. So I assume again that the Linux on the SoC
> doesn't know how to recover from such a machine check? If that's the
> scenario, is the machine check unrecoverable in principle, or is it
> potentially recoverable but nobody has done the work to do it? My
> guess would be the latter, because the former would mean that it's
> impossible to build a robust endpoint around this SoC. But obviously
> this is all complete speculation on my part.
>
At least on Qcom SoCs, doing an MMIO read without enabling the resources results
in a NoC (Network On Chip) error, which ends up as an exception to TrustZone.
TrustZone then converts it into a SoC crash so that users can take a crash dump
and analyze why the crash happened.

I know that it may sound strange to developers coming from the x86 world :)
But this NoC error is something NVIDIA has also reported before, so I wouldn't
assume that this is a Qcom-specific issue, but rather one common to SoCs that
depend on refclk from the host.

For building a robust endpoint, SoCs should generate refclk by themselves.

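
To illustrate the ordering discussed in this thread, here is a rough userspace
model. It is only a sketch and not the qcom-ep driver code: the struct, the
function names and the 'crashed' flag are all made up for illustration, and the
flag merely stands in for the NoC error that ends in a SoC crash. The assumption
it encodes is the one from the patch description: refclk is available once the
host deasserts PERST#, and not necessarily at probe time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an endpoint controller; not the real driver state. */
struct ep_model {
	bool refclk_present;    /* supplied by the host (RC) */
	bool resources_enabled; /* clocks, regulator, PHY */
	bool crashed;           /* stands in for NoC error -> SoC crash */
};

/* Touching the PHY without refclk is modelled as a fatal error. */
static int ep_enable_resources(struct ep_model *ep)
{
	assert(ep != NULL);
	if (!ep->refclk_present) {
		ep->crashed = true;
		return -1;
	}
	ep->resources_enabled = true;
	return 0;
}

/* Probe only initializes software state; it does not touch the PHY. */
static int ep_probe(struct ep_model *ep)
{
	assert(ep != NULL);
	ep->refclk_present = false;
	ep->resources_enabled = false;
	ep->crashed = false;
	return 0;
}

/* Host deasserts PERST#; refclk is assumed to be running by then. */
static int ep_perst_deassert(struct ep_model *ep)
{
	assert(ep != NULL);
	ep->refclk_present = true;
	return ep_enable_resources(ep);
}
```

In this model, calling ep_probe() followed by ep_perst_deassert() leaves the
resources enabled without a crash, while calling ep_enable_resources() right
after ep_probe() sets the crashed flag.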
- Mani
--
மணிவண்ணன் சதாசிவம்