[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID:
<BY3PR18MB46738F5857319F9637FA5050A7F12@BY3PR18MB4673.namprd18.prod.outlook.com>
Date: Fri, 7 Feb 2025 18:46:22 +0000
From: Wilson Ding <dingwei@...vell.com>
To: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
"cassel@...nel.org" <cassel@...nel.org>
CC: Bjorn Helgaas <helgaas@...nel.org>,
"lpieralisi@...nel.org"
<lpieralisi@...nel.org>,
"thomas.petazzoni@...tlin.com"
<thomas.petazzoni@...tlin.com>,
"kw@...ux.com" <kw@...ux.com>, "robh@...nel.org" <robh@...nel.org>,
"bhelgaas@...gle.com"
<bhelgaas@...gle.com>,
"linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>,
"linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>,
Sanghoon Lee <salee@...vell.com>
Subject: RE: [EXTERNAL] Re: [PATCH 1/1] PCI: armada8k: Add link-down handle
> -----Original Message-----
> From: Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>
> Sent: Friday, February 7, 2025 9:58 AM
> To: Wilson Ding <dingwei@...vell.com>; cassel@...nel.org
> Cc: Bjorn Helgaas <helgaas@...nel.org>; lpieralisi@...nel.org;
> thomas.petazzoni@...tlin.com; kw@...ux.com; robh@...nel.org;
> bhelgaas@...gle.com; linux-pci@...r.kernel.org; linux-arm-
> kernel@...ts.infradead.org; linux-kernel@...r.kernel.org; Sanghoon Lee
> <salee@...vell.com>
> Subject: [EXTERNAL] Re: [PATCH 1/1] PCI: armada8k: Add link-down handle
>
> + Niklas (who was interested in link down handling) On Sat, Feb 01, 2025
> + at 11: 05: 56PM +0000, Wilson Ding wrote: > > On November 13, 2024 3:
> + 02: 55 AM GMT+05: 30, Bjorn Helgaas > > <mailto: helgaas@ kernel. org>
> + wrote: > >
>
> + Niklas (who was interested in link down handling)
>
> On Sat, Feb 01, 2025 at 11:05:56PM +0000, Wilson Ding wrote:
> > > On November 13, 2024 3:02:55 AM GMT+05:30, Bjorn Helgaas
> > > <mailto:helgaas@...nel.org> wrote:
> > > >In subject:
> > > >
> > > > PCI: armada8k: Add link-down handling
> > > >
> > > >On Mon, Nov 11, 2024 at 10:48:13PM -0800, Jenishkumar Maheshbhai
> > > Patel wrote:
> > > >> In PCIE ISR routine caused by RST_LINK_DOWN we schedule work to
> > > >> handle the link-down procedure.
> > > >> Link-down procedure will:
> > > >> 1. Remove PCIe bus
> > > >> 2. Reset the MAC
> > > >> 3. Reconfigure link back up
> > > >> 4. Rescan PCIe bus
> > > >
> > > >s/PCIE/PCIe/
> > > >
> > > >Rewrap to fill 75 columns.
> > > >
> > > >I assume this basically removes a Root Port (and the hierarchy
> > > >below
> > > >it) if the link goes down, and then resets the MAC and tries to
> > > >bring up the link and enumerate the hierarchy again.
> > > >
> > > >No other drivers do this, so why does armada8k need it? Is this to
> > > >work around some unreliable link?
> > >
> > > Certainly Qcom IPs have this same feature and I was also looking to
> > > implement it. But the link down should not be handled by this in the
> controller driver.
> > >
> > > Instead, it should be tied to bus reset in the core and the reset
> > > should be done through a callback implemented in the controller
> > > drivers. This way, the reset cannot happen in the back of PCI core and client
> drivers.
> > >
> > > That said, the Link down IRQ received by this driver should also be
> > > propagated back to the PCI core and the core should then call the
> > > callback to reset the bus that I mentioned above.
> > >
> >
> > It's more than a work-around for the unreliable link. A few customers
> > may have such application - independent power supply to the device
> > with dedicated reset GPIO to #PRST. In this way, the power cycle and
> > warm reset of RC and EP won't have impact on each other. However, it
> > may lead into the PCI driver not aware of the link down when an unexpected
> power down or reset occurs on the device.
> > We cannot assume the link will be recovered soon. The worse thing is
> > the driver may continue access to the device, which may hang the bus.
> > Since the device is no longer present on the bus, it's better to
> > remove it. Besides, in order to bring up the link, the only way is to
> > reset the MAC, which starts over the state machine of LTSSM.
> >
> > Well, we also noticed that there is no other driver that did this. I
> > agree it is not necessary if the power cycle or warm reset of the
> > device is done gracefully. The user can remove the device prior to the
> > power cycle/reset. And do the rescan after the link is recovered. However,
> the unexpected power down is still possible.
> > Please enlighten me if there is any better approach to handle such
> > unexpected link down.
> >
>
> There is no issue in retraining the link. My concern is that, the retrain should
> not happen autonomously in the controller driver. PCI core should be made
> informed of it. More below.
>
Do you mean
- pass the link down/up events to PCI core
- remove the device or hierarchy by PCI core upon link down
- initiate the link retraining in PCI core by calling the platform retrain callbacks
- rescan the bus once link is recovered
> > In the meanwhile, I am quite interested the callback implementation
> > suggested by Mani. But I am sure if we have such infrastructure right
> > there. Mani, could you please elaborate a bit more, or is there any
> > examples in the existing code and patches.
> >
>
> There is no such implementation available right now. This idea is on my mind
> for quite some time, but never had time to do it. Maybe this gives me
> motivation to do so.
>
> Niklas: Did you get a chance to look into this? Else, let me know. I'll take a stab
> at it.
>
> - Mani
>
> --
> மணிவண்ணன் சதாசிவம்
Powered by blists - more mailing lists