[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <zuiq3b2rsixymtjr3xzrb26clikvlja62wgj65umnse4kuk75c@x5qan73ispxe>
Date: Mon, 23 Jun 2025 05:29:46 -0600
From: Manivannan Sadhasivam <mani@...nel.org>
To: Geraldo Nascimento <geraldogabriel@...il.com>
Cc: linux-rockchip@...ts.infradead.org,
Hugh Cole-Baker <sigmaris@...il.com>, Shawn Lin <shawn.lin@...k-chips.com>,
Lorenzo Pieralisi <lpieralisi@...nel.org>, Krzysztof Wilczyński <kw@...ux.com>,
Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>, Rob Herring <robh@...nel.org>,
Bjorn Helgaas <bhelgaas@...gle.com>, Heiko Stuebner <heiko@...ech.de>, linux-pci@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [RFC PATCH v3 2/3] PCI: rockchip-host: Retry link training on
failure without PERST#
On Tue, Jun 10, 2025 at 04:05:40PM -0300, Geraldo Nascimento wrote:
> After almost 30 days of battling with RK3399 buggy PCIe on my Rock Pi
> N10 through trial-and-error debugging, I finally got positive results
> with enumeration on the PCI bus for both a Realtek 8111E NIC and a
> Samsung PM981a SSD.
>
> The NIC was connected to a M.2->PCIe x4 riser card and it would get
> stuck on Polling.Compliance, without breaking electrical idle on the
> Host RX side. The Samsung PM981a SSD is directly connected to M.2
> connector and that SSD is known to be quirky (OEM... no support)
> and non-functional on the RK3399 platform.
>
> The Samsung SSD was even worse than the NIC - it would get stuck on
> Detect.Active like a bricked card, even though it was fully functional
> via USB adapter.
>
> It seems both devices benefit from retrying Link Training if - big if
> here - PERST# is not toggled during retry.
>
> For retry to work, flow must be exactly as handled by present patch,
> that is, we must cut power, disable the clocks, then re-enable
> both clocks and power regulators and go through initialization
> without touching PERST#. Then quirky devices are able to sucessfully
> enumerate.
>
This sounds weird. PERST# is just an indication to the device that the power and
refclk are applied or going to be removed. The devices uses PERST# to prepare
for the power removal during assert and start functioning after deassert.
It looks like the PERST# polarity is inverted in your case. Could you please
change the 'ep-gpios' polarity to GPIO_ACTIVE_LOW and see if it fixes the issue
without this patch?
If that didn't work, could you please drop the 'ep-gpios' property and check?
> No functional change intended for already working devices.
>
> Signed-off-by: Geraldo Nascimento <geraldogabriel@...il.com>
> ---
> drivers/pci/controller/pcie-rockchip-host.c | 47 ++++++++++++++++++---
> 1 file changed, 40 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-host.c b/drivers/pci/controller/pcie-rockchip-host.c
> index 2a1071cd3241..67b3b379d277 100644
> --- a/drivers/pci/controller/pcie-rockchip-host.c
> +++ b/drivers/pci/controller/pcie-rockchip-host.c
> @@ -338,11 +338,14 @@ static int rockchip_pcie_set_vpcie(struct rockchip_pcie *rockchip)
> static int rockchip_pcie_host_init_port(struct rockchip_pcie *rockchip)
> {
> struct device *dev = rockchip->dev;
> - int err, i = MAX_LANE_NUM;
> + int err, i = MAX_LANE_NUM, is_reinit = 0;
> u32 status;
>
> - gpiod_set_value_cansleep(rockchip->perst_gpio, 0);
> + if (!is_reinit) {
> + gpiod_set_value_cansleep(rockchip->perst_gpio, 0);
> + }
>
> +reinit:
So this reinit part only skips the PERST# assert, but calls
rockchip_pcie_init_port() which resets the Root Port including PHY. I don't
think it is safe to do it if PERST# is wired.
- Mani
--
மணிவண்ணன் சதாசிவம்
Powered by blists - more mailing lists