Message-ID: <ac48d142-7aec-4fdd-92a4-6f9bc10a7928@rock-chips.com>
Date: Fri, 18 Jul 2025 09:55:42 +0800
From: Shawn Lin <shawn.lin@...k-chips.com>
To: Geraldo Nascimento <geraldogabriel@...il.com>
Cc: shawn.lin@...k-chips.com, Hugh Cole-Baker <sigmaris@...il.com>,
Lorenzo Pieralisi <lpieralisi@...nel.org>,
Krzysztof Wilczyński <kw@...ux.com>,
Manivannan Sadhasivam <manivannan.sadhasivam@...aro.org>,
Rob Herring <robh@...nel.org>, Bjorn Helgaas <bhelgaas@...gle.com>,
Heiko Stuebner <heiko@...ech.de>, linux-pci@...r.kernel.org,
linux-arm-kernel@...ts.infradead.org, linux-kernel@...r.kernel.org,
linux-rockchip@...ts.infradead.org
Subject: Re: [RFC PATCH v3 2/3] PCI: rockchip-host: Retry link training on
failure without PERST#
Hi Geraldo,
On Wed, 2025/06/11 3:05, Geraldo Nascimento wrote:
> After almost 30 days of battling with RK3399 buggy PCIe on my Rock Pi
> N10 through trial-and-error debugging, I finally got positive results
> with enumeration on the PCI bus for both a Realtek 8111E NIC and a
> Samsung PM981a SSD.
>
> The NIC was connected to a M.2->PCIe x4 riser card and it would get
> stuck on Polling.Compliance, without breaking electrical idle on the
> Host RX side. The Samsung PM981a SSD is directly connected to M.2
> connector and that SSD is known to be quirky (OEM... no support)
> and non-functional on the RK3399 platform.
>
> The Samsung SSD was even worse than the NIC - it would get stuck on
> Detect.Active like a bricked card, even though it was fully functional
> via USB adapter.
>
> It seems both devices benefit from retrying Link Training if - big if
> here - PERST# is not toggled during retry.
>
I haven't seen this error before, which is surprising given that the
RTL8111 NIC is widely used by customers.
Could you please try the following?
[1] apply your patch 3 first
[2] apply below changes
--- a/drivers/pci/controller/pcie-rockchip-host.c
+++ b/drivers/pci/controller/pcie-rockchip-host.c
@@ -314,7 +314,7 @@ static int rockchip_pcie_host_init_port(struct rockchip_pcie *rockchip)
rockchip_pcie_write(rockchip, PCIE_CLIENT_LINK_TRAIN_ENABLE,
PCIE_CLIENT_CONFIG);
- msleep(PCIE_T_PVPERL_MS);
+ msleep(500);
gpiod_set_value_cansleep(rockchip->perst_gpio, 1);
msleep(PCIE_RESET_CONFIG_WAIT_MS);
@@ -322,7 +322,7 @@ static int rockchip_pcie_host_init_port(struct rockchip_pcie *rockchip)
/* 500ms timeout value should be enough for Gen1/2 training */
err = readl_poll_timeout(rockchip->apb_base +
PCIE_CLIENT_BASIC_STATUS1,
status, PCIE_LINK_UP(status), 20,
- 500 * USEC_PER_MSEC);
+ 5000 * USEC_PER_MSEC);
if (err) {
dev_err(dev, "PCIe link training gen1 timeout!\n");
goto err_power_off_phy;
@@ -951,6 +951,8 @@ static int rockchip_pcie_probe(struct platform_device *pdev)
if (err)
return err;
+ gpiod_set_value_cansleep(rockchip->perst_gpio, 0);
+
err = rockchip_pcie_set_vpcie(rockchip);
if (err) {
dev_err(dev, "failed to set vpcie regulator\n");
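To illustrate why the longer poll timeout in the test changes above might matter for a slow-training device, here is a minimal user-space sketch of a poll-until-link-up loop. It is only a model, not driver code: struct fake_link, fake_link_up() and poll_link_up() are hypothetical names, the clock is simulated, and the return value is a plain -1 where the kernel's readl_poll_timeout() would return -ETIMEDOUT.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the link status register: the link
 * reports "up" once the simulated clock reaches link_up_at_us.
 */
struct fake_link {
	uint64_t now_us;        /* simulated current time */
	uint64_t link_up_at_us; /* time at which training completes */
};

static bool fake_link_up(const struct fake_link *l)
{
	return l->now_us >= l->link_up_at_us;
}

/*
 * Poll until the link is up or timeout_us has elapsed, advancing the
 * simulated clock by sleep_us between reads.
 * Returns 0 on success, -1 on timeout.
 */
static int poll_link_up(struct fake_link *l, uint64_t sleep_us,
			uint64_t timeout_us)
{
	uint64_t deadline = l->now_us + timeout_us;

	for (;;) {
		if (fake_link_up(l))
			return 0;
		if (l->now_us >= deadline)
			return -1;
		l->now_us += sleep_us;
	}
}
```

In this model, a device that needs 1.2 s to train fails with a 500 ms budget and succeeds with a 5000 ms budget, which is the kind of difference the experiment is meant to reveal.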
> For retry to work, flow must be exactly as handled by present patch,
> that is, we must cut power, disable the clocks, then re-enable
> both clocks and power regulators and go through initialization
> without touching PERST#. Then quirky devices are able to successfully
> enumerate.
>
> No functional change intended for already working devices.
>
> Signed-off-by: Geraldo Nascimento <geraldogabriel@...il.com>
> ---
> drivers/pci/controller/pcie-rockchip-host.c | 47 ++++++++++++++++++---
> 1 file changed, 40 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/pci/controller/pcie-rockchip-host.c b/drivers/pci/controller/pcie-rockchip-host.c
> index 2a1071cd3241..67b3b379d277 100644
> --- a/drivers/pci/controller/pcie-rockchip-host.c
> +++ b/drivers/pci/controller/pcie-rockchip-host.c
> @@ -338,11 +338,14 @@ static int rockchip_pcie_set_vpcie(struct rockchip_pcie *rockchip)
> static int rockchip_pcie_host_init_port(struct rockchip_pcie *rockchip)
> {
> struct device *dev = rockchip->dev;
> - int err, i = MAX_LANE_NUM;
> + int err, i = MAX_LANE_NUM, is_reinit = 0;
> u32 status;
>
> - gpiod_set_value_cansleep(rockchip->perst_gpio, 0);
> + if (!is_reinit) {
> + gpiod_set_value_cansleep(rockchip->perst_gpio, 0);
> + }
>
> +reinit:
> err = rockchip_pcie_init_port(rockchip);
> if (err)
> return err;
> @@ -369,16 +372,46 @@ static int rockchip_pcie_host_init_port(struct rockchip_pcie *rockchip)
> rockchip_pcie_write(rockchip, PCIE_CLIENT_LINK_TRAIN_ENABLE,
> PCIE_CLIENT_CONFIG);
>
> - msleep(PCIE_T_PVPERL_MS);
> - gpiod_set_value_cansleep(rockchip->perst_gpio, 1);
> -
> - msleep(PCIE_T_RRS_READY_MS);
> + if (!is_reinit) {
> + msleep(PCIE_T_PVPERL_MS);
> + gpiod_set_value_cansleep(rockchip->perst_gpio, 1);
> + msleep(PCIE_T_RRS_READY_MS);
> + }
>
> /* 500ms timeout value should be enough for Gen1/2 training */
> err = readl_poll_timeout(rockchip->apb_base + PCIE_CLIENT_BASIC_STATUS1,
> status, PCIE_LINK_UP(status), 20,
> 500 * USEC_PER_MSEC);
> - if (err) {
> +
> + if (err && !is_reinit) {
> + while (i--)
> + phy_power_off(rockchip->phys[i]);
> + i = MAX_LANE_NUM;
> + while (i--)
> + phy_exit(rockchip->phys[i]);
> + i = MAX_LANE_NUM;
> + is_reinit = 1;
> + dev_dbg(dev, "Will reinit PCIe without toggling PERST#");
> + if (!IS_ERR(rockchip->vpcie12v))
> + regulator_disable(rockchip->vpcie12v);
> + if (!IS_ERR(rockchip->vpcie3v3))
> + regulator_disable(rockchip->vpcie3v3);
> + regulator_disable(rockchip->vpcie1v8);
> + regulator_disable(rockchip->vpcie0v9);
> + rockchip_pcie_disable_clocks(rockchip);
> + err = rockchip_pcie_enable_clocks(rockchip);
> + if (err)
> + return err;
> + err = rockchip_pcie_set_vpcie(rockchip);
> + if (err) {
> + dev_err(dev, "failed to set vpcie regulator\n");
> + rockchip_pcie_disable_clocks(rockchip);
> + return err;
> + }
> + goto reinit;
> + }
> +
> + else if (err) {
> dev_err(dev, "PCIe link training gen1 timeout!\n");
> goto err_power_off_phy;
> }
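For readers following the control flow of the quoted patch, the retry logic can be modeled roughly as below. This is a simplified user-space sketch under stated assumptions, not the driver itself: struct mock_port, mock_train() and init_port_with_retry() are hypothetical, the PHY, regulator and clock handling is collapsed into a single counter, and error codes are reduced to 0 and -1.

```c
#include <stdbool.h>

/* Hypothetical mock of the host port; it only counts events. */
struct mock_port {
	int perst_toggles;       /* times PERST# was asserted and released */
	int power_cycles;        /* times PHYs, regulators and clocks were cycled */
	int train_attempts;      /* link training attempts so far */
	int succeeds_on_attempt; /* 1-based attempt that succeeds; 0 = never */
};

/* Simulated link training: succeeds from the configured attempt onward. */
static bool mock_train(struct mock_port *p)
{
	p->train_attempts++;
	return p->succeeds_on_attempt != 0 &&
	       p->train_attempts >= p->succeeds_on_attempt;
}

/*
 * First pass toggles PERST# and trains. On failure, power-cycle once
 * and train again without touching PERST#. A second failure is final.
 * Returns 0 on link up, -1 on timeout.
 */
static int init_port_with_retry(struct mock_port *p)
{
	bool is_reinit = false;

	for (;;) {
		if (!is_reinit)
			p->perst_toggles++;

		if (mock_train(p))
			return 0;

		if (is_reinit)
			return -1;

		/* Stands in for: PHYs off, regulators off, clocks off, then back on. */
		p->power_cycles++;
		is_reinit = true;
	}
}
```

With this model, a device that trains only on the second attempt ends up with one PERST# toggle, one power cycle and two training attempts, while an already working device sees no power cycle at all.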