lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20230609173940.GA1252506@bhelgaas>
Date:   Fri, 9 Jun 2023 12:39:40 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Siddharth Vadapalli <s-vadapalli@...com>
Cc:     tjoseph@...ence.com, lpieralisi@...nel.org, robh@...nel.org,
        kw@...ux.com, bhelgaas@...gle.com, nadeem@...ence.com,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
        linux-arm-kernel@...ts.infradead.org, vigneshr@...com, srk@...com,
        nm@...com
Subject: Re: [PATCH v3] PCI: cadence: Fix Gen2 Link Retraining process

On Wed, Jun 07, 2023 at 02:44:27PM +0530, Siddharth Vadapalli wrote:
> The Link Retraining process is initiated to account for the Gen2 defect in
> the Cadence PCIe controller in J721E SoC. The errata corresponding to this
> is i2085, documented at:
> https://www.ti.com/lit/er/sprz455c/sprz455c.pdf
> 
> The existing workaround implemented for the errata waits for the Data Link
> initialization to complete and assumes that the link retraining process
> at the Physical Layer has completed. However, it is possible that the
> Physical Layer training might be ongoing as indicated by the
> PCI_EXP_LNKSTA_LT bit in the PCI_EXP_LNKSTA register.
> 
> Fix the existing workaround, to ensure that the Physical Layer training
> has also completed, in addition to the Data Link initialization.
> 
> Fixes: 4740b969aaf5 ("PCI: cadence: Retrain Link to work around Gen2 training defect")
> Signed-off-by: Siddharth Vadapalli <s-vadapalli@...com>
> Reviewed-by: Vignesh Raghavendra <vigneshr@...com>
> ---
> 
> Hello,
> 
> This patch is based on linux-next tagged next-20230606.
> 
> v2:
> https://lore.kernel.org/r/20230315070800.1615527-1-s-vadapalli@ti.com/
> Changes since v2:
> - Merge the cdns_pcie_host_training_complete() function with the
>   cdns_pcie_host_wait_for_link() function, as suggested by Bjorn
>   for the v2 patch.
> - Add dev_err() to notify when Link Training fails, since this is a
>   fatal error and proceeding from this point will almost always crash
>   the kernel.
> 
> v1:
> https://lore.kernel.org/r/20230102075656.260333-1-s-vadapalli@ti.com/
> Changes since v1:
> - Collect Reviewed-by tag from Vignesh Raghavendra.
> - Rebase on next-20230315.
> 
> Regards,
> Siddharth.
> 
>  .../controller/cadence/pcie-cadence-host.c    | 20 +++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/drivers/pci/controller/cadence/pcie-cadence-host.c b/drivers/pci/controller/cadence/pcie-cadence-host.c
> index 940c7dd701d6..70a5f581ff4f 100644
> --- a/drivers/pci/controller/cadence/pcie-cadence-host.c
> +++ b/drivers/pci/controller/cadence/pcie-cadence-host.c
> @@ -12,6 +12,8 @@
>  
>  #include "pcie-cadence.h"
>  
> +#define LINK_RETRAIN_TIMEOUT HZ
> +
>  static u64 bar_max_size[] = {
>  	[RP_BAR0] = _ULL(128 * SZ_2G),
>  	[RP_BAR1] = SZ_2G,
> @@ -80,8 +82,26 @@ static struct pci_ops cdns_pcie_host_ops = {
>  static int cdns_pcie_host_wait_for_link(struct cdns_pcie *pcie)
>  {
>  	struct device *dev = pcie->dev;
> +	unsigned long end_jiffies;
> +	u16 link_status;
>  	int retries;
>  
> +	/* Wait for link training to complete */
> +	end_jiffies = jiffies + LINK_RETRAIN_TIMEOUT;
> +	do {
> +		link_status = cdns_pcie_rp_readw(pcie, CDNS_PCIE_RP_CAP_OFFSET + PCI_EXP_LNKSTA);
> +		if (!(link_status & PCI_EXP_LNKSTA_LT))
> +			break;
> +		usleep_range(0, 1000);
> +	} while (time_before(jiffies, end_jiffies));
> +
> +	if (!(link_status & PCI_EXP_LNKSTA_LT)) {
> +		dev_info(dev, "Link training complete\n");
> +	} else {
> +		dev_err(dev, "Fatal! Link training incomplete\n");
> +		return -ETIMEDOUT;
> +	}

Can I have a brown paper bag, please?  I totally blew it here, and I'm
sorry.

You took my advice by combining this with the existing
cdns_pcie_host_wait_for_link(), but I think my advice was poor because
(a) now this additional wait is not clearly connected with the
erratum, and (b) it affects devices that don't have the erratum.

IIUC, this is all part of a workaround for the i2085 erratum.  The
original workaround, 4740b969aaf5 ("PCI: cadence: Retrain Link to work
around Gen2 training defect"), added this:

  if (!ret && rc->quirk_retrain_flag)
    ret = cdns_pcie_retrain(pcie);

I think the wait for link train to complete should also be in
cdns_pcie_retrain() so it's clearly connected with the quirk, which
also means we'd only do the wait for devices with the erratum.

Which is EXACTLY what your first patch did, and I missed it.  I am
very sorry.  I guess maybe I thought cdns_pcie_retrain() was a
general-purpose thing, but in fact it's only used for this quirk.

Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ