lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.21.2209061237270.2275@angie.orcam.me.uk>
Date:   Sat, 17 Sep 2022 13:04:11 +0100 (BST)
From:   "Maciej W. Rozycki" <macro@...am.me.uk>
To:     Bjorn Helgaas <helgaas@...nel.org>
cc:     Bjorn Helgaas <bhelgaas@...gle.com>, Stefan Roese <sr@...x.de>,
        Jim Wilson <wilson@...iptree.org>,
        David Abdurachmanov <david.abdurachmanov@...il.com>,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] pci: Work around ASMedia ASM2824 PCIe link training
 failures

Hi Bjorn,

On Thu, 21 Apr 2022, Bjorn Helgaas wrote:

> Note the subject line conventions.  I usually fix things like that
> silently, but it saves me time if I don't have to.

 Fixed.  Not sure why I missed it, I usually capitalise subsystem names 
properly and check the usual convention beforehand anyway.  Sorry about 
it.

> > It has been observed with a downstream port of the ASMedia ASM2824 Gen 3 
> > switch wired to the upstream port of the Pericom PI7C9X2G304 Gen 2 
> > switch, using a Delock Riser Card PCI Express x1 > 2 x PCIe x1 device, 
> > P/N 41433, wired to a SiFive HiFive Unmatched board.  In this setup the 
> > switches are supposed to negotiate the link speed of preferably 5.0GT/s, 
> > falling back to 2.5GT/s.
> 
> Have you found any reports of issues other than on the SiFive HiFive
> Unmatched?  I poked around a little and didn't see any.  I considered
> suggesting that this go in arch/riscv unless we see the problem
> elsewhere, but I guess it's maybe not worth that.

 I haven't seen any report, but it doesn't mean nobody has triggered it.  

 It shouldn't matter anyway.  Both devices exist in the form of an option 
card each, sold under different brands, so if someone is creative enough 
and has such a need, then they can use them together in any system that 
has PCIe connectivity.

 The ASMedia ASM2824 switch is used with the StarTech PEX8M2E2 dual M.2 
M-Key adapter.  Also sold under the Ableconn brand as PEXM2-130.  And M.2 
M-Key to PCIe slot adapters are widely available.

 The Pericom PI7C9X2G304 switch is used with the Delock 41433 dual PCIe 
slot adapter.  Also sold under the SYBA IOCrest brand as SI-PEX60016.  
They have both been withdrawn AFAICT now, so availability may vary though.

 Therefore limiting the workaround to arch/riscv doesn't seem a terribly 
good idea to me.  You can plug these option cards into any system.

 Also the nature of the erratum is not fully understood and while it 
cannot trigger in the reverse configuration with these specific ICs (i.e. 
where the PI7C9X2G304 is upstream and the ASM2824 is downstream) because 
PI7C9X2G304's downstream ports are 2.5GT/s only, I suspect it may with a 
combination of other ASMedia and Pericom devices in either hierarchy order 
or possibly other switches.  Such devices continue being widely available.

 Therefore after some thinking about the possible consequences I chose to 
mimic as far as possible the approach I already took with the analogous 
U-Boot workaround and relax the vendor:device ID check so that an attempt 
to retrain is applied to all stuck devices that have a downstream link.  

 Since unlike U-Boot we cannot busy-loop with interrupts disabled polling 
the Link Training bit the workaround is limited to devices capable of 
reporting Data Link Layer Link Active status.  According to the PCIe spec 
that should only exclude some 5.0GT/s devices, because all 8.0GT/s and 
higher speed devices are required to support the feature (though I can 
imagine that reality might disagree, just as this issue shouldn't have 
been made possible either).

 Then for safety if retraining succeeds at 2.5GT/s, then the speed cap is 
only removed for known-good devices, currently the ASM2824 only.

> Interesting that there's a PI7C9X2G304 involved, since
> quirk_enable_clear_retrain_link() is also for Pericom devices.
> But that relates to Downstream Ports, and the PI7C9X2G304 is the
> Upstream Port in this case, so I suppose it's just coincidental.

 It's unrelated however in that it's merely a Retrain Link bit erratum 
rather than an issue with initial training.  Thank you for pointing this 
out though as I had to factor this in in my now more generic updated code.

> >  Do you need any further information beyond what I provided in: 
> > <https://lore.kernel.org/lkml/alpine.DEB.2.21.2203022037020.56670@angie.orcam.me.uk/>?
> 
> The commit log has more detail than necessary, so the specifics of
> what the quirk does (basically the last paragraph) get lost.  This
> link could be moved to the commit log for the background.  You
> previously included a URL for a u-boot change; can you include that
> URL as well?

 Sure, though background information can be easily lost and is best kept 
together with the change itself.  Since the necessary details are already 
in the comment at `pcie_downstream_link_retrain' though I've trimmed the 
commit message now as you requested.

> > +static void pcie_downstream_link_retrain(struct pci_dev *dev)
> > +{
> > +	u16 lnksta, lnkctl2;
> > +	u8 pos;
> > +
> > +	pos = pci_find_capability(dev, PCI_CAP_ID_EXP);
> > +	WARN_ON(!pos);
> 
> Doesn't seem worth warning about to me.

 And not relevant in v5 anymore where a check for the device being PCIe 
is used instead, thus necessarily removed.

> > +	if (!pos)
> > +		return;
> > +
> > +	pci_read_config_word(dev, pos + PCI_EXP_LNKCTL2, &lnkctl2);
> > +	pci_read_config_word(dev, pos + PCI_EXP_LNKSTA, &lnksta);
> 
> Use pci_is_pcie(), pcie_capability_read_word(), etc.

 Right, I wasn't aware about these helpers.  Fixed.

> > +		pci_info(dev, "retraining %s!\n",
> > +			 lnksta & PCI_EXP_LNKSTA_DLLLA ?
> > +			 "succeeded" : "failed");
> 
> If retraining failed, I think we should just return here.  We will
> skip the code below anyway, but this would be more obvious:
> 
>   if (!(lnksta & PCI_EXP_LNKSTA_DLLLA)) {
>     pci_inf(..., "failed");
>     return;
>   }
> 
> I don't think it's really necessary to log success.

 Fine with me, applied, removing the ellipsis from the preceding message 
accordingly.

> > +		pci_info(dev, "removing 2.5GT/s downstream link speed restriction\n");
> > +		pci_read_config_word(dev, pos + PCI_EXP_LNKCTL, &lnkctl);
> > +		pci_read_config_dword(dev, pos + PCI_EXP_LNKCAP, &lnkcap);
> > +		lnkctl |= PCI_EXP_LNKCTL_RL;
> > +		lnkctl2 &= ~PCI_EXP_LNKCTL2_TLS;
> > +		lnkctl2 |= lnkcap & PCI_EXP_LNKCAP_SLS;
> 
> We rely on PCI_EXP_LNKCAP_SLS and PCI_EXP_LNKCTL2_TLS having the same
> encoding in the same bits.  Enough to force a reviewer to
> double-check, but it looks like it *is* safe.

 It is the intent of the spec AFAICT and all these fields (including 
PCI_EXP_LNKSTA_CLS too) refer to the Supported Link Speeds Vector 
(PCI_EXP_LNKCAP2_SLS).  Thanks for double-checking.

 Thank you for your review.  I have posted v5 now.  Since the move to a 
generic quirk required changes elsewhere this has now become a patch 
series including 4 changes total.  I have retained the original subject 
verbatim with the cover letter so as to give a chance to mail user agents 
to show the new series next to older revisions of the original patch (mine 
does).

  Maciej

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ