lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 21 Jun 2021 16:28:55 +0200
From:   Pali Rohár <pali@...nel.org>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     Bjorn Helgaas <bhelgaas@...gle.com>,
        Kalle Valo <kvalo@...eaurora.org>,
        Toke Høiland-Jørgensen <toke@...hat.com>,
        Marek Behún <kabel@...nel.org>,
        Krzysztof Wilczyński <kw@...ux.com>,
        vtolkm@...il.com, Rob Herring <robh@...nel.org>,
        Ilias Apalodimas <ilias.apalodimas@...aro.org>,
        Thomas Petazzoni <thomas.petazzoni@...tlin.com>,
        linux-pci@...r.kernel.org, ath10k@...ts.infradead.org,
        linux-wireless@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] PCI: Disallow retraining link for Atheros chips on
 non-Gen1 PCIe bridges

On Wednesday 16 June 2021 16:38:19 Bjorn Helgaas wrote:
> On Wed, Jun 02, 2021 at 09:03:02PM +0200, Pali Rohár wrote:
> > On Wednesday 02 June 2021 10:55:59 Bjorn Helgaas wrote:
> > > On Wed, Jun 02, 2021 at 02:08:16PM +0200, Pali Rohár wrote:
> > > > On Tuesday 01 June 2021 19:00:36 Bjorn Helgaas wrote:
> > > 
> > > > > I wonder if this could be restructured as a generic quirk in quirks.c
> > > > > that simply set the bridge's TLS to 2.5 GT/s during enumeration.  Or
> > > > > would the retrain fail even in that case?
> > > > 
> > > > If I understand it correctly then PCIe link is already up when kernel
> > > > starts enumeration. So setting Bridge TLS to 2.5 GT/s does not change
> > > > anything here.
> > > > 
> > > > Moreover it would have side effect that cards which are already set to
> > > > 5+ GT/s would be downgraded to 2.5 GT/s during enumeration and for
> > > > increasing speed would be needed another round of "enumeration" to set a
> > > > new TLS and retrain link again. As TLS affects link only after link goes
> > > > into Recovery state.
> > > > 
> > > > So this would just complicate card enumeration and settings.
> > > 
> > > The current quirk complicates the ASPM code.  I'm hoping that if we
> > > set the bridge's Target Link Speed during enumeration, the link
> > > retrain will "just work" without complicating the ASPM code.
> > > 
> > > An enumeration quirk wouldn't have to set the bridge's TLS to 2.5
> > > GT/s; the quirk would be attached to specific endpoint devices and
> > > could set the bridge's TLS to whatever the endpoint supports.
> > 
> > Now I see what you mean. Yes, I agree this is a good idea and can
> > simplify code. Quirk is not related to ASPM code and basically has
> > nothing with it, just I put it into aspm.c because this is the only
> > place where link retraining was activated.
> > 
> > But with this proposal there is one issue. Some kernel drivers already
> > overwrite PCI_EXP_LNKCTL2_TLS value. So if PCI enumeration code set some
> > value into PCI_EXP_LNKCTL2_TLS bits then drivers can change it and once
> > ASPM will try to retrain link this may cause this issue.
> 
> I guess you mean the amdgpu, radeon, and hfi1 drivers.  They really
> shouldn't be mucking with that stuff anyway.  But they do and are
> unlikely to change because we don't have any good alternative.

Yea, these are examples of such drivers... Maybe it is a good idea to
ask those people why changing PCI_EXP_LNKCTL2_TLS is needed. As these
drivers are often derived from codebase of shared multisystem drivers or
from common documentation, it is possible that original source has this
code as a workaround or common pattern used in other operating systems,
not related to linux...

> One way around that would be to add some quirk code to
> pcie_capability_write_word().  Ugly, but we do have something sort of
> similar in pcie_capability_read_word() already.

Bjorn, do you really want such ugly hack in pcie_capability_write_word?
It is common code used and called from lot of places so it may affect
whole system if in future somebody changes it again...

Or we can let it as is, say that those drivers which are doing it are
buggy and for future try to reduce code which touches registers
PCI_EXP_LNKCTL2_TLS. Good code review or some checkpatch.pl warnings may
prevent introduction of other code which will do it.

> 
> Bjorn

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ