linux-kernel - Re: [PATCH 1/1] PCI: Use the correct bit in Link Training not active check

Message-ID: <c4fe9080-245f-7089-84c1-bb47dcf2cd83@linux.intel.com>
Date: Thu, 14 Mar 2024 13:39:24 +0200 (EET)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: "Maciej W. Rozycki" <macro@...am.me.uk>
cc: Bjorn Helgaas <bhelgaas@...gle.com>, linux-pci@...r.kernel.org, 
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/1] PCI: Use the correct bit in Link Training not active
 check

On Wed, 6 Mar 2024, Ilpo Järvinen wrote:

> On Wed, 6 Mar 2024, Maciej W. Rozycki wrote:
> > On Mon, 4 Mar 2024, Ilpo Järvinen wrote:
> > 
> > > > > Since waiting for Data Link Layer Link Active bit is only used for the
> > > > > Target Speed quirk, this only impacts the case when the quirk attempts
> > > > > to restore speed to higher than 2.5 GT/s (The link is Up at that point
> > > > > so pcie_retrain_link() will fail).
> > > > 
> > > >  NAK.  It's used for both clamping and unclamping and it will break the 
> > > > workaround, because the whole point there is to wait until DLLA has been 
> > > > set.  Using LT is not reliable because it will oscillate in the failure 
> > > > case and seeing the bit clear does not mean link has been established.  
> > > 
> > > In pcie_retrain_link(), there are two calls into 
> > > pcie_wait_for_link_status() and the second one of them is meant to 
> > > implement the link-has-been-established check.
> > > 
> > > > The first wait call comes from e7e39756363a ("PCI/ASPM: Avoid link 
> > > > retraining race") and is just to ensure the link is not undergoing 
> > > > retraining, so that the latest configuration is captured as required by 
> > > > the implementation note. LT being cleared is exactly what is wanted for that 
> > > check because it means that any earlier retraining has ended (a new one 
> > > might be starting but that doesn't matter, we saw it cleared so the new 
> > > configuration should be in effect for that instance of link retraining).
> > > 
> > > So my point is, the first check is not even meant to check that link has 
> > > been established.
> > 
> >  I see what you mean, and I now remember the note in the spec.  I had 
> > concerns about it, but did not do anything about it at that point.
> > 
> >  I think we still have no guarantee that LT will be clear at the point we 
> > set RL, because LT could get reasserted by hardware between our read and 
> > the setting of RL.
> >
> > IIUC that doesn't matter really, because the new link 
> > parameters will be taken into account regardless of whether retraining was
> > initiated by hardware in an attempt to do link recovery or triggered by 
> > software via RL.
> 
> I, too, was somewhat worried about having LT never clear for long enough 
> to successfully sample it during the wait but it's like you say, any new 
> link training should take into account the new Target Speed which should 
> successfully bring the link up (assuming the quirk works in the first 
> place) and that should clear LT.

Hi,

One more point to add here: I started to wonder today why the use_lt 
parameter is needed at all for pcie_retrain_link().

Once the Target Speed has been changed to 2.5 GT/s, which is what the quirk 
does before triggering retraining, LT too should work "normally" after that.
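
To make the LT vs. DLLLA distinction concrete, here is a rough standalone 
model of the two kinds of wait discussed above. It is only a sketch, not the 
kernel code: wait_for_status(), struct fake_link and fake_read() are 
hypothetical names made up for illustration, and the "link" is just a counter 
that reports LT set for a few polls and DLLLA set afterwards. Only the two 
bit values are taken from include/uapi/linux/pci_regs.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Link Status bits, values as in include/uapi/linux/pci_regs.h */
#define PCI_EXP_LNKSTA_LT	0x0800	/* Link Training */
#define PCI_EXP_LNKSTA_DLLLA	0x2000	/* Data Link Layer Link Active */

/* Hypothetical stand-in for a config space read of Link Status. */
typedef uint16_t (*read_lnksta_fn)(void *ctx);

/*
 * Poll Link Status until the bit(s) in @mask reach the wanted state
 * (@set == true: bit set, @set == false: bit clear) or the poll budget
 * runs out. A real implementation would sleep between polls and use a
 * time-based deadline instead of a poll count.
 */
static bool wait_for_status(read_lnksta_fn rd, void *ctx, uint16_t mask,
			    bool set, int max_polls)
{
	assert(max_polls > 0);

	for (int i = 0; i < max_polls; i++) {
		bool now = (rd(ctx) & mask) != 0;

		if (now == set)
			return true;
	}
	return false;
}

/* Toy link: reports LT for a number of polls, then DLLLA. */
struct fake_link {
	int polls_until_trained;
};

static uint16_t fake_read(void *ctx)
{
	struct fake_link *link = ctx;

	if (link->polls_until_trained > 0) {
		link->polls_until_trained--;
		return PCI_EXP_LNKSTA_LT;
	}
	return PCI_EXP_LNKSTA_DLLLA;
}
```

With a link that trains successfully, waiting for LT to clear and waiting 
for DLLLA to be set both succeed in this model. The two only differ on a 
link that keeps oscillating, where seeing LT clear once says nothing about 
the link being up, which is the case Maciej raised earlier in the thread.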

-- 
 i.