lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date: Tue, 30 Jan 2024 16:41:34 +0000 (GMT)
From: "Maciej W. Rozycki" <macro@...am.me.uk>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
cc: Bjorn Helgaas <helgaas@...nel.org>, linux-pci@...r.kernel.org, 
    LKML <linux-kernel@...r.kernel.org>, 
    Mika Westerberg <mika.westerberg@...ux.intel.com>
Subject: Re: [PATCH 1/2] PCI: Clear LBMS on resume to avoid Target Speed
 quirk

On Tue, 30 Jan 2024, Ilpo Järvinen wrote:

> First of all, this is not true for pcie_failed_link_retrain():
>  * Return TRUE if the link has been successfully retrained, otherwise FALSE.
> If LBMS is not set, the Target Speed quirk is not applied but the function 
> still returns true. I think that should be changed to early return false
> when no LBMS is present.

 I think there is a corner case here indeed.  If Link Active reporting is 
supported and neither DLLLA nor LBMS are set at entry, then the function 
indeed returns success even though the link is down and no attempt to 
retrain will have been made.  So this does indeed look a case for a return 
with the FALSE result.

 I think most easily this can be sorted by delegating the return result to 
a temporary variable, preset to FALSE and then only updated from results 
of the calls to `pcie_retrain_link'.  I can offer a patch as the author of 
the code and one who has means to verify it right away if that helped.

 Overall I guess it's all legacy of how this code evolved before it's been 
finally merged.

> > >   3. pci_bridge_wait_for_secondary_bus() then calls pci_dev_wait() which
> > >      cannot succeed (but waits ~1 minute, delaying the resume).
> > > 
> > > The Target Speed trick (in step 2) is only used if LBMS bit (PCIe r6.1
> > > sec 7.5.3.8) is set. For links that have been operational before
> > > suspend, it is well possible that LBMS has been set at the bridge and
> > > remains on. Thus, after resume, LBMS does not indicate the link needs
> > > the Target Speed quirk. Clear LBMS on resume for bridges to avoid the
> > > issue.

 This can be problematic AFAICT however.  While I am not able to verify 
suspend/resume operation with my devices, I expect the behaviour to be 
exactly the same after resume as after a bus reset: the link will fail to 
negotiate and the LBMS and DLLLA bits will be respectively set and clear.  
Consequently if you clear LBMS at resume, then the workaround won't 
trigger and the link will remain inoperational in its limbo state.

 What kind of scenario does the LBMS bit get set in that you have a 
trouble with?  You write that in your case the downstream device has been 
disconnected while the bug hierarchy was suspended and it is not present 
anymore at resume, is that correct?

 But in that case no link negotiation could have been possible and 
consequently the LBMS bit mustn't have been set by hardware (according to 
my understanding of PCIe), because (for compliant, non-broken devices 
anyway) it is only specified to be set for ports that can communicate with 
the other link end (the spec explicitly says there mustn't have been a 
transition through the DL_Down status for the port).

 Am I missing something?

  Maciej

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ