Date:   Thu, 14 Oct 2021 00:08:31 +0200
From:   Jonas Dreßler <verdre@...d.nl>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     Pali Rohár <pali@...nel.org>,
        Amitkumar Karwar <amitkarwar@...il.com>,
        Ganapathi Bhat <ganapathi017@...il.com>,
        Xinming Hu <huxinming820@...il.com>,
        Kalle Valo <kvalo@...eaurora.org>,
        "David S. Miller" <davem@...emloft.net>,
        Jakub Kicinski <kuba@...nel.org>,
        Tsuchiya Yuto <kitakar@...il.com>,
        linux-wireless@...r.kernel.org, netdev@...r.kernel.org,
        linux-kernel@...r.kernel.org, linux-pci@...r.kernel.org,
        Maximilian Luz <luzmaximilian@...il.com>,
        Andy Shevchenko <andriy.shevchenko@...ux.intel.com>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Heiner Kallweit <hkallweit1@...il.com>,
        Johannes Berg <johannes@...solutions.net>,
        Brian Norris <briannorris@...omium.org>,
        David Laight <David.Laight@...LAB.COM>,
        Vidya Sagar <vidyas@...dia.com>,
        Victor Ding <victording@...gle.com>
Subject: Re: [PATCH] mwifiex: Add quirk resetting the PCI bridge on MS Surface
 devices

On 10/12/21 17:39, Bjorn Helgaas wrote:
> [+cc Vidya, Victor, ASPM L1.2 config issue; beginning of thread:
> https://lore.kernel.org/all/20211011134238.16551-1-verdre@v0yd.nl/]
> 
> On Tue, Oct 12, 2021 at 10:55:03AM +0200, Jonas Dreßler wrote:
>> On 10/11/21 19:02, Pali Rohár wrote:
>>> On Monday 11 October 2021 15:42:38 Jonas Dreßler wrote:
>>>> The most recent firmware (15.68.19.p21) of the 88W8897 PCIe+USB card
>>>> reports a hardcoded LTR value to the system during initialization,
>>>> probably as an (unsuccessful) attempt of the developers to fix firmware
>>>> crashes. This LTR value prevents most of the Microsoft Surface devices
>>>> from entering deep powersaving states (either platform C-State 10 or
>>>> S0ix state), because the exit latency of that state would be higher than
>>>> what the card can tolerate.
>>>
>>> This description looks like a generic issue in 88W8897 chip or its
>>> firmware and not something to Surface PCIe controller or Surface HW. But
>>> please correct me if I'm wrong here.
>>>
>>> Has somebody 88W8897-based PCIe card in non-Surface device and can check
>>> or verify if this issue happens also outside of the Surface device?
>>>
>>> It would be really nice to know if this is issue in Surface or in 8897.
>>
>> Fairly sure the LTR value is something that's reported by the firmware
>> and will be the same on all 8897 devices (as mentioned in my reply to Bjorn
>> the second-latest firmware doesn't report that fixed LTR value).
> 
> I suggested earlier that the LTR values reported by the device might
> depend on the electrical characteristics of the link and hence be
> platform-dependent, but I think that might be wrong.
> 
> The spec (PCIe r5.0, sec 5.5.4) does say that some of the *other*
> parameters related to L1.2 entry are platform-dependent:
> 
>    Prior to setting either or both of the enable bits for L1.2, the
>    values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM
>    L1.2 Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value
>    and Scale fields) must be programmed.  The TPOWER_ON and
>    Common_Mode_Restore_Time fields must be programmed to the
>    appropriate values based on the components and AC coupling
>    capacitors used in the connection linking the two components. The
>    determination of these values is design implementation specific.
> 
> These T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD
> values are in the L1 PM Substates Control registers.
> 
> I don't know of a way for the kernel or the device firmware to learn
> these circuit characteristics or the appropriate values, so I think
> only system firmware can program the L1 PM Substates Control registers
> (a corollary of this is that I don't see a way for hot-plugged devices
> to *ever* use L1.2).
> 
> I wonder if this reset quirk works because pci_reset_function() saves
> and restores much of config space, but it currently does *not* restore
> the L1 PM Substates capability, so those T_POWER_ON,
> Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD values probably get
> cleared out by the reset.  We did briefly save/restore it [1], but we
> had to revert that because of a regression that AFAIK was never
> resolved [2].  I expect we will eventually save/restore this, so if
> the quirk depends on it *not* being restored, that would be a problem.
> 
> You should be able to test whether this is the critical thing by
> clearing those registers with setpci instead of doing the reset.  Per
> spec, they can only be modified when L1.2 is disabled, so you would
> have to disable it via sysfs (for the endpoint, I think)
> /sys/.../l1_2_aspm and /sys/.../l1_2_pcipm, do the setpci on the root
> port, then re-enable L1.2.
> 
> [1] https://git.kernel.org/linus/4257f7e008ea
> [2] https://lore.kernel.org/all/20210127160449.2990506-1-helgaas@kernel.org/
> 

Hmm, interesting, thanks for those links.

Are you sure the config values will get lost on the reset? If we only reset
the port by going into D3hot and back into D0, the device will remain powered
and won't lose the config space, will it?
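
One way to check this per device: the PCI PM capability's Control/Status
register (PMCSR, offset 4 of the capability) has a No_Soft_Reset bit that says
whether the function keeps its configuration context across D3hot -> D0
(lspci shows it as "NoSoftRst+"). Below is a minimal sketch for decoding a raw
PMCSR value, e.g. one read with `setpci -s <bdf> CAP_PM+4.w`. The `<bdf>`
placeholder, the function name and the sample values are only illustrative,
not taken from the Surface hardware.

```python
# Sketch: decode the low bits of the PCI PM Control/Status register (PMCSR).
# Bit positions follow the PCI Power Management capability layout; the sample
# values used further down are made up for illustration.

PM_CTRL_STATE_MASK = 0x0003      # bits 1:0: current power state
PM_CTRL_NO_SOFT_RESET = 0x0008   # bit 3: No_Soft_Reset

POWER_STATES = {0: "D0", 1: "D1", 2: "D2", 3: "D3hot"}


def decode_pmcsr(pmcsr: int) -> dict:
    """Return the power state and the No_Soft_Reset flag of a 16-bit PMCSR."""
    return {
        "power_state": POWER_STATES[pmcsr & PM_CTRL_STATE_MASK],
        # True: config context is preserved across D3hot -> D0.
        # False: the function does an internal reset on that transition.
        "no_soft_reset": bool(pmcsr & PM_CTRL_NO_SOFT_RESET),
    }


if __name__ == "__main__":
    for raw in (0x0000, 0x000B):
        print(hex(raw), decode_pmcsr(raw))
```

If No_Soft_Reset is set on the bridge, keeping its registers across a
D3hot/D0 cycle would be the expected behaviour.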

I ask because when I reset the bridge using pci_reset_function() (i.e.
pci_pm_reset()), or when I suspend and resume the laptop, all the L1 PM
Substates registers are still the same as before; nothing is lost.

That said, our new mwifiex_pcie_reset_d3cold_quirk() puts *both the card and
the bridge* into D3cold, so I gave that a try. Indeed, the card's L1 PM
Substates Control registers (T_CommonMode, LTR1.2_Threshold and T_PwrOn) are
cleared out, but the bridge still has its values. No clue why that's the case.
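
For comparing the registers before and after such a reset, here is a rough
sketch that decodes raw L1 PM Substates Control 1 and Control 2 values (offsets
0x08 and 0x0c inside the L1 PM Substates extended capability) into the three
fields mentioned above. The function names and the sample register values are
assumptions for illustration, not values read from the Surface hardware.

```python
# Sketch: decode the L1 PM Substates Control 1/2 registers.
# Field layout as in the PCIe L1 PM Substates extended capability; the raw
# values in the example at the bottom are made up.

# T_POWER_ON scale encodings in microseconds (0b11 is reserved).
T_POWER_ON_SCALE_US = {0: 2, 1: 10, 2: 100}


def decode_l1ss_ctl1(ctl1: int) -> dict:
    """Decode enables, Common_Mode_Restore_Time and LTR_L1.2_THRESHOLD."""
    scale = (ctl1 >> 29) & 0x7     # LTR_L1.2_THRESHOLD_Scale, bits 31:29
    value = (ctl1 >> 16) & 0x3FF   # LTR_L1.2_THRESHOLD_Value, bits 25:16
    return {
        "pcipm_l1_2": bool(ctl1 & 0x1),
        "pcipm_l1_1": bool(ctl1 & 0x2),
        "aspm_l1_2": bool(ctl1 & 0x4),
        "aspm_l1_1": bool(ctl1 & 0x8),
        "common_mode_restore_us": (ctl1 >> 8) & 0xFF,
        # Threshold in ns is value * 32**scale; scale encodings above 5 are
        # not permitted, so report None for those.
        "ltr_l1_2_threshold_ns": value << (5 * scale) if scale <= 5 else None,
    }


def decode_l1ss_ctl2(ctl2: int) -> dict:
    """Decode T_POWER_ON (Scale in bits 1:0, Value in bits 7:3)."""
    unit_us = T_POWER_ON_SCALE_US.get(ctl2 & 0x3)
    value = (ctl2 >> 3) & 0x1F
    return {"t_power_on_us": None if unit_us is None else value * unit_us}


if __name__ == "__main__":
    # Hypothetical "programmed" values vs. the all-zero state after a reset.
    for ctl1, ctl2 in ((0x40A01E0F, 0x29), (0x0, 0x0)):
        print(decode_l1ss_ctl1(ctl1), decode_l1ss_ctl2(ctl2))
```

Running the decoder on values from both the card and the bridge, before and
after the D3cold cycle, makes it easy to see which of the fields actually got
cleared.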
