Message-ID: <823c393d-49f6-402b-ae8b-38ff44aeabc4@amd.com>
Date: Tue, 10 Dec 2024 10:00:59 -0600
From: Mario Limonciello <mario.limonciello@....com>
To: Werner Sembach <wse@...edocomputers.com>,
 Bjorn Helgaas <bhelgaas@...gle.com>,
 "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
 Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc: ggo@...edocomputers.com, linux-pci@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] PCI: Avoid putting some root ports into D3 on some
 Ryzen chips

On 12/10/2024 09:24, Werner Sembach wrote:
> Hi,
> 
> On 09.12.24 at 20:45, Mario Limonciello wrote:
>> On 12/9/2024 13:36, Werner Sembach wrote:
>>> From: Mario Limonciello <mario.limonciello@....com>
>>>
>>> commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
>>> sets the policy that all PCIe ports are allowed to use D3.  When
>>> the system is suspended, a port that is not power manageable by the
>>> platform and won't be used for wakeup via a PME is put into D3hot
>>> under this policy.
>>>
>>> This policy generally makes sense from an OSPM perspective but it leads
>>> to problems with wakeup from suspend on the TUXEDO Sirius 16 Gen 1 with
>>> its original (not updated) BIOS.
>>>
>>> - On family 19h model 44h (PCI 0x14b9) this manifests as a missing
>>>    wakeup interrupt.
>>> - On family 19h model 74h (PCI 0x14eb) this manifests as a system hang.
>>>
>>> On the affected device + BIOS combination, add a quirk for the PCI device
>>> ID used by the problematic root port on both chips to ensure that these
>>> root ports are not put into D3hot at suspend.
>>>
>>> This patch is based on
>>> https://lore.kernel.org/linux-pci/20230708214457.1229-2-mario.limonciello@....com/
>>> but with the added condition, both in the documentation and in the code,
>>> to apply only to the TUXEDO Sirius 16 Gen 1 with the original unpatched
>>> BIOS.
>>>
>>> Co-developed-by: Georg Gottleuber <ggo@...edocomputers.com>
>>> Signed-off-by: Georg Gottleuber <ggo@...edocomputers.com>
>>> Co-developed-by: Werner Sembach <wse@...edocomputers.com>
>>> Signed-off-by: Werner Sembach <wse@...edocomputers.com>
>>> Cc: stable@...r.kernel.org # 6.1+
>>> Reported-by: Iain Lane <iain@...ngesquash.org.uk>
>>> Closes: https://forums.lenovo.com/t5/Ubuntu/Z13-can-t-resume-from-suspend-with-external-USB-keyboard/m-p/5217121
>>> Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
>>> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
>>> ---
>>>   drivers/pci/quirks.c | 31 +++++++++++++++++++++++++++++++
>>>   1 file changed, 31 insertions(+)
>>>
>>> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
>>> index 76f4df75b08a1..2226dca56197d 100644
>>> --- a/drivers/pci/quirks.c
>>> +++ b/drivers/pci/quirks.c
>>> @@ -3908,6 +3908,37 @@ static void quirk_apple_poweroff_thunderbolt(struct pci_dev *dev)
>>>   DECLARE_PCI_FIXUP_SUSPEND_LATE(PCI_VENDOR_ID_INTEL,
>>>                      PCI_DEVICE_ID_INTEL_CACTUS_RIDGE_4C,
>>>                      quirk_apple_poweroff_thunderbolt);
>>> +
>>> +/*
>>> + * Putting PCIe root ports on Ryzen SoCs with USB4 controllers into D3hot
>>> + * may cause problems when the system attempts to wake up from s2idle.
>>> + *
>>> + * On family 19h model 44h (PCI 0x14b9) this manifests as a missing wakeup
>>> + * interrupt.
>>> + * On family 19h model 74h (PCI 0x14eb) this manifests as a system hang.
>>> + *
>>> + * This fix is still required on the TUXEDO Sirius 16 Gen1 with the original
>>> + * unupdated BIOS.
>>> + */
>>> +static const struct dmi_system_id quirk_ryzen_rp_d3_dmi_table[] = {
>>> +    {
>>> +        .matches = {
>>> +            DMI_MATCH(DMI_SYS_VENDOR, "TUXEDO"),
>>> +            DMI_MATCH(DMI_BOARD_NAME, "APX958"),
>>> +            DMI_MATCH(DMI_BIOS_VERSION, "V1.00A00"),
>>> +        },
>>> +    },
>>> +    {}
>>> +};
>>> +
>>> +static void quirk_ryzen_rp_d3(struct pci_dev *pdev)
>>> +{
>>> +    if (dmi_check_system(quirk_ryzen_rp_d3_dmi_table) &&
>>> +        !acpi_pci_power_manageable(pdev))
>>> +        pdev->dev_flags |= PCI_DEV_FLAGS_NO_D3;
>>> +}
>>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14b9, quirk_ryzen_rp_d3);
>>> +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x14eb, quirk_ryzen_rp_d3);
>>>   #endif
>>>     /*
>>
>> Wait, what is wrong with:
>>
>> commit 7d08f21f8c630 ("x86/PCI: Avoid PME from D3hot/D3cold for AMD 
>> Rembrandt and Phoenix USB4")
>>
>> Is that not activating on your system for some reason?
> 
> Doesn't seem so: I tested with the old BIOS and 6.13-rc2 and got a
> black screen on wakeup.

OK, I think we need to dig a layer deeper and find out which root port
is causing the problem in order to understand this.

> 
> With a newer BIOS for that device, however, suspend and resume work.
> 

Is there any reason that people would realistically stay on the old
BIOS, so that we would need to carry this quirk in the kernel for eternity?

With the Linux ecosystem for BIOS updates through fwupd + LVFS, doing an
update is no longer the big barrier it was 20 years ago.

> Looking at that patch, the device IDs are different (0x162e, 0x162f,
> 0x1668, and 0x1669).
> 

TUXEDO Sirius 16 Gen1 is Phoenix based, right?  So at a minimum you
shouldn't be including PCI IDs from Rembrandt (0x14b9).

Here is the topology from a Phoenix system on my side:

https://gist.github.com/superm1/85bf0c053008435458bdb39418e109d8

That's why 7d08f21f8c630 intentionally matches the NHI and then changes
only the root port right above it rather than all the root ports, because
that is where the problem was.

You can see that PCI ID 0x14eb covers quite a few root ports for a lot
of devices.
If you're disabling D3 for all of them, that's going to be too broad.
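
To illustrate the difference in scope, here is a minimal userspace sketch.
It is a toy model, not kernel code: struct toy_dev, the helper names, the
no_d3 flag (a stand-in for "the quirk was applied to this port"), and the
endpoint IDs 0x1111 and 0x2222 are all hypothetical. The topology is also
made up: three root ports sharing ID 0x14eb, with an NHI (using 0x1668, one
of the IDs mentioned above) below only one of them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_SAMPLE_COUNT 6

/* Toy model of a PCI device: an ID, a parent link, and a quirk flag. */
struct toy_dev {
    uint16_t device_id;
    struct toy_dev *parent; /* NULL for a root port */
    int no_d3;              /* hypothetical "quirk applied" marker */
};

/* Walk the parent links up to the topmost device (the root port). */
struct toy_dev *toy_find_root_port(struct toy_dev *dev)
{
    assert(dev != NULL);
    while (dev->parent)
        dev = dev->parent;
    return dev;
}

/* Broad match: flag every root port whose own ID equals rp_id. */
int flag_by_root_port_id(struct toy_dev *devs, size_t n, uint16_t rp_id)
{
    int flagged = 0;

    for (size_t i = 0; i < n; i++) {
        if (!devs[i].parent && devs[i].device_id == rp_id &&
            !devs[i].no_d3) {
            devs[i].no_d3 = 1;
            flagged++;
        }
    }
    return flagged;
}

/* Narrow match: find the NHI, then flag only the root port above it. */
int flag_by_nhi_id(struct toy_dev *devs, size_t n, uint16_t nhi_id)
{
    int flagged = 0;

    for (size_t i = 0; i < n; i++) {
        if (devs[i].device_id != nhi_id)
            continue;
        struct toy_dev *rp = toy_find_root_port(&devs[i]);
        if (!rp->no_d3) {
            rp->no_d3 = 1;
            flagged++;
        }
    }
    return flagged;
}

/* Made-up topology: three 0x14eb root ports, one of them above an NHI. */
void toy_build_sample(struct toy_dev devs[TOY_SAMPLE_COUNT])
{
    devs[0] = (struct toy_dev){ 0x14eb, NULL, 0 };
    devs[1] = (struct toy_dev){ 0x14eb, NULL, 0 };
    devs[2] = (struct toy_dev){ 0x14eb, NULL, 0 };
    devs[3] = (struct toy_dev){ 0x1668, &devs[0], 0 }; /* NHI */
    devs[4] = (struct toy_dev){ 0x1111, &devs[1], 0 }; /* hypothetical */
    devs[5] = (struct toy_dev){ 0x2222, &devs[2], 0 }; /* hypothetical */
}
```

On this sample, flag_by_root_port_id(devs, TOY_SAMPLE_COUNT, 0x14eb) marks
all three root ports, while flag_by_nhi_id(devs, TOY_SAMPLE_COUNT, 0x1668)
marks only the one above the NHI.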

