Date:   Thu, 27 Oct 2022 14:56:19 -0500
From:   "Limonciello, Mario" <mario.limonciello@....com>
To:     Lukas Wunner <lukas@...ner.de>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Len Brown <lenb@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Mehta Sanju <Sanju.Mehta@....com>,
        "Rafael J . Wysocki" <rafael.j.wysocki@...el.com>,
        linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
        linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4] PCI/ACPI: PCI/ACPI: Validate devices with power
 resources support D3

On 10/27/2022 00:24, Lukas Wunner wrote:
> On Wed, Oct 26, 2022 at 04:52:37PM -0500, Mario Limonciello wrote:
>> Firmware typically advertises that ACPI devices that represent PCIe
>> devices can support D3 by a combination of the value returned by
>> _S0W as well as the HotPlugSupportInD3 _DSD [1].
>>
>> `acpi_pci_bridge_d3` looks for this combination but also contains
>> an assumption that if an ACPI device contains power resources the PCIe
>> device it's associated with can support D3.  This was introduced
>> from commit c6e331312ebf ("PCI/ACPI: Whitelist hotplug ports for
>> D3 if power managed by ACPI").
>>
>> Some firmware configurations for "AMD Pink Sardine" do not support
>> wake from D3 in _S0W for the ACPI device representing the PCIe root
>> port used for tunneling. The PCIe device will still be opted into
>> runtime PM in the kernel [2] because of the logic within
>> `acpi_pci_bridge_d3`. This currently happens because the ACPI
>> device contains power resources.
> 
> So put briefly, in acpi_pci_bridge_d3() we fail to take wake capabilities
> into account and blindly assume that a bridge can be runtime suspended
> to D3 if it is power-manageable by ACPI.
> 
> By moving the acpi_pci_power_manageable() below the wake capabilities
> checks, we avoid runtime suspending a bridge that is not wakeup capable.
> 

Yes, spot on.
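
To make the reordering concrete, here is a minimal sketch of the decision order before and after the change. This is a simplified model, not the kernel source: `struct bridge_fw_info`, its fields, and both function names are hypothetical stand-ins for what `acpi_pci_bridge_d3` learns from firmware, and the HotPlugSupportInD3 fallback at the end is an assumption about the remaining logic.

```c
#include <stdbool.h>

/* PCI device power states, shallowest to deepest. */
enum d_state { D0, D1, D2, D3HOT, D3COLD };

/* Hypothetical, simplified summary of what firmware reports for a bridge. */
struct bridge_fw_info {
	bool power_manageable; /* ACPI device has power resources */
	bool wake_valid;       /* firmware describes a wakeup mechanism */
	enum d_state s0w;      /* deepest state it can wake from (_S0W) */
	bool hotplug_in_d3;    /* HotPlugSupportInD3 _DSD is set */
};

/* Old order: power resources alone are taken as proof of D3 support. */
bool bridge_d3_old(const struct bridge_fw_info *b)
{
	if (b->power_manageable)
		return true;
	if (!b->wake_valid || b->s0w < D3HOT)
		return false;
	return b->hotplug_in_d3;
}

/* New order: wake capability is validated before anything else. */
bool bridge_d3_new(const struct bridge_fw_info *b)
{
	if (!b->wake_valid || b->s0w < D3HOT)
		return false;
	if (b->power_manageable)
		return true;
	return b->hotplug_in_d3;
}
```

For a port described as `{ true, true, D0, false }` (power resources present, but `_S0W` only allows wake from D0), the old order returns true and the new order returns false.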

> The more verbose explanation in the commit message is useful to
> understand how the issue was exposed, but it somewhat obscures
> the issue itself.

Within this lengthy commit message I attempted to follow the model of:

"Status quo"
"Background"
"Problem Statement"
"Impact"
"Solution"

> 
> 
>> When the thunderbolt driver is loaded two device links are created:
>> * USB4 router <-> PCIe root port for tunneling
>> * USB4 router <-> XHCI PCIe device
> 
> Those double arrows are a little misleading, a device link is
> unidirectional, so it's really <-- and not <->.

Yes, that's correct.  Thanks.

> 
> 
>> Currently runtime PM is allowed for all of these devices.  This means that
>> when all consumers are idle long enough, they will enter their deepest allowed
>> sleep state. Once all consumers are in their deepest allowed sleep state the
>> suppliers will enter the deepest sleep state as well.
>>
>> * The PCIe root port for tunneling doesn't support waking from D3hot or
>>    D3cold so it stays in D0.
> 
> Huh?  I thought it's runtime suspended to D3hot even though it should stay
> runtime resumed in D0 because it's not wakeup capable in D3hot?

This is why I included the power state information in my topology diagram.

Its runtime PM status is "suspended", but as it can't wake from D3hot
its actual power state remains D0.

$ cat /sys/bus/pci/devices/0000\:00\:03.1/power/control
auto
$ cat /sys/bus/pci/devices/0000\:00\:03.1/power/runtime_enabled
enabled
$ cat /sys/bus/pci/devices/0000\:00\:03.1/power/runtime_status
suspended
$ cat /sys/bus/pci/devices/0000\:00\:03.1/power_state
D0
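
The same check can be scripted. Below is a rough sketch in C that reads the attributes shown above and reports the mismatch. The helper names `read_attr` and `suspended_but_d0` are made up for illustration. The attribute paths are the ones from the transcript, relative to a device directory such as /sys/bus/pci/devices/0000:00:03.1.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Read the first line of <dev_dir>/<attr> into buf, without the newline. */
bool read_attr(const char *dev_dir, const char *attr, char *buf, size_t len)
{
	char path[512];
	FILE *f;
	int n;

	if (len == 0)
		return false;
	n = snprintf(path, sizeof(path), "%s/%s", dev_dir, attr);
	if (n < 0 || (size_t)n >= sizeof(path))
		return false; /* path did not fit */
	f = fopen(path, "r");
	if (!f)
		return false;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return false;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return true;
}

/* True when runtime PM says "suspended" while the device sits in D0. */
bool suspended_but_d0(const char *dev_dir)
{
	char status[32], state[32];

	if (!read_attr(dev_dir, "power/runtime_status", status, sizeof(status)) ||
	    !read_attr(dev_dir, "power_state", state, sizeof(state)))
		return false;
	return strcmp(status, "suspended") == 0 && strcmp(state, "D0") == 0;
}
```

Given the values in the transcript, `suspended_but_d0()` would return true for this root port.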

> 
> 
>> * The XHCI PCIe device supports wakeup from D3cold so it goes to D3cold.
>> * Both consumers are in their deepest state and the USB4 router supports
>>    wakeup from D3cold, so it goes into this state.
>>
>> The expectation is the USB4 router should have also remained in D0 since
>> the PCIe root port for tunneling remained in D0 and a device link exists
>> between the two devices.
>
> This paragraph sounds like the problem is the router runtime suspended.
> IIUC the router could only runtime suspend because its consumer, the
> Root Port, runtime suspended.  By preventing the Root Port from runtime
> suspending, you're implicitly preventing its supplier (the router)
> from suspending.

Yes, but I think it's a matter of perspective.  Both of these PCIe 
devices are exposing interfaces to different parts of the same SoC.
This issue was identified because this sequence of events in the kernel 
leads to unexpected power sequencing within the USB4 IP.

From the perspective of the silicon designer, the USB4 router shouldn't
have "been able" to go into D3 until the PCIe root port for tunneling
went into D3.  When the firmware prohibited the PCIe root port for
tunneling from going into D3, this should have implicitly prohibited the
USB4 router as well.

I'll attempt to adjust my wording accordingly.
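
To illustrate the two perspectives, here is a toy model of the supplier's decision. It is only a sketch: the struct and function names are hypothetical, and it does not reflect how the device link core is implemented. It contrasts gating the supplier on each consumer's runtime PM status with gating it on the power state each consumer actually sits in.

```c
#include <stdbool.h>
#include <stddef.h>

enum d_state { D0, D1, D2, D3HOT, D3COLD };

/* Hypothetical view of one device-link consumer of the USB4 router. */
struct consumer {
	bool runtime_suspended;   /* runtime PM status is "suspended" */
	enum d_state power_state; /* state the device actually sits in */
};

/* Kernel's view: the supplier may suspend once all consumers report
 * a suspended runtime PM status. */
bool supplier_may_suspend_by_status(const struct consumer *c, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!c[i].runtime_suspended)
			return false;
	return true;
}

/* Silicon designer's view: the supplier must stay in D0 while any
 * consumer is still in D0. */
bool supplier_may_suspend_by_state(const struct consumer *c, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (c[i].power_state == D0)
			return false;
	return true;
}
```

With the root port as `{ true, D0 }` and the XHCI device as `{ true, D3COLD }`, the first function allows the router to suspend and the second does not. Keeping the root port from runtime suspending in the first place makes the two views agree.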

> 
> 
>> Link: https://learn.microsoft.com/en-us/windows-hardware/drivers/pci/dsd-for-pcie-root-ports#identifying-pcie-root-ports-supporting-hot-plug-in-d3 [1]
>> Link: https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/pci/pcie/portdrv_pci.c#L126 [2]
>> Link: https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/thunderbolt/acpi.c#L29 [3]
> 
> I think git.kernel.org links are preferred to 3rd party hosting services.

I wasn't aware of any such policy.  Within the last release it seemed to
me GitHub was perfectly acceptable to use for links.

$ git log v6.0..v6.1-rc1 | grep "Link: https://github" | wc -l
107
$ git log v6.0..v6.1-rc1 | grep "Link: https://git.kernel.org" | wc -l
2

> 
> Thanks,
> 
> Lukas
