Date:   Wed, 26 Oct 2022 14:41:56 -0500
From:   "Limonciello, Mario" <mario.limonciello@....com>
To:     Bjorn Helgaas <helgaas@...nel.org>
Cc:     "Rafael J. Wysocki" <rafael@...nel.org>,
        Len Brown <lenb@...nel.org>,
        Bjorn Helgaas <bhelgaas@...gle.com>,
        Mika Westerberg <mika.westerberg@...ux.intel.com>,
        Mehta Sanju <Sanju.Mehta@....com>,
        Lukas Wunner <lukas@...ner.de>, linux-acpi@...r.kernel.org,
        linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v3] PCI/ACPI: PCI/ACPI: Validate devices with power
 resources support D3

On 10/26/2022 14:09, Bjorn Helgaas wrote:
> Hi Mario,
> 
> Thanks for expanding the commit log.  I'm sure this patch is the right
> thing to do; I just want to connect a few more dots to make it less
> likely that we'll break this in the future.

OK.  I'll pick up those two tags from Mika and Rafael and try to reword 
the commit message a bit.

> 
> On Tue, Oct 25, 2022 at 05:10:54PM -0500, Mario Limonciello wrote:
>> Firmware typically advertises that PCIe devices can support D3
>> by a combination of the value returned by _S0W as well as the
>> HotPlugSupportInD3 _DSD.
> 
> All PCI devices are required to support both D3hot and D3cold (PCIe
> r6.0, sec 5.3.1.4), so I think what's being advertised here is about
> what firmware can do (which of course implicitly depends on controls
> provided by the platform hardware), not what the *device* supports.
> 
> The OS can put a device in D3hot by itself with the PM Control
> register, so I assume the important thing here is whether firmware has
> interfaces to put a device in D3cold and bring it back to D0.

If you completely ignore _S0W, sure, you can put the device into D3hot by 
writing this register, but in the firmware configuration I describe you 
won't get wake events to pull it back out.

So the PCIe device will stay in this state until the OS does something.
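
To make "changing this register" concrete, here is a minimal sketch of the 
bit manipulation involved.  It assumes only the standard layout of the PCI 
PM Control/Status register (PowerState in bits 1:0, with 3 meaning D3hot). 
The helper and macro names are made up for illustration, and nothing here 
touches hardware or configures wake.

```c
#include <assert.h>
#include <stdint.h>

/* PowerState field of the PCI PM Control/Status register: bits 1:0. */
#define PM_CTRL_STATE_MASK 0x0003u
#define PM_STATE_D0        0u
#define PM_STATE_D3HOT     3u

/*
 * Hypothetical helper: return a PMCSR value with the PowerState field
 * replaced by 'state' and every other bit left as it was.
 */
uint16_t pmcsr_with_state(uint16_t pmcsr, unsigned int state)
{
	assert(state <= PM_CTRL_STATE_MASK);
	return (uint16_t)((pmcsr & ~PM_CTRL_STATE_MASK) | state);
}
```

Writing the returned value back to the device is what moves it to D3hot. 
Whether anything can wake it afterwards is the separate question that 
_S0W answers.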

> 
> I know we only get to acpi_pci_bridge_d3() for PCIe devices, but when
> the device properties and ACPI interfaces are not PCIe-specific, I
> don't think we should restrict it by saying "PCIe".

Thanks for clarifying.  I'm thinking I'll s/PCIe/ACPI/ in the commit
message to convey this.

> 
>> `acpi_pci_bridge_d3` looks for this combination but also contains
>> an assumption that if a device contains power resources it can support
>> D3.  This was introduced from commit c6e331312ebf ("PCI/ACPI: Whitelist
>> hotplug ports for D3 if power managed by ACPI").
>>
>> On some firmware configurations for "AMD Pink Sardine" D3 is not
>> supported for wake in _S0W for the PCIe root port for tunneling.
>> However the device will still be opted into runtime PM since
>> `acpi_pci_bridge_d3` returns since the ACPI device contains power
>> resources.
>>
>> When the thunderbolt driver is loaded a device link between the USB4
>> router and the PCIe root port for tunneling is created where the PCIe
>> root port for tunneling is the consumer and the USB4 router is the
>> supplier.  Here is a demonstration of this topology that occurs:
>>
>> ├─ 0000:00:03.1
>> |       | ACPI Path: \_SB_.PCI0.GP11 (Supports "0" in _S0W)
>> |       | Device Links: supplier:pci:0000:c4:00.5
>> |       └─ D0 (Runtime PM enabled)
>> ├─ 0000:00:04.1
>> |       | ACPI Path: \_SB_.PCI0.GP12 (Supports "0" in _S0W)
>> |       | Device Links: supplier:pci:0000:c4:00.6
>> |       └─ D0 (Runtime PM enabled)
>> ├─ 0000:00:08.3
>> |       | ACPI Path: \_SB_.PCI0.GP19
>> |       ├─ D0 (Runtime PM disabled)
>> |       ├─ 0000:c4:00.3
>> |       |       | ACPI Path: \_SB_.PCI0.GP19.XHC3
>> |       |       | Device Links: supplier:pci:0000:c4:00.5
>> |       |       └─ D3cold (Runtime PM enabled)
>> |       ├─ 0000:c4:00.4
>> |       |       | ACPI Path: \_SB_.PCI0.GP19.XHC4
>> |       |       | Device Links: supplier:pci:0000:c4:00.6
>> |       |       └─ D3cold (Runtime PM enabled)
>> |       ├─ 0000:c4:00.5
>> |       |       | ACPI Path: \_SB_.PCI0.GP19.NHI0 (Supports "4" in _S0W)
>> |       |       | Device Links: consumer:pci:0000:00:03.1 consumer:pci:0000:c4:00.3
>> |       |       └─ D3cold (Runtime PM enabled)
>> |       └─ 0000:c4:00.6
>> |               | ACPI Path: \_SB_.PCI0.GP19.NHI1 (Supports "4" in _S0W)
>> |               | Device Links: consumer:pci:0000:c4:00.4 consumer:pci:0000:00:04.1
>> |               └─ D3cold (Runtime PM enabled)
> 
> Can you label the devices above to correspond with the preceding
> paragraph?  I assume one of the XHC devices is the USB4 router, but I
> don't know which is the Root Port.

In the above example there are 2 USB4 routers, 2 PCIe root ports used 
for tunneling and 2 XHCI PCIe devices.

> 
> Are all the devices relevant to the problem?  If not, prune the ones
> that don't matter.  It looks like the domain ("0000") could also be
> pruned out.

They're all relevant because links are made between them.  If the XHCI
PCIe device were not in runtime PM, it would block the router from 
entering runtime PM in just the same way.

I could remove one triplet of devices to keep it simpler (USB4 router, 
XHCI PCIe device and PCIe root port for tunneling) but all systems I've 
seen have 2.

> 
> If you also include the _PR0 and/or _PS0 methods, we'll be able to see
> why the current code doesn't do what we want and why the new code
> will.

OK, I'll add a line noting which devices have _PS0/_PR0.

> 
> What determines the device links?  I assume there's some ACPI
> information that connects the USB4 router with the Root Port?

They're created when the firmware node has "usb4-host-interface".  I'll 
add some lines to describe this as well to the commit message.

https://github.com/torvalds/linux/blob/v6.1-rc1/drivers/thunderbolt/acpi.c#L29

> 
> What are the "D0" and "D3cold" annotations telling me?  What does
> "runtime PM enabled" mean?  Is that determined based on some ACPI
> methods?

D0/D3cold tell you what power state each ACPI device is in at rest, 
as read from /sys/bus/pci/devices/{bdf}/power_state
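
For anyone who wants to reproduce those annotations, a small userspace 
sketch of reading that attribute might look like the following.  The 
function name and the sysfs_root parameter are assumptions added for 
illustration (the parameter only exists so the helper can be pointed at a 
test directory); on a real system the root is /sys/bus/pci/devices.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: read the power state string (e.g. "D0", "D3cold")
 * for the PCI device 'bdf' into 'buf'.
 * Returns 0 on success, -1 on any failure.
 */
int read_pci_power_state(const char *sysfs_root, const char *bdf,
			 char *buf, size_t len)
{
	char path[256];
	FILE *f;
	int n;

	assert(sysfs_root && bdf && buf);

	if (len == 0)
		return -1;

	n = snprintf(path, sizeof(path), "%s/%s/power_state", sysfs_root, bdf);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	buf[strcspn(buf, "\n")] = '\0';	/* strip the trailing newline */
	return 0;
}
```

Calling it with "/sys/bus/pci/devices" and "0000:c4:00.5" would be the way 
to check the state of the first router in the tree above.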

The USB4 routers are in D3cold, which shouldn't have occurred because 
the PCIe root ports for tunneling are in D0.

This only happened because runtime PM was enabled on the PCIe root port 
for tunneling, and runtime PM was enabled because acpi_pci_bridge_d3 
asserted that it supported it.

> 
>> Allowing the PCIe root port for tunneling to go into runtime PM (even if
>> it doesn't support D3) allows the USB4 router to also go into runtime PM.
>> The PCIe root port for tunneling stays in D0 but is in runtime PM. Due to
>> the device link the USB4 router transitions to D3cold when this happens.
> 
> It's probably obvious to PM folks what "going into runtime PM" means,
> but it would help me out to describe it in terms of the hardware state
> of the device, e.g., D3hot or whatever.

I think a simplified description here is "the device has been put into 
the deepest sleep state that it can wake itself from at runtime".

Suppliers can't go into runtime PM until all consumers of the device 
have done so.  If a consumer doesn't support runtime PM then it will 
block the supplier from entering runtime PM.
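
As a toy model of that rule (not the driver core's actual implementation; 
the struct and function names are invented for illustration), the 
supplier's decision could be sketched like this:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one consumer on a device link. */
struct consumer {
	bool runtime_pm_enabled;	/* opted into runtime PM at all? */
	bool suspended;			/* currently runtime-suspended? */
};

/*
 * A supplier may enter runtime PM only once every consumer has done so.
 * A consumer without runtime PM never suspends, so it blocks the supplier.
 */
bool supplier_may_suspend(const struct consumer *consumers, size_t n)
{
	assert(consumers != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (!consumers[i].runtime_pm_enabled || !consumers[i].suspended)
			return false;
	}
	return true;
}
```

In the topology above, NHI0 has two consumers: the root port 00:03.1 and 
the XHCI device c4:00.3.  With the root port opted into runtime PM (and 
counted as suspended while still in D0) the model returns true, which is 
the unwanted D3cold transition.  With the root port not opted in, it 
returns false and the router stays in D0.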

> 
>> The expectation is the USB4 router should have also remained in D0 since
>> the PCIe root port for tunneling remained in D0.
>>
>> Instead of making this assertion from the power resources check
>> immediately, move the check to later on, which will have validated
>> that the device supports wake from D3hot or D3cold.
>>
>> This fix prevents the USB4 router going into D3 when the firmware says that
>> the PCIe root port for tunneling can't handle it while still allowing
>> system that don't have the HotplugSupportInD3 _DSD to also enter D3 if they
>> have power resources that can wake from D3.
> 
> I guess there's a theme here of looking for concrete terms that I can
> connect directly to the spec vs abstract things like "going into
> runtime PM" or "root port can't handle D3" (which I think is actually
> saying something about what *firmware* can do).

The USB4 CM spec doesn't describe the ACPI relationships.
So I'm afraid the best I can come up with is what Microsoft says:

"For the PCIe and USB 3.x software stacks to establish power relations 
with the USB4 host router, device-specific data (_DSD) for the tunneled 
PCIe and USB 3.x ports is required. In the absence of this, the USB4 
domain may power down without coordinating with the tunneled PCIe and 
USB 3.x devices."

https://learn.microsoft.com/en-us/windows-hardware/design/component-guidelines/usb4-acpi-requirements#port-mapping-_dsd-for-usb-3x-and-pcie

"Runtime PM" is the power relation that is used for Linux.

Back when we did dff6139015dc6 earlier this year the Yellow Carp systems 
didn't have power resources declared in this particular firmware 
configuration, so the behavior caused by c6e331312ebf didn't negatively 
affect anything.

If they did, I'd like to think I would have done this right the first 
time =).

> 
>> Fixes: dff6139015dc6 ("PCI/ACPI: Allow D3 only if Root Port can signal and wake from D3")
>> Signed-off-by: Mario Limonciello <mario.limonciello@....com>
>> ---
>> v2->v3:
>>   * Reword commit message
>> v1->v2:
>>   * Just return value of acpi_pci_power_manageable (Rafael)
>>   * Remove extra word in commit message
>> ---
>>   drivers/pci/pci-acpi.c | 7 ++-----
>>   1 file changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
>> index a46fec776ad77..8c6aec50dd471 100644
>> --- a/drivers/pci/pci-acpi.c
>> +++ b/drivers/pci/pci-acpi.c
>> @@ -984,10 +984,6 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>>   	if (acpi_pci_disabled || !dev->is_hotplug_bridge)
>>   		return false;
>>   
>> -	/* Assume D3 support if the bridge is power-manageable by ACPI. */
>> -	if (acpi_pci_power_manageable(dev))
>> -		return true;
>> -
>>   	rpdev = pcie_find_root_port(dev);
>>   	if (!rpdev)
>>   		return false;
>> @@ -1023,7 +1019,8 @@ bool acpi_pci_bridge_d3(struct pci_dev *dev)
>>   	    obj->integer.value == 1)
>>   		return true;
>>   
>> -	return false;
>> +	/* Assume D3 support if the bridge is power-manageable by ACPI. */
>> +	return acpi_pci_power_manageable(dev);
>>   }
>>   
>>   int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
>> -- 
>> 2.34.1
>>
