lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Tue, 5 Sep 2023 15:51:05 -0500
From:   Bjorn Helgaas <helgaas@...nel.org>
To:     Mario Limonciello <mario.limonciello@....com>
Cc:     hdegoede@...hat.com, bhelgaas@...gle.com, rafael@...nel.org,
        Shyam-sundar.S-k@....com, linux-kernel@...r.kernel.org,
        linux-acpi@...r.kernel.org, platform-driver-x86@...r.kernel.org,
        Iain Lane <iain@...ngesquash.org.uk>
Subject: Re: [PATCH v16 3/3] platform/x86/amd: pmc: Don't let PCIe root ports
 go into D3

[+cc Hans]

On Tue, Aug 29, 2023 at 12:12:12PM -0500, Mario Limonciello wrote:
> commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend")
> changed pci_bridge_d3_possible() so that any vendor's PCIe ports
> from modern machines (>=2015) are allowed to be put into D3.
> 
> Iain reports that USB devices can't be used to wake a Lenovo Z13
> from suspend. This is because the PCIe root port has been put
> into D3 and AMD's platform can't handle USB devices waking from
> a hardware sleep state in this case.

Can you be specific in the subject and commit log about whether "D3"
refers to "D3hot", "D3cold", or both?  It's probably obvious to PM
folks, but it's always a stumbling block for me.

I assume "can't handle USB devices waking" does not refer to a problem
with the USB adapter and whatever mechanism it uses to send a wakeup
event to request that power be turned on, but rather it means that the
wakeup event doesn't get propagated through the Root Port?

Is this actually specific to USB devices?  Or could a NIC below the
Root Port suffer the same problem when a wake-on-lan packet causes it
to send a wakeup event?  It seems like we've had this conversation
before; sorry to ask the same questions again.

If it's not specific to USB, I would say something like "when the Root
Port is in D3cold, wakeup events from devices below it are lost" (or
whatever the actual problem is).

> This problem only occurs on Linux, and only when the AMD PMC driver
> is utilized to put the device into a hardware sleep state.

Is the AMD PMC driver doing something magic that can't be done via
other power management paths?  That's what "only when the AMD PMC
driver is utilized" suggests.  But if the problem occurs when the Root
Port is put into D3cold via *any* means, just say that.

And if you can say a specific PCI power state instead of "hardware
sleep state", that would be good, too.

> Comparing
> the behavior on Windows and Linux, Windows doesn't put the root ports
> into D3.
> 
> A variety of approaches were discussed to change PCI core to handle this
> case generically but no consensus was reached. To limit the scope of
> effect only to the affected machines introduce a workaround into the
> amd-pmc driver to only apply to the PCI root ports in affected machines
> when going into hardware sleep.

> +/* only allow PCIe root ports with a LPS0 constraint configured to go to D3 */
> +static int amd_pmc_rp_wa(struct amd_pmc_dev *pdev)
> +{
> +	struct pci_dev *pci_dev = NULL;
> +
> +	while ((pci_dev = pci_get_device(PCI_VENDOR_ID_AMD, PCI_ANY_ID, pci_dev))) {

I hate to add more uses of pci_get_device() because it doesn't account
for hot-added devices.  Maybe there's no need to support hot-add of
AMD Root Ports, but that's not obvious to readers here.

One mechanism to avoid pci_get_device() is to use quirks, although it
might be hard to deal with PCI/ACPI ordering issues.

> +		struct acpi_device *adev;
> +		int constraint;
> +
> +		if (!pci_is_pcie(pci_dev) ||
> +		    !(pci_pcie_type(pci_dev) == PCI_EXP_TYPE_ROOT_PORT))
> +			continue;
> +
> +		if (pci_dev->current_state == PCI_D3hot ||
> +		    pci_dev->current_state == PCI_D3cold)
> +			continue;

If we're trying to determine a property of the device, why does the
current power state make a difference?

It looks like this loop runs every time we suspend (from
amd_pmc_suspend_handler()), even though this is something we should
know at boot-time, so we only need it once.

> +		adev = ACPI_COMPANION(&pci_dev->dev);
> +		if (!adev)
> +			continue;
> +
> +		constraint = acpi_get_lps0_constraint(adev);
> +		if (constraint != ACPI_STATE_UNKNOWN &&
> +		    constraint >= ACPI_STATE_S3)
> +			continue;
> +
> +		if (pci_dev->bridge_d3 == 0)
> +			continue;
> +		pci_dev->bridge_d3 = 0;
> +		dev_info(&pci_dev->dev, "Disabling D3 on PCIe root port due lack of constraint\n");

D3hot?  D3cold?  Both?  "lack of constraint"?

> +	}
> +
> +	return 0;
> +}
> +
>  static int amd_pmc_verify_czn_rtc(struct amd_pmc_dev *pdev, u32 *arg)
>  {
>  	struct rtc_device *rtc_device;
> @@ -893,6 +928,10 @@ static int amd_pmc_suspend_handler(struct device *dev)
>  	case AMD_CPU_ID_CZN:
>  		rc = amd_pmc_czn_wa_irq1(pdev);
>  		break;
> +	case AMD_CPU_ID_YC:
> +	case AMD_CPU_ID_PS:
> +		rc = amd_pmc_rp_wa(pdev);
> +		break;
>  	default:
>  		break;
>  	}
> -- 
> 2.34.1
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ