lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20240203000510.GA738687@bhelgaas>
Date: Fri, 2 Feb 2024 18:05:10 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Jian-Hong Pan <jhp@...lessos.org>
Cc: Johan Hovold <johan@...nel.org>,
	Mika Westerberg <mika.westerberg@...ux.intel.com>,
	David Box <david.e.box@...ux.intel.com>,
	Damien Le Moal <dlemoal@...nel.org>,
	Niklas Cassel <cassel@...nel.org>,
	Nirmal Patel <nirmal.patel@...ux.intel.com>,
	Jonathan Derrick <jonathan.derrick@...ux.dev>,
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux@...lessos.org
Subject: Re: [PATCH v2] PCI: vmd: Enable PCI PM's L1 substates of remapped
 PCIe Root Port and NVMe

On Fri, Feb 02, 2024 at 03:11:12PM +0800, Jian-Hong Pan wrote:
> The remapped PCIe Root Port and NVMe have PCI PM L1 substates
> capability, but they are disabled originally:
> 
> Here is an example on ASUS B1400CEAE:
> 
> Capabilities: [900 v1] L1 PM Substates
>         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
>                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
>         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>                    T_CommonMode=0us LTR1.2_Threshold=0ns
>         L1SubCtl2: T_PwrOn=10us
> 
> Power on all of the VMD remapped PCI devices and quirk max snoop LTR
> before enable PCI-PM L1 PM Substates by following "Section 5.5.4 of PCIe
> Base Spec Revision 6.0". Then, PCI PM's L1 substates control are
> initialized & enabled accordingly.

> Also, update the comments of
> pci_enable_link_state() and pci_enable_link_state_locked() for
> kernel-doc, too.

The aspm.c changes should be in a separate patch.  Presumably the
aspm.c code change fixes a defect, and that defect can be described in
that patch.  That fix may be needed for non-VMD situations, and having
it in this vmd patch means it won't be as easy to find and backport.

Nit: rewrap commit log to fill 75 columns.

> @@ -751,11 +751,9 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
>  	if (!(features & VMD_FEAT_BIOS_PM_QUIRK))
>  		return 0;
>  
> -	pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
> -
>  	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR);
>  	if (!pos)
> -		return 0;
> +		goto out_enable_link_state;
>  
>  	/*
>  	 * Skip if the max snoop LTR is non-zero, indicating BIOS has set it
> @@ -763,7 +761,7 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
>  	 */
>  	pci_read_config_dword(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, &ltr_reg);
>  	if (!!(ltr_reg & (PCI_LTR_VALUE_MASK | PCI_LTR_SCALE_MASK)))
> -		return 0;
> +		goto out_enable_link_state;
>  
>  	/*
>  	 * Set the default values to the maximum required by the platform to
> @@ -775,6 +773,14 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
>  	pci_write_config_dword(pdev, pos + PCI_LTR_MAX_SNOOP_LAT, ltr_reg);
>  	pci_info(pdev, "VMD: Default LTR value set by driver\n");

You're not changing this part, and I don't understand exactly how LTR
works, but it makes me a little bit queasy to read "set the LTR value
to the maximum required to allow the deepest power management
savings" and then we set the max snoop values to a fixed constant.

I don't think the goal is to "allow the deepest power savings"; I
think it's to enable L1.2 *when the device has enough buffering to
absorb L1.2 entry/exit latencies*.

The spec (PCIe r6.0, sec 7.8.2.2) says "Software should set this to
the platform's maximum supported latency or less," so it seems like
that value must be platform-dependent, not fixed.

And I assume the "_DSM for Latency Tolerance Reporting" is part of the
way to get those platform-dependent values, but Linux doesn't actually
use that yet.

I assume that setting the max values incorrectly may lead to either
being too conservative, so we don't use L1.2 when we could, or too
aggressive, so we use L1.2 when we shouldn't, and the device loses
data because it doesn't have enough internal buffering to absorb the
entry/exit delays.

This paper has a lot of background and might help answer some of my
questions:
https://www.intel.co.za/content/dam/doc/white-paper/energy-efficient-platforms-white-paper.pdf

> +out_enable_link_state:
> +	/*
> +	 * Make PCI devices at D0 when enable PCI-PM L1 PM Substates from
> +	 * Section 5.5.4 of PCIe Base Spec Revision 6.0
> +	 */
> +	pci_set_power_state_locked(pdev, PCI_D0);
> +	pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);

Hmmm.  PCIE_LINK_STATE_ALL includes ASPM L1.2.  We may do this even if
the device doesn't have an LTR Capability.  ASPM L1.2 cannot work
without LTR.

I only took a quick look but was not convinced that
pci_enable_link_state() does the right thing when we enable
PCIE_LINK_STATE_ALL (including ASPM L1.2) on a device that doesn't
have LTR.  I think it *should* decline to set PCI_L1SS_CTL1_ASPM_L1_2,
but I'm not sure it does.  Can you double check that path?  Maybe
that's another defect in aspm.c.

> @@ -1164,6 +1164,8 @@ static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked)
>  		link->aspm_default |= ASPM_STATE_L1_1_PCIPM | ASPM_STATE_L1;
>  	if (state & PCIE_LINK_STATE_L1_2_PCIPM)
>  		link->aspm_default |= ASPM_STATE_L1_2_PCIPM | ASPM_STATE_L1;
> +	if (state & ASPM_STATE_L1_2_MASK)
> +		aspm_l1ss_init(link);
>  	pcie_config_aspm_link(link, policy_to_aspm_state(link));
>  
>  	link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ