lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <876d19b7702dc16010b56bce049e2ab60bf68e3b.camel@linux.intel.com>
Date: Fri, 03 May 2024 12:15:49 -0700
From: "David E. Box" <david.e.box@...ux.intel.com>
To: Jian-Hong Pan <jhp@...lessos.org>, Nirmal Patel
	 <nirmal.patel@...ux.intel.com>, Bjorn Helgaas <helgaas@...nel.org>
Cc: Johan Hovold <johan@...nel.org>, Ilpo Järvinen
 <ilpo.jarvinen@...ux.intel.com>, Kuppuswamy Sathyanarayanan
 <sathyanarayanan.kuppuswamy@...ux.intel.com>, Mika Westerberg
 <mika.westerberg@...ux.intel.com>, Damien Le Moal <dlemoal@...nel.org>, 
 Jonathan Derrick <jonathan.derrick@...ux.dev>, linux-pci@...r.kernel.org,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH v5 4/4] PCI/ASPM: Fix L1.2 parameters when enable link
 state

+Nirmal,

Thanks Jian-Hong. This is a good find.

On Fri, 2024-05-03 at 17:45 +0800, Jian-Hong Pan wrote:
> David E. Box <david.e.box@...ux.intel.com> 於 2024年5月1日 週三 上午2:26寫道:
> > 
> > On Tue, 2024-04-30 at 15:46 +0800, Jian-Hong Pan wrote:
> > > David E. Box <david.e.box@...ux.intel.com> 於 2024年4月27日 週六 上午8:03寫道:
> > > > 
> > > > Hi Jian-Hong,
> > > > 
> > > > On Wed, 2024-04-24 at 19:02 +0800, Jian-Hong Pan wrote:
> > > > > Currently, when enable link's L1.2 features with
> > > > > __pci_enable_link_state(),
> > > > > it configs the link directly without ensuring related L1.2 parameters,
> > > > > such
> > > > > as T_POWER_ON, Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD have
> > > > > been
> > > > > programmed.
> > > > > 
> > > > > This leads the link's L1.2 between PCIe Root Port and child device
> > > > > gets
> > > > > wrong configs when a caller tries to enabled it.
> > > > > 
> > > > > Here is a failed example on ASUS B1400CEAE with enabled VMD:
> > > > > 
> > > > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor
> > > > > PCIe
> > > > > Controller (rev 01) (prog-if 00 [Normal decode])
> > > > >     ...
> > > > >     Capabilities: [200 v1] L1 PM Substates
> > > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+
> > > > > L1_PM_Substates+
> > > > >                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > >                    T_CommonMode=45us LTR1.2_Threshold=101376ns
> > > > >         L1SubCtl2: T_PwrOn=50us
> > > > > 
> > > > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue
> > > > > SN550
> > > > > NVMe
> > > > > SSD (rev 01) (prog-if 02 [NVM Express])
> > > > >     ...
> > > > >     Capabilities: [900 v1] L1 PM Substates
> > > > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > > L1_PM_Substates+
> > > > >                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > > > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > > > >                    T_CommonMode=0us LTR1.2_Threshold=0ns
> > > > >         L1SubCtl2: T_PwrOn=10us
> > > > > 
> > > > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on the
> > > > > PCIe
> > > > > Root Port and the child NVMe, they should be programmed with the same
> > > > > LTR1.2_Threshold value. However, they have different values in this
> > > > > case.
> > > > > 
> > > > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly
> > > > > before
> > > > > enable L1.2 bits of L1 PM Substates Control Register in
> > > > > __pci_enable_link_state().
> > > > > 
> > > > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > > > > Signed-off-by: Jian-Hong Pan <jhp@...lessos.org>
> > > > > ---
> > > > > v2:
> > > > > - Prepare the PCIe LTR parameters before enable L1 Substates
> > > > > 
> > > > > v3:
> > > > > - Only enable supported features for the L1 Substates part
> > > > > 
> > > > > v4:
> > > > > - Focus on fixing L1.2 parameters, instead of re-initializing whole
> > > > > L1SS
> > > > > 
> > > > > v5:
> > > > > - Fix typo and commit message
> > > > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
> > > > >   aspm_get_l1ss_cap()"
> > > > > 
> > > > >  drivers/pci/pcie/aspm.c | 12 ++++++++++++
> > > > >  1 file changed, 12 insertions(+)
> > > > > 
> > > > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > > > index c55ac11faa73..553327dee991 100644
> > > > > --- a/drivers/pci/pcie/aspm.c
> > > > > +++ b/drivers/pci/pcie/aspm.c
> > > > > @@ -1402,6 +1402,8 @@ EXPORT_SYMBOL(pci_disable_link_state);
> > > > >  static int __pci_enable_link_state(struct pci_dev *pdev, int state,
> > > > > bool
> > > > > locked)
> > > > >  {
> > > > >         struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > > > > +       struct pci_dev *child = link->downstream, *parent = link-
> > > > > >pdev;
> > > > > +       u32 parent_l1ss_cap, child_l1ss_cap;
> > > > > 
> > > > >         if (!link)
> > > > >                 return -EINVAL;
> > > > > @@ -1433,6 +1435,16 @@ static int __pci_enable_link_state(struct
> > > > > pci_dev
> > > > > *pdev, int state, bool locked)
> > > > >                 link->aspm_default |= ASPM_STATE_L1_1_PCIPM |
> > > > > ASPM_STATE_L1;
> > > > >         if (state & PCIE_LINK_STATE_L1_2_PCIPM)
> > > > >                 link->aspm_default |= ASPM_STATE_L1_2_PCIPM |
> > > > > ASPM_STATE_L1;
> > > > > +       /*
> > > > > +        * Ensure L1.2 parameters: Common_Mode_Restore_Times,
> > > > > T_POWER_ON
> > > > > and
> > > > > +        * LTR_L1.2_THRESHOLD are programmed properly before enable
> > > > > bits
> > > > > for
> > > > > +        * L1.2, per PCIe r6.0, sec 5.5.4.
> > > > > +        */
> > > > > +       if (state & link->aspm_capable & ASPM_STATE_L1_2_MASK) {
> > > > 
> > > > This is still mixing PCIE_LINK_STATE flags with ASPM_STATE flags.
> > > 
> > > Thanks for your review, but I notice some description in PCIe spec,
> > > 5.5.4 L1 PM Substates Configuration:
> > > "Prior to setting either or both of the enable bits for L1.2, the
> > > values for TPOWER_ON, Common_Mode_Restore_Time, and, if the ASPM L1.2
> > > Enable bit is to be Set, the LTR_L1.2_THRESHOLD (both Value and Scale
> > > fields) must be programmed." => I think this includes both "ASPM L12
> > > Enable" and "PCI-PM L1.2 Enable" bits.
> > 
> > That's fine. While the spec clearly calls out the ASPM L1.2 Enable bit here,
> > I
> > see no harm in including PCI-PM L1.2 in that check. This is what the code
> > already does in aspm_l1ss_init().
> > 
> > The issue is the mixed used of two different types of flags that don't have
> > the
> > same meaning. 'state' contains PCIE_LINK_STATE flags which are part of the
> > caller API to the pci_<enabled/disable>_link_state() functions. The
> > ASPM_STATE
> > flags are used internally to aspm.c to track all states and their meaningful
> > combinations such as ASPM_STATE_L1_2_MASK which includes ASPM L1.2 and PCI-
> > PM
> > L1.2. You should not do bit operations between them.
> > 
> > Also, you should not require that the timings be calculated only if L1_2 is
> > enabled. You should calculate the timings as long as it's capable. This is
> > also
> > what aspm_l1ss_init() does.
> > 
> > The confusion might be over the fact that you are having
> > __pci_enable_link_state() call aspm_calc_l12_info(). This should have been
> > handled during initialization of the link in aspm_l1ss_init() and I'm not
> > sure
> > why it didn't. Maybe it's because, for VMD, ASPM default state would have
> > started out all disabled and this somehow led to aspm_l1ss_init() not
> > getting
> > called. But looking through the code I don't see it. It would be great if
> > you
> > can confirm why they weren't calculated before.
> 
> I debug it again.  If I delete the pci_reset_bus() in vmd controller like:
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 87b7856f375a..39bfda4350bf 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -930,25 +930,6 @@ static int vmd_enable_domain(struct vmd_dev *vmd,
> unsigned long features)
>         pci_scan_child_bus(vmd->bus);
>         vmd_domain_reset(vmd);
> 
> -       /* When Intel VMD is enabled, the OS does not discover the Root Ports
> -        * owned by Intel VMD within the MMCFG space. pci_reset_bus() applies
> -        * a reset to the parent of the PCI device supplied as argument. This
> -        * is why we pass a child device, so the reset can be triggered at
> -        * the Intel bridge level and propagated to all the children in the
> -        * hierarchy.
> -        */
> -       list_for_each_entry(child, &vmd->bus->children, node) {
> -               if (!list_empty(&child->devices)) {
> -                       dev = list_first_entry(&child->devices,
> -                                              struct pci_dev, bus_list);
> -                       ret = pci_reset_bus(dev);

Hi Nirmal. It's not clear to me from the comment why there's a need to do a bus
reset. It looks like it is causing misconfiguration of the ASPM L1.2 timings
which would have been done above in pci_scan_child_bus(). Jian-Hong discovered
that without the above reset code, the timings are correct.

This patch recalculates the timings during the call to pci_enable_link_state()
which is called during pci_bus_walk() below. Originally I thought the
recalculation might have been needed by all callers to pci_enabled_link_state()
since it changes the default BIOS configuration. But it looks like the reset is
the cause and only the VMD driver would need the recalculation as a result. I
don't see qcom doing a reset.

Jian-Hong, given this (and assuming the reset is needed) I would not call
aspm_calc_l12_info() from pci_enable_link_state() but instead try redoing the
whole ASPM initialization right after the resets are done, maybe by calling
pci_scan_child_bus() again. What do you think Bjorn?

David

> -                       if (ret)
> -                               pci_warn(dev, "can't reset device: %d\n",
> ret);
> -
> -                       break;
> -               }
> -       }
> -
>         pci_assign_unassigned_bus_resources(vmd->bus);
> 
>         pci_walk_bus(vmd->bus, vmd_pm_enable_quirk, &features);
> 
> Although PCI-PM_L1.2 is disabled, both PCI bridge and the NVMe's
> LTR1.2_Threshold are configured as 101376ns:
> 
> 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
> decode])
> ...
>   Capabilities: [200 v1] L1 PM Substates
>   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>     PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
>   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>      T_CommonMode=45us LTR1.2_Threshold=101376ns
>   L1SubCtl2: T_PwrOn=50us
> 
> 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> ...
>   Capabilities: [900 v1] L1 PM Substates
>   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
>     PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
>   L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>      T_CommonMode=0us LTR1.2_Threshold=101376ns
>   L1SubCtl2: T_PwrOn=50us
> 
> Then, I apply the patch "PCI: vmd: Set PCI devices to D0 before enable
> PCI PM's L1 substates".  Both PCI bridge and the NVMe's PCI-PM_L1.2 is
> enabled and LTR1.2_Threshold is configured as 101376ns.
> 
> 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core
> Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal
> decode])
> ...
>   Capabilities: [200 v1] L1 PM Substates
>   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
>     PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
>   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>      T_CommonMode=45us LTR1.2_Threshold=101376ns
>   L1SubCtl2: T_PwrOn=50us
> 
> 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD
> Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> ...
>   Capabilities: [900 v1] L1 PM Substates
>   L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
>     PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
>   L1SubCtl1: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
>      T_CommonMode=0us LTR1.2_Threshold=101376ns
>   L1SubCtl2: T_PwrOn=50us
> 
> I do not know VMD very much.  However, from the test result, it looks
> like LTR1.2_Threshold has been configured properly originally.  But,
> LTR1.2_Threshold is misconfigured by 'pci_reset_bus()'.
> 
> Jian-Hong Pan
> 
> 
> 
> > > 
> > > Jain-Hong Pan
> > > 
> > > > 'state' should not even matter.
> > > > The timings should always be calculated and programmed as long
> > > > as L1_2 is capable. That way the timings are ready even if L1_2 isn't
> > > > being
> > > > enabled now (in case the user enables it later).
> > > > 
> > > > David
> > > > 
> > > > > +               parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> > > > > +               child_l1ss_cap = aspm_get_l1ss_cap(child);
> > > > > +               aspm_calc_l12_info(link, parent_l1ss_cap,
> > > > > child_l1ss_cap);
> > > > > +       }
> > > > >         pcie_config_aspm_link(link, policy_to_aspm_state(link));
> > > > > 
> > > > >         link->clkpm_default = (state & PCIE_LINK_STATE_CLKPM) ? 1 : 0;
> > > > 
> > 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ