lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAPpJ_edFBRCk6LciQ+hVkq93NOneeCJHk3B1s_QQsZO20zSoQw@mail.gmail.com>
Date: Mon, 30 Sep 2024 17:30:51 +0800
From: Jian-Hong Pan <jhp@...lessos.org>
To: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Cc: Bjorn Helgaas <helgaas@...nel.org>, Johan Hovold <johan@...nel.org>, 
	David Box <david.e.box@...ux.intel.com>, 
	Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>, 
	Nirmal Patel <nirmal.patel@...ux.intel.com>, 
	Jonathan Derrick <jonathan.derrick@...ux.dev>, linux-pci@...r.kernel.org, 
	LKML <linux-kernel@...r.kernel.org>, linux@...lessos.org
Subject: Re: [PATCH v10 3/3] PCI/ASPM: Make pci_save_aspm_l1ss_state save both
 child and parent's L1SS configuration

Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com> 於 2024年9月27日 週五 下午7:17寫道:
>
> On Fri, 27 Sep 2024, Jian-Hong Pan wrote:
>
> > PCI devices' parameters on the VMD bus have been programmed properly
> > originally. But, cleared after pci_reset_bus() and have not been restored
> > correctly. This leads the link's L1.2 between PCIe Root Port and child
> > device gets wrong configs.
> >
> > Here is a failed example on ASUS B1400CEAE with enabled VMD. Both PCIe
> > bridge and NVMe device should have the same LTR1.2_Threshold value.
> > However, they are configured as different values in this case:
> >
> > 10000:e0:06.0 PCI bridge [0604]: Intel Corporation 11th Gen Core Processor PCIe Controller [8086:9a09] (rev 01) (prog-if 00 [Normal decode])
> >   ...
> >   Capabilities: [200 v1] L1 PM Substates
> >     L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> >       PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> >     L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >       T_CommonMode=0us LTR1.2_Threshold=0ns
> >     L1SubCtl2: T_PwrOn=0us
> >
> > 10000:e1:00.0 Non-Volatile memory controller [0108]: Sandisk Corp WD Blue SN550 NVMe SSD [15b7:5009] (rev 01) (prog-if 02 [NVM Express])
> >   ...
> >   Capabilities: [900 v1] L1 PM Substates
> >     L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> >       PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> >     L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> >       T_CommonMode=0us LTR1.2_Threshold=101376ns
> >     L1SubCtl2: T_PwrOn=50us
> >
> > Here is VMD mapped PCI device tree:
> >
> > -+-[0000:00]-+-00.0  Intel Corporation Device 9a04
> >  | ...
> >  \-[10000:e0]-+-06.0-[e1]----00.0  Sandisk Corp WD Blue SN550 NVMe SSD
> >               \-17.0  Intel Corporation Tiger Lake-LP SATA Controller
> >
> > When pci_reset_bus() resets the bus [e1] of the NVMe, it only saves and
> > restores NVMe's state before and after reset.
>
> > Because bus [e1] has only one
> > device: 10000:e1:00.0 NVMe.
>
> This is incomplete sentence. And I don't understand why the number of
> devices is relevant. Perhaps just drop it?
>
> > The PCIe bridge is missed.
>
> Unclear what this refers to (I know what you mean but please write it
> so that other reading this 10 years from now will also get what is
> missed).
>
> > However, when it
> > restores the NVMe's state, it also restores the ASPM L1SS between the PCIe
> > bridge and the NVMe by pci_restore_aspm_l1ss_state().
>
> "it also restores ..." -> "ASPM code restores L1SS for both the parent
> bridge and the NVMe in pci_restore_aspm_l1ss_state()."
>
> > The NVMe's L1SS is
> > restored correctly. But, the PCIe bridge's L1SS is restored with the wrong
> > value 0x0 [1]. Because, the parent device (PCIe bridge)'s L1SS is not saved
>
> Join these sentences without . and drop the comma.
>
> I'd say "was not saved" because at that point it occurred clearly before
> the restore.
>
> > by pci_save_aspm_l1ss_state() before reset. That is why
> > pci_restore_aspm_l1ss_state() gets the parent device (PCIe bridge)'s saved
> > state capability data and restores L1SS with value 0.
>
> This last sentence seems duplicated.
>
> > So, if the PCI device has a parent, make pci_save_aspm_l1ss_state() save
> > the parent's L1SS configuration, too. This is symmetric on
> > pci_restore_aspm_l1ss_state().
> >
> > [1]: https://lore.kernel.org/linux-pci/CAPpJ_eexU0gCHMbXw_z924WxXw0+B6SdS4eG9oGpEX1wmnMLkQ@mail.gmail.com/
>
> Just put this into Link tag.
>
> > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
>
> Closes: ?
>
> Fixes tag is also missing.
>
> Add:
>
> Suggested-by: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
>
> > Signed-off-by: Jian-Hong Pan <jhp@...lessos.org>
> > ---
> > v9:
> > - Drop the v8 fix about drivers/pci/pcie/aspm.c. Use this in VMD instead.
> >
> > v10:
> > - Drop the v9 fix about drivers/pci/controller/vmd.c
> > - Fix in PCIe ASPM to make it symmetric between pci_save_aspm_l1ss_state()
> >   and pci_restore_aspm_l1ss_state()
> >
> >  drivers/pci/pcie/aspm.c | 39 +++++++++++++++++++++++++++++++--------
> >  1 file changed, 31 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > index bd0a8a05647e..823aaf813978 100644
> > --- a/drivers/pci/pcie/aspm.c
> > +++ b/drivers/pci/pcie/aspm.c
> > @@ -81,24 +81,47 @@ void pci_configure_aspm_l1ss(struct pci_dev *pdev)
> >
> >  void pci_save_aspm_l1ss_state(struct pci_dev *pdev)
> >  {
> > -     struct pci_cap_saved_state *save_state;
> > -     u16 l1ss = pdev->l1ss;
> > +     struct pci_cap_saved_state *pl_save_state, *cl_save_state;
> > +     struct pci_dev *parent;
> >       u32 *cap;
> >
> >       /*
> >        * Save L1 substate configuration. The ASPM L0s/L1 configuration
> >        * in PCI_EXP_LNKCTL_ASPMC is saved by pci_save_pcie_state().
> >        */
> > -     if (!l1ss)
> > +     if (!pdev->l1ss)
> >               return;
> >
> > -     save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
> > -     if (!save_state)
> > +     cl_save_state = pci_find_saved_ext_cap(pdev, PCI_EXT_CAP_ID_L1SS);
> > +     if (!cl_save_state)
> >               return;
> >
> > -     cap = &save_state->cap.data[0];
> > -     pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL2, cap++);
> > -     pci_read_config_dword(pdev, l1ss + PCI_L1SS_CTL1, cap++);
> > +     cap = &cl_save_state->cap.data[0];
> > +     pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL2, cap++);
> > +     pci_read_config_dword(pdev, pdev->l1ss + PCI_L1SS_CTL1, cap++);
> > +
> > +     /*
> > +      * If here is a parent device and it has not saved state, save parent's
> > +      * L1 substate configuration, too. This is symmetric on
> > +      * pci_restore_aspm_l1ss_state().
> > +      */
> > +     if (!pdev->bus || !pdev->bus->self)
> > +             return;
> > +
> > +     parent = pdev->bus->self;
> > +     if (!parent->l1ss)
> > +             return;
> > +
> > +     pl_save_state = pci_find_saved_ext_cap(parent, PCI_EXT_CAP_ID_L1SS);
> > +     if (!pl_save_state)
> > +             return;
> > +
> > +     if (parent->state_saved)
> > +             return;
>
> This decision can be made before even reading the pl_saved_state.
>
> > +
> > +     cap = &pl_save_state->cap.data[0];
> > +     pci_read_config_dword(parent, parent->l1ss + PCI_L1SS_CTL2, cap++);
> > +     pci_read_config_dword(parent, parent->l1ss + PCI_L1SS_CTL1, cap++);
>
> This approach duplicates code and that seems unnecessary to me.
>
> Could you instead leave the current function (nearly?) as is and use it as
> a helper for a new pci_save_aspm_l1ss_state() which calls the helper first
> for the pdev and then for its parent (if needed). Or do I miss something
> why that is not possible?

I was thinking of invoking it recursively the first time. But, that
will go to parent's parent and so on ...
So, I flattened the code in pci_save_aspm_l1ss_state().

Having a helper doing what original pci_save_aspm_l1ss_state()'s work
sounds good!

Sent the v11 patch, including the modified commit message.

Jian-Hong Pan

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ