Message-ID: <CAPpJ_ecuWkFZoriVyQxXV3dn-pAxDus-8vVwFMdhjpS5H2cfpw@mail.gmail.com>
Date: Wed, 7 Aug 2024 12:23:06 +0800
From: Jian-Hong Pan <jhp@...lessos.org>
To: Nirmal Patel <nirmal.patel@...ux.intel.com>
Cc: Bjorn Helgaas <helgaas@...nel.org>, Johan Hovold <johan@...nel.org>,
David Box <david.e.box@...ux.intel.com>,
Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@...ux.intel.com>,
Mika Westerberg <mika.westerberg@...ux.intel.com>, Damien Le Moal <dlemoal@...nel.org>,
Jonathan Derrick <jonathan.derrick@...ux.dev>,
Paul M Stillwell Jr <paul.m.stillwell.jr@...el.com>, linux-pci@...r.kernel.org,
linux-kernel@...r.kernel.org, linux@...lessos.org
Subject: Re: [PATCH v8 4/4] PCI/ASPM: Fix L1.2 parameters when enable link state

On Tue, Aug 6, 2024 at 2:25 AM Nirmal Patel <nirmal.patel@...ux.intel.com> wrote:
>
> On Fri, 2 Aug 2024 16:24:18 +0800
> Jian-Hong Pan <jhp@...lessos.org> wrote:
>
> > On Fri, Jul 19, 2024 at 4:04 PM Jian-Hong Pan <jhp@...lessos.org> wrote:
> > >
> > > Currently, when __pci_enable_link_state() enables a link's L1.2
> > > features, it configures the link directly without ensuring that
> > > the related L1.2 parameters, such as T_POWER_ON,
> > > Common_Mode_Restore_Time, and LTR_L1.2_THRESHOLD, have been
> > > programmed.
> > >
> > > As a result, the L1.2 link between the PCIe Root Port and the
> > > child device gets a wrong configuration when a caller tries to
> > > enable it.
> > >
> > > Here is a failed example on ASUS B1400CEAE with enabled VMD:
> > >
> > > 10000:e0:06.0 PCI bridge: Intel Corporation 11th Gen Core Processor PCIe Controller (rev 01) (prog-if 00 [Normal decode])
> > >     ...
> > >     Capabilities: [200 v1] L1 PM Substates
> > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
> > >                   PortCommonModeRestoreTime=45us PortTPowerOnTime=50us
> > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > >                    T_CommonMode=45us LTR1.2_Threshold=101376ns
> > >         L1SubCtl2: T_PwrOn=50us
> > >
> > > 10000:e1:00.0 Non-Volatile memory controller: Sandisk Corp WD Blue SN550 NVMe SSD (rev 01) (prog-if 02 [NVM Express])
> > >     ...
> > >     Capabilities: [900 v1] L1 PM Substates
> > >         L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
> > >                   PortCommonModeRestoreTime=32us PortTPowerOnTime=10us
> > >         L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1-
> > >                    T_CommonMode=0us LTR1.2_Threshold=0ns
> > >         L1SubCtl2: T_PwrOn=10us
> > >
> > > According to "PCIe r6.0, sec 5.5.4", before enabling ASPM L1.2 on
> > > the PCIe Root Port and the child NVMe, they should be programmed
> > > with the same LTR1.2_Threshold value. However, they have different
> > > values in this case.
> > >
> > > Invoke aspm_calc_l12_info() to program the L1.2 parameters properly
> > > before enabling the L1.2 bits of the L1 PM Substates Control
> > > Register in __pci_enable_link_state().
> > >
> > > Link: https://bugzilla.kernel.org/show_bug.cgi?id=218394
> > > Signed-off-by: Jian-Hong Pan <jhp@...lessos.org>
> > > ---
> > > v2:
> > > - Prepare the PCIe LTR parameters before enabling L1 Substates
> > >
> > > v3:
> > > - Only enable supported features for the L1 Substates part
> > >
> > > v4:
> > > - Focus on fixing L1.2 parameters, instead of re-initializing whole
> > > L1SS
> > >
> > > v5:
> > > - Fix typo and commit message
> > > - Split introducing aspm_get_l1ss_cap() to "PCI/ASPM: Introduce
> > > aspm_get_l1ss_cap()"
> > >
> > > v6:
> > > - Skipped
> > >
> > > v7:
> > > - Pick the patch up again and rebase it on the new kernel version
> > > - Drop the link state flag check and always configure the link
> > > state's timing parameters
> > >
> > > v8:
> > > - Because pcie_aspm_get_link() might return NULL for the link, get
> > > the link's parent and child devices only after checking that the
> > > link is not NULL. This avoids a NULL pointer dereference.
> > >
> > > drivers/pci/pcie/aspm.c | 15 +++++++++++++++
> > > 1 file changed, 15 insertions(+)
> > >
> > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > index 5db1044c9895..55ff1d26fcea 100644
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -1411,9 +1411,15 @@ EXPORT_SYMBOL(pci_disable_link_state);
> > >  static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked)
> > >  {
> > >  	struct pcie_link_state *link = pcie_aspm_get_link(pdev);
> > > +	u32 parent_l1ss_cap, child_l1ss_cap;
> > > +	struct pci_dev *parent, *child;
> > >
> > >  	if (!link)
> > >  		return -EINVAL;
> > > +
> > > +	parent = link->pdev;
> > > +	child = link->downstream;
> > > +
> > >  	/*
> > >  	 * A driver requested that ASPM be enabled on this device, but
> > >  	 * if we don't have permission to manage ASPM (e.g., on ACPI
> > > @@ -1428,6 +1434,15 @@ static int __pci_enable_link_state(struct pci_dev *pdev, int state, bool locked)
> > >  	if (!locked)
> > >  		down_read(&pci_bus_sem);
> > >  	mutex_lock(&aspm_lock);
> > > +	/*
> > > +	 * Ensure L1.2 parameters: Common_Mode_Restore_Times, T_POWER_ON and
> > > +	 * LTR_L1.2_THRESHOLD are programmed properly before enable bits for
> > > +	 * L1.2, per PCIe r6.0, sec 5.5.4.
> > > +	 */
> > > +	parent_l1ss_cap = aspm_get_l1ss_cap(parent);
> > > +	child_l1ss_cap = aspm_get_l1ss_cap(child);
> > > +	aspm_calc_l12_info(link, parent_l1ss_cap, child_l1ss_cap);
> > > +
> > >  	link->aspm_default = pci_calc_aspm_enable_mask(state);
> > >  	pcie_config_aspm_link(link, policy_to_aspm_state(link));
> > >
> > > --
> > > 2.45.2
> > >
> >
> > Hi Nirmal and Paul,
> >
> > It would be great to have your review here.
> >
> > I tried to "set the threshold value in vmd_pm_enable_quirk()"
> > directly, as Paul suggested [1]. However, that still requires getting
> > the PCIe link from the PCIe device to set the threshold value.
> > pci_enable_link_state_locked() already gets the link, so it is a
> > good place to calculate and program the L1 sub-states' parameters
> > properly before configuring the link's ASPM.
> >
> > [1]:
> > https://lore.kernel.org/linux-kernel/20240624081108.10143-2-jhp@endlessos.org/T/#mc467498213fe1a6116985c04d714dae378976124
> >
> > Jian-Hong Pan
>
> Hi Jian-Hong Pan,
>
> I am not an LTR, ASPM expert, but this part looks good to me.
>
> Can you explain why you decided to move pci_enable_link_state_locked()
> call down to out_state_change in vmd.c?

The idea is to set all LTR-related parameters before enabling the ASPM feature.

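
As a rough illustration of that ordering, here is a minimal userspace
sketch, not kernel code. struct l12_port and both helper functions are
hypothetical names made up for this example. The threshold is kept as
plain nanoseconds, the way lspci prints it, whereas the real register
stores a value field and a scale field.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of one end of a link; not a kernel struct. */
struct l12_port {
	uint32_t ltr_l12_threshold_ns;	/* LTR1.2_Threshold as lspci prints it */
	bool aspm_l12_enabled;		/* ASPM_L1.2+ or ASPM_L1.2- */
};

/* True when both ends of the link carry the same LTR1.2_Threshold. */
static bool l12_thresholds_match(const struct l12_port *parent,
				 const struct l12_port *child)
{
	return parent->ltr_l12_threshold_ns == child->ltr_l12_threshold_ns;
}

/*
 * Program the same threshold on both ends first, and only then set the
 * enable bits. The threshold is passed in by the caller here; this sketch
 * does not model how the kernel computes it.
 */
static void l12_program_then_enable(struct l12_port *parent,
				    struct l12_port *child,
				    uint32_t threshold_ns)
{
	parent->ltr_l12_threshold_ns = threshold_ns;
	child->ltr_l12_threshold_ns = threshold_ns;
	parent->aspm_l12_enabled = true;
	child->aspm_l12_enabled = true;
}
```

With the Root Port at 101376ns and the NVMe at 0ns, as in the lspci
output from the commit message, l12_thresholds_match() reports false.
After l12_program_then_enable() with one common value, it reports true.
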
> Will it cause any issue if pci_find_ext_capability returns 0?

If pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_LTR) in
vmd_pm_enable_quirk() returns 0, then the device has no LTR capability,
for example because it is not a PCIe device. Then, it goes to:

  ...
  pci_enable_link_state_locked()
    __pci_enable_link_state()

__pci_enable_link_state() uses pcie_aspm_get_link() to get the link
between the PCIe bridge and the PCIe device, and that lookup acts as a
guard. If pcie_aspm_get_link() does not find a link, then the device is
a PCIe bridge, or not a PCIe device. Because the link is NULL,
__pci_enable_link_state() returns -EINVAL directly and does not
configure or enable anything ASPM-related.

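
To make that guard concrete, here is a small userspace sketch of the
control flow. It is only a model under stated assumptions: mock_link,
mock_dev, mock_get_link() and mock_enable_link_state() are hypothetical
stand-ins for the kernel types and functions, and the two flags merely
record that the "program L1.2 parameters" and "configure ASPM" steps
were reached.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pcie_link_state. */
struct mock_link {
	int l12_params_programmed;
	int aspm_configured;
};

/* Hypothetical stand-in for struct pci_dev. */
struct mock_dev {
	struct mock_link *link;	/* NULL models "no link found" */
};

/* Models the lookup: may return NULL. */
static struct mock_link *mock_get_link(struct mock_dev *dev)
{
	return dev->link;
}

/* Models the guarded enable path described above. */
static int mock_enable_link_state(struct mock_dev *dev)
{
	struct mock_link *link = mock_get_link(dev);

	/* Guard: without a link, nothing is touched. */
	if (!link)
		return -EINVAL;

	/* Parameters first, then the ASPM configuration. */
	link->l12_params_programmed = 1;
	link->aspm_configured = 1;
	return 0;
}
```

A device without a link gets -EINVAL back and no state changes, while a
device with a link has both steps run in order and gets 0.
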
Jian-Hong Pan