lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <19452e8c-4e29-4703-afc9-3257a2d1183d@linux.intel.com>
Date: Fri, 11 Jul 2025 17:49:03 +0300 (EEST)
From: Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
To: David Box <david.e.box@...ux.intel.com>
cc: "Rafael J. Wysocki" <rafael@...nel.org>, bhelgaas@...gle.com, 
    andrea.righi@...onical.com, vicamo.yang@...onical.com, kenny@...ix.com, 
    nirmal.patel@...ux.intel.com, linux-pm@...r.kernel.org, 
    linux-pci@...r.kernel.org, ilpo.jarvinen@...ux.intel.com, 
    LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] PCI/ASPM: Allow ASPM enablement for devices behind Intel
 VMD

On Fri, 11 Jul 2025, David Box wrote:

> Hey Rafael,
> 
> On Thu, Jul 10, 2025 at 09:53:18PM +0200, Rafael J. Wysocki wrote:
> > On Thu, Jul 10, 2025 at 3:16 AM David E. Box
> > <david.e.box@...ux.intel.com> wrote:
> > >
> > > Devices behind Intel's Volume Management Device (VMD) controller reside on
> > > a synthetic PCI hierarchy that is intentionally hidden from ACPI and
> > > firmware. As such, BIOS does not configure ASPM for these devices, and the
> > > responsibility for link power management, including ASPM and LTR, falls
> > > entirely to the VMD driver.
> > >
> > > However, the current PCIe ASPM code prevents ASPM configuration when the
> > > ACPI_FADT_NO_ASPM flag is set, disallowing OS management. This leaves ASPM
> > > permanently disabled for these devices, contrary to the platform's design
> > > intent.
> > >
> > > Introduce a callback mechanism that allows the VMD driver to enable ASPM
> > > for its domain, bypassing the global ACPI_FADT_NO_ASPM restriction that is
> > > not applicable in this context. This ensures that devices behind VMD can
> > > benefit from ASPM savings as originally intended.
> > >
> > > Link: https://lore.kernel.org/linux-pm/0b166ece-eeec-ba5d-2212-50d995611cef@panix.com
> > > Signed-off-by: David E. Box <david.e.box@...ux.intel.com>
> > 
> > First of all, thanks for doing this work, much appreciated!
> > 
> > > ---
> > >  drivers/pci/controller/vmd.c | 28 ++++++++++++++++++++++++++--
> > >  drivers/pci/pci.h            |  8 ++++++++
> > >  drivers/pci/pcie/aspm.c      | 36 +++++++++++++++++++++++++++++++++++-
> > >  3 files changed, 69 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> > > index 8df064b62a2f..e685586dc18b 100644
> > > --- a/drivers/pci/controller/vmd.c
> > > +++ b/drivers/pci/controller/vmd.c
> > > @@ -21,6 +21,8 @@
> > >
> > >  #include <asm/irqdomain.h>
> > >
> > > +#include "../pci.h"
> > > +
> > >  #define VMD_CFGBAR     0
> > >  #define VMD_MEMBAR1    2
> > >  #define VMD_MEMBAR2    4
> > > @@ -730,7 +732,7 @@ static void vmd_copy_host_bridge_flags(struct pci_host_bridge *root_bridge,
> > >  }
> > >
> > >  /*
> > > - * Enable ASPM and LTR settings on devices that aren't configured by BIOS.
> > > + * Enable LTR settings on devices that aren't configured by BIOS.
> > >   */
> > >  static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
> > >  {
> > > @@ -770,10 +772,27 @@ static int vmd_pm_enable_quirk(struct pci_dev *pdev, void *userdata)
> > >          * PCIe r6.0, sec 5.5.4.
> > >          */
> > >         pci_set_power_state_locked(pdev, PCI_D0);
> > > -       pci_enable_link_state_locked(pdev, PCIE_LINK_STATE_ALL);
> > 
> > Do I think correctly that this doesn't work because of the
> > aspm_disabled check in __pci_enable_link_state()?
> 
> Yes.
> 
> > 
> > >         return 0;
> > >  }
> > >
> > > +static long vmd_get_link_state(struct pci_dev *pdev, void *data)
> > > +{
> > > +       struct pci_bus *vmd_bus = data;
> > > +       struct pci_bus *bus = pdev->bus;
> > > +
> > > +       while (bus) {
> > > +               if (bus == vmd_bus)
> > > +                       return PCIE_LINK_STATE_ALL;
> > > +
> > > +               if (!bus->self)
> > > +                       break;
> > > +
> > > +               bus = bus->self->bus;
> > > +       }
> > 
> > If I'm not mistaken, it would be sufficient to do a check like
> > 
> >     if (pci_dev->bus->ops == &vmd_ops)
> >             return PCIE_LINK_STATE_ALL;
> > 
> > instead of the above, or if not then why not?
> 
> As a replacement in the loop, that does look sufficient. I'm not sure if
> you're also suggesting replacing the loop itself.
> 
> > 
> > > +
> > > +       return -ENODEV;
> > > +}
> > > +
> > >  static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > >  {
> > >         struct pci_sysdata *sd = &vmd->sysdata;
> > > @@ -785,6 +804,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > >         resource_size_t membar2_offset = 0x2000;
> > >         struct pci_bus *child;
> > >         struct pci_dev *dev;
> > > +       struct pcie_aspm_vmd_link_state vmd_link_state;
> > >         int ret;
> > >
> > >         /*
> > > @@ -911,6 +931,10 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> > >                 return -ENODEV;
> > >         }
> > >
> > > +       vmd_link_state.cb = vmd_get_link_state;
> > > +       vmd_link_state.data = vmd->bus;
> > > +       pci_register_vmd_link_state_cb(&vmd_link_state);
> > > +
> > >         vmd_copy_host_bridge_flags(pci_find_host_bridge(vmd->dev->bus),
> > >                                    to_pci_host_bridge(vmd->bus->bridge));
> > >
> > > diff --git a/drivers/pci/pci.h b/drivers/pci/pci.h
> > > index 12215ee72afb..dcf7d39c660f 100644
> > > --- a/drivers/pci/pci.h
> > > +++ b/drivers/pci/pci.h
> > > @@ -821,6 +821,12 @@ void pci_configure_aspm_l1ss(struct pci_dev *dev);
> > >  void pci_save_aspm_l1ss_state(struct pci_dev *dev);
> > >  void pci_restore_aspm_l1ss_state(struct pci_dev *dev);
> > >
> > > +
> > > +struct pcie_aspm_vmd_link_state {
> > > +       long (*cb)(struct pci_dev *pdev, void *data);
> > > +       void *data;
> > > +};
> > > +
> > >  #ifdef CONFIG_PCIEASPM
> > >  void pcie_aspm_init_link_state(struct pci_dev *pdev);
> > >  void pcie_aspm_exit_link_state(struct pci_dev *pdev);
> > > @@ -828,6 +834,7 @@ void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked);
> > >  void pcie_aspm_powersave_config_link(struct pci_dev *pdev);
> > >  void pci_configure_ltr(struct pci_dev *pdev);
> > >  void pci_bridge_reconfigure_ltr(struct pci_dev *pdev);
> > > +void pci_register_vmd_link_state_cb(struct pcie_aspm_vmd_link_state *state);
> > >  #else
> > >  static inline void pcie_aspm_init_link_state(struct pci_dev *pdev) { }
> > >  static inline void pcie_aspm_exit_link_state(struct pci_dev *pdev) { }
> > > @@ -835,6 +842,7 @@ static inline void pcie_aspm_pm_state_change(struct pci_dev *pdev, bool locked)
> > >  static inline void pcie_aspm_powersave_config_link(struct pci_dev *pdev) { }
> > >  static inline void pci_configure_ltr(struct pci_dev *pdev) { }
> > >  static inline void pci_bridge_reconfigure_ltr(struct pci_dev *pdev) { }
> > > +void pci_register_vmd_link_state_cb(struct pcie_aspm_vmd_link_state *state) { }
> > >  #endif
> > >
> > >  #ifdef CONFIG_PCIE_ECRC
> > > diff --git a/drivers/pci/pcie/aspm.c b/drivers/pci/pcie/aspm.c
> > > index 29fcb0689a91..c609d3c309be 100644
> > > --- a/drivers/pci/pcie/aspm.c
> > > +++ b/drivers/pci/pcie/aspm.c
> > > @@ -320,6 +320,27 @@ static int policy_to_clkpm_state(struct pcie_link_state *link)
> > >         return 0;
> > >  }
> > >
> > > +static struct pcie_aspm_vmd_link_state vmd_state;
> > > +
> > > +void pci_register_vmd_link_state_cb(struct pcie_aspm_vmd_link_state *state)
> > > +{
> > > +       mutex_lock(&aspm_lock);
> > > +       vmd_state.cb = state->cb;
> > > +       vmd_state.data = state->data;
> > > +       mutex_unlock(&aspm_lock);
> > > +}
> > > +EXPORT_SYMBOL_GPL(pci_register_vmd_link_state_cb);
> > > +
> > > +static long pci_get_vmd_link_state(struct pci_dev *pdev)
> > > +{
> > > +       int state = -ENODEV;
> > > +
> > > +       if (vmd_state.cb)
> > > +               state = vmd_state.cb(pdev, vmd_state.data);
> > > +
> > > +       return state;
> > > +}
> > > +
> > >  static void pci_update_aspm_saved_state(struct pci_dev *dev)
> > >  {
> > >         struct pci_cap_saved_state *save_state;
> > > @@ -794,6 +815,7 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > >         u32 parent_lnkcap, child_lnkcap;
> > >         u16 parent_lnkctl, child_lnkctl;
> > >         struct pci_bus *linkbus = parent->subordinate;
> > > +       int vmd_aspm_default;
> > >
> > >         if (blacklist) {
> > >                 /* Set enabled/disable so that we will disable ASPM later */
> > > @@ -865,8 +887,20 @@ static void pcie_aspm_cap_init(struct pcie_link_state *link, int blacklist)
> > >                 pcie_capability_write_word(child, PCI_EXP_LNKCTL, child_lnkctl);
> > >         }
> > >
> > > +       /*
> > > +        * Devices behind Intel VMD operate on a synthetic PCI bus that BIOS
> > > +        * and ACPI do not enumerate or configure. ASPM for these devices must
> > > +        * be managed by the VMD driver itself, independent of global ACPI ASPM
> > > +        * constraints. This callback mechanism allows selective ASPM
> > > +        * enablement for such domains.
> > > +        */
> > > +       vmd_aspm_default = pci_get_vmd_link_state(parent);
> > > +
> > >         /* Save default state */
> > > -       link->aspm_default = link->aspm_enabled;
> > > +       if (vmd_aspm_default < 0)
> > > +               link->aspm_default = link->aspm_enabled;
> > > +       else
> > > +               link->aspm_default = vmd_aspm_default;
> > 
> > Well, this is not nice.
> > 
> > First off, it adds VMD-specific stuff to otherwise generic ASPM code.
> > Second, it doesn't actually do anything about the aspm_disabled checks
> > all over the place, so they will still trigger even though ASPM will
> > be effectively enabled for devices on the VMD bus.
> 
> I agree that it's not nice to be mixing VMD specific code here. It's a
> working bad solution to come up with a working good solution :)
> 
> The reason this patch works is that the aspm_disabled checks only gate
> OS-controlled ASPM configuration. They don't affect the BIOS default
> values, which are what we're setting in link->aspm_default. The ASPM
> code uses link->aspm_default as the fallback when ASPM is globally
> disabled, which is exactly what we want for devices behind VMD where the
> driver, not BIOS, effectively is the platform provider of the defaults.

Oh, this was a big news to me.

So what you're saying is that if aspm_disabled is set, ->aspm_disable 
cannot be set and thus pcie_config_aspm_link() that is not gated by 
aspm_disabled can alter ASPM state despite OS not having ASPM control???

...That's really odd logic which the ASPM driver seems to be full of.

-- 
 i.

> I put it here using a callback value because this where the BIOS
> defaults are saved for each device.
> 
> > 
> > >
> > >         /* Setup initial capable state. Will be updated later */
> > >         link->aspm_capable = link->aspm_support;
> > >
> > > base-commit: d0b3b7b22dfa1f4b515fd3a295b3fd958f9e81af
> > > --
> > 
> > It appears to me that the underlying problem is that aspm_disabled is
> > global and it is set during PCI root bus creation in
> > acpi_pci_root_add().  Thus it affects all of the PCI buses even though
> > the BIOS says that it wants to control ASPM for this particular PCIe
> > hierarchy. If there were another PCI root enumerated by ACPI where
> > the OS would be allowed to control ASPM, it would not work just like
> > the VMD case.
> 
> It would work in the non-VMD case because it has the BIOS default to
> use. VMD does not.
> 
> > 
> > To me, this suggests an approach based on moving the "ASPM disabled
> > because the BIOS wants to control it" setting to pci_bus_flags_t and
> > setting it on a per-hierarchy basis.  Since the VMD bus is a separate
> > PCIe hierarchy from the kernel perspective, it will not have this flag
> > set and the OS should be able to configure ASPM for devices on that
> > bus.
> > 
> > Do I think correctly or am I overlooking something here?
> 
> It’s definitely along those lines. The issue is really caused by two
> things:
> 
>     1. aspm_disabled global prevents OS control of ASPM
>     2. For VMD, there’s no BIOS provided link->aspm_default
> 
> The solution I proposed addresses the problem by having a mechanism for
> the VMD driver to supply its own defaults in place of the missing BIOS
> configuration.
> 
> Using pci_bus_flags_t could work as well, but would rather just say that
> aspm_disabled doesn't apply to the bus hierarchy, allowing the current
> pci_enable_link_state_locked() solution to work all the time. Either
> way, I'm definitely looking for a cleaner solution. The above hopefully
> clarifies the situation.
> 
> David
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ